501 to 525 of 757 Observability Jobs in the UK

Senior AI Platform Engineer

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

developer tooling that enable self‐service AI development across engineering teams. Design secure, scalable deployment pipelines for AI models and applications. Build AI observability capabilities including monitoring, tracing, evaluation, cost optimisation, and production quality measurement. Collaborate closely with AI Engineers, Backend Engineers and Engineering Leadership to define platform architecture … containerised deployments. Experience with solutions such as AWS Bedrock and AgentCore. Understand how to deploy, monitor, and operate AI services in production. AI Operations & Observability: Experience implementing monitoring, tracing, evaluation, and cost optimisation for AI systems. Experience with observability solutions such as Arize Phoenix, Langfuse, or LangSmith. Understand the operational ...

Principal Platform Engineer

Hiring Organisation: SF Partners Admin
Location: Bristol, Avon, South West, United Kingdom
Employment Type: Permanent, Work From Home

capabilities. Design and operate production-grade Kubernetes platforms, including EKS, AKS or OpenShift. Define engineering standards, golden paths, reusable modules and platform patterns. Build observability strategies using Prometheus, Grafana, OpenTelemetry and modern APM tooling. Improve reliability through SLOs, incident reviews and Site Reliability Engineering (SRE) practices. Embed DevSecOps, supply-chain … Infrastructure as Code (IaC). CI/CD automation. GitOps tools such as ArgoCD or Flux. Internal Developer Platforms or self-service engineering. Observability tools including Prometheus, Grafana, OpenTelemetry, ELK, Datadog, Dynatrace or New Relic. DevSecOps and supply-chain security. SRE practices, SLOs, SLIs and incident management. Platform governance, cloud ...

Data Engineer (All Levels, Analytics & Platform) - UK Wide

Hiring Organisation: describe.me
Location: London, South East, England, United Kingdom
Employment Type: Full-Time
Salary: £40,000 - £120,000 per annum

science and ML all stand on. You'll work across the full data engineering lifecycle—from ingestion and modelling through to transformation, orchestration, quality, observability and platform operation. The role suits someone who pairs strong software engineering discipline with genuine interest in data modelling and a pragmatic view of when … equivalent) and the workflows around it Manage cloud data warehouses and lakehouses (Snowflake, BigQuery, Redshift, Synapse, Databricks) Implement data quality, testing, monitoring and observability across pipelines and models Build streaming pipelines where the use case warrants it (Kafka, Kinesis, Pub/Sub, Flink) Partner with analysts, scientists, BI developers ...

Senior DBA

Hiring Organisation: Morson Edge
Location: Manchester, North West, United Kingdom
Employment Type: Permanent, Work From Home
Salary: £75,000

deployment and ongoing maintenance Support Infrastructure as Code and configuration management using tools such as Ansible and Terraform Collaborate with engineering teams to improve observability, resilience and operational efficiency Provide technical guidance and mentorship to junior team members Participate in incident management, root cause analysis and continuous improvement activities Contribute … Code tools, including Ansible or Terraform Strong troubleshooting and problem-solving skills across database and system layers Familiarity with CI/CD pipelines and observability tooling Desirables Experience with cloud-managed database services Knowledge of schema migration tools Understanding of disaster recovery, retention and data protection strategies Experience with monitoring ...

AI Platform/ DevOps Engineer

Hiring Organisation: The Portfolio Group
Location: City of London, London, Castle Baynard, United Kingdom
Employment Type: Permanent
Salary: £70,000 - £100,000/annum + Benefits

Bedrock Knowledge Bases) and embedding pipelines Build and maintain CI/CD pipelines for inference services, retrievers, ingestion workflows, and RAG components Implement observability across AI workloads using CloudWatch, MLflow, and OpenTelemetry - covering latency, throughput, cost, and system health Apply secure-by-design principles including IAM, encryption, network controls … Terraform experience for infrastructure-as-code, provisioning and managing cloud infrastructure at scale Experience operating containerised services, managing CI/CD pipelines, and owning observability and reliability Familiarity with vector databases or search infrastructure (OpenSearch, Algolia) is a strong advantage Python proficiency for scripting, automation, and deploying production services Solid ...

Principal Platform Engineer

Hiring Organisation: SF Partners Admin
Location: Bristol, UK
Employment Type: Full-time

self-service capabilities. Design and operate production-grade Kubernetes platforms, including EKS, AKS or OpenShift. Define engineering standards, golden paths, reusable modules and platform patterns. Build observability strategies using Prometheus, Grafana, OpenTelemetry and modern APM tooling. Improve reliability through SLOs, incident reviews and Site Reliability Engineering (SRE) practices. Embed DevSecOps, supply-chain security and secure … role could suit a technical lead, a hands-on architect, a senior platform engineer ready to progress, or a deep SME in Kubernetes, AWS, observability, cloud platforms or developer enablement. Why this role? Work on genuinely national-scale digital services. Join a strong Platform Engineering community. Solve complex cloud and reliability challenges. Influence engineering ...

Azure Technical Architect

Hiring Organisation: JAM Recruitment Ltd
Location: United Kingdom
Employment Type: Contract
Contract Rate: Up to £507.51 per day

tagging standards, and cost management principles. Support CI/CD and DevOps patterns using GitHub Actions, DevOps pipelines, Infrastructure as Code, and automation. Define observability standards using Azure Monitor, Log Analytics, alerts, dashboards, and diagnostics. Review and validate engineering designs, ensuring alignment to standards and architectural patterns. Provide expert guidance … cloud, identity federation, Entra ID, service principals, and managed identities. Knowledge of storage architectures, resiliency patterns, backups, and Azure Site Recovery. Understanding of monitoring, observability, and operational readiness within Azure environments. Familiarity with DevOps, CI/CD, and Infrastructure-as-Code concepts and tooling. Knowledge of cloud security concepts including ...

ML Infrastructure Lead

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

versioning, reproducibility, experimentation, feature management and release management Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency Build or optimise … pipelines, infrastructure-as-code and workflow orchestration Experience with tools such as Airflow or similar platform and orchestration technologies Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving Strong appreciation of reliability ...

DevOps Engineer

Hiring Organisation: Jackson Hogg Ltd
Location: Newcastle upon Tyne, Tyne & Wear, United Kingdom
Employment Type: Permanent
Salary: £50,000 - £60,000/annum

We're looking for a DevOps Engineer to join a growing technology team responsible for building, supporting and evolving critical cloud platforms. This role sits at the heart of a modern AWS environment, helping to ...

SRE - Site Reliability Engineer - Observability & Performance

Hiring Organisation: Sanderson Recruitment
Location: Bristol, UK
Employment Type: Full-time

SRE - Observability and Performance. Up to £600 per day outside IR35. 6 month initial contract. Bristol - Largely remote. I'm currently working with a client who is looking for an SRE to implement and enhance observability across Java applications, middleware and Linux infrastructure using Grafana. The role is focused on monitoring, performance analysis … monitoring, alerting and instrumentation. The environment is currently hosted on traditional infrastructure, with an AWS migration planned, offering the opportunity to develop cloud-ready observability, automation and operational capabilities as the platform evolves. Essential Skills: Strong hands-on experience in DevOps, SRE, Platform Engineering or Systems Engineering environments. Expertise in Grafana, observability ...

Lead Cloud & AI Platform Engineer

Hiring Organisation: Jobleads-UK
Location: Leeds, England, United Kingdom

data orchestration toolsets (e.g., dbt, Apache Airflow), ETL/ELT methodologies, real‐time streaming (e.g., AWS Kinesis, Apache Kafka), Vector databases, and RAG architectures. Observability & FinOps: Experience implementing modern observability tooling (OpenTelemetry) alongside automated cost‐control systems (such as Karpenter, Infracost, OpenCost, or Cloud Custodian). Domain & Sector Experience Regulated ...

Lead Cloud & AI Platform Engineer

Hiring Organisation: Jobleads-UK
Location: Manchester, England, United Kingdom

Lead Cloud & AI Platform Engineer

Hiring Organisation: Jobleads-UK
Location: City of Edinburgh, Scotland, United Kingdom

Lead Cloud & AI Platform Engineer

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

Principal Software Engineer, Distributed Identity Workflows

Hiring Organisation: Jobleads-UK
Location: United Kingdom

role covers system design, debugging complex cross‐service workflows, and mentoring engineers in distributed systems across provisioning workflows and pipelines. You will drive reliability, observability, and correctness in asynchronous execution, shaping legacy systems into modern, resilient architectures while collaborating across teams. ...

DevOps Engineer

Hiring Organisation: Fruition Group
Location: Leeds, Yorkshire, United Kingdom
Employment Type: Contract
Contract Rate: GBP Annual

Contract: Inside IR35. We're seeking an experienced Senior DevOps Engineer to join a small, highly skilled engineering team delivering a large-scale enterprise observability platform as they move away from Splunk. This is an opportunity to work on a critical cloud platform supporting the migration ...

Principal Data Engineer – Crypto Market Data Platform

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

REST and streaming, and own core infrastructure in a 24/7 financial environment. You’ll mentor senior engineers, drive architectural decisions, and advance observability, tooling, and testing. If you can navigate ambiguity with pragmatic choices and communicate effectively with ...

Head of Product Engineering — SaaS & Data Platforms

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

senior engineering leader to own delivery across multiple squads focused on workforce intelligence and related domains. You will set standards for architecture, testing, and observability, partnering with product, design, and data science to turn customer problems into reliable, scalable solutions. You will drive the architectural roadmap for the SaaS platform ...

Lead AIOps Product Manager — Enterprise Reliability

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

reduction, anomaly detection, and automated remediation across IT operations and SRE teams. You will partner with technical and security teams to deliver enterprise-grade observability while defining KPIs, SLOs, and adoption metrics to improve reliability and reduce operational ...

Zabbix Engineer

Hiring Organisation: Opus Recruitment Solutions Ltd
Location: London, South East, England, United Kingdom
Employment Type: Contractor
Contract Rate: £450 - £500 per day

engineering teams supporting a 24/7 operation. Scripting and automation capability (e.g. Python, Bash or similar). Exposure to service-aware monitoring or observability approaches (desirable). CCNA certification. Zabbix Engineer | 6 Month Initial contract | Remote | 450-500 Outside IR35. Please apply with your most up to date CV. ...

Zabbix Administrator

Hiring Organisation: Flint UK Technology Services
Location: London, United Kingdom
Employment Type: Contract
Contract Rate: GBP Annual

Zabbix Administrator & Site Reliability Engineer Provide administration, support, and operational management of the Zabbix monitoring platform, ensuring reliable monitoring, alerting, and observability across enterprise infrastructure and services. Provide Tier 1 support including user access management, alert triage, and incident response. Configure and maintain Zabbix Servers, proxies, templates, hosts, triggers, dashboards ...

AI Security Engineer

Hiring Organisation: Experis
Location: London, United Kingdom
Employment Type: Contract

Embedding security within the SDLC * Exposure to AI and agentic systems (practical or conceptual), including: o Security and governance for AI Solutions (Identity, observability, risks) o Agentic Architecture and patterns (MCP, RAG, Tools, Memory, Harnesses) * Experience working in complex, enterprise-scale environments * Ability to translate security requirements into engineering-aligned ...

Lead AI Engineer / Head of Applied AI

Hiring Organisation: Jobleads-UK
Location: United Kingdom

models. Model serving infrastructure, inference optimisation or high-performance distributed systems. Tool use, agent harnesses or long-running AI workflows in production. Rust, Leptos, observability or production AI workload analysis. Process: How the process runs. Each application is reviewed directly. Expect one or more conversations covering judgement, technical or craft depth ...

Lead Oracle Cloud Infrastructure Platform Engineer

Hiring Organisation: WRK DIGITAL LTD
Location: Leeds, West Yorkshire, Yorkshire, United Kingdom
Employment Type: Permanent
Salary: £80,000

/subject matter expertise for OCI related matters and lead on root cause analysis with a focus on resilience and prevention Establish a proactive observability strategy - dashboards, metrics, logs, traces - for critical Oracle services Design and implement enterprise grade logging and monitoring solutions using OCI Logging, OCI Monitoring, Events ...

Integration Domain Architect

Hiring Organisation: ARCA Consulting Ltd
Location: EC4R, Monument, Greater London, United Kingdom
Employment Type: Permanent

technologies such as Azure Integration Services, Azure API Management, Service Bus, or similar enterprise integration platforms. Strong understanding of hybrid cloud integration, security, resilience, observability and operational excellence. Experience with architecture frameworks and modelling approaches such as TOGAF, UML, etc. The ability to influence senior stakeholders and communicate complex technical ...