276 to 300 of 419 Observability Jobs in London

Engineering Manager, Developer Experience & AI Platform

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Space LLC is seeking an experienced Engineering Manager to lead a team of engineers in London. This role involves overseeing CI/CD pipelines, observability tooling, and developing AI tooling that enhances the productivity of engineers. You will be responsible for setting clear priorities, communicating with stakeholders, and building ...

Lead Data Engineer - Vector DB

Hiring Organisation
MLR Associates
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 70,000 - 105,000 Annual
reliability, security, and governance across the platform. Collaborate with AI and Back End engineering teams to support training, inference, and product features. Implement monitoring, observability, and data quality frameworks. Key Candidate Criteria: Classic data principles; Agentic data principles (vector DB, permissions, evaluation); End-to-end data architecture ownership; AI active ...

Backend Engineer

Hiring Organisation
Wave Talent
Location
City of London, London, United Kingdom
load spikes Building secure execution environments for user-generated code (Firecracker micro-VMs) Server lifecycle management — spinning capacity up and down rapidly under load Observability and reliability tooling across the platform The stack TypeScript (99% of the codebase) PostgreSQL, Redis, ClickHouse Go emerging as a second language — experience ...

Engineering Manager - Platform Reliability

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Lakebase Platform Reliability team’s footprint spans multiple stacks, systems, and stakeholders. They include AI‐powered tooling and workflows for customer management, real‐time observability during incidents, monitoring and auditing systems that underpin compliance requirements, and customer‐facing operational APIs and maintenance workflows. You’ll contribute to the wider platform ...

Principal Product Manager

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
boundaries, controls, escalation paths and human-in-the-loop mechanisms Ensure agentic behaviour is understandable, predictable and trustworthy through strong guardrails, safety mechanisms and observability Contribute and partner on core platform capabilities, including agent orchestration and lifecycle management, planning, reasoning and tool use frameworks, and memory, context and state management ...

Enterprise Account Executive (UK)

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
together. LangChain is a place where your contributions can shape how this technology shows up in the real world. Today, our platform includes LangSmith (Observability, Evaluation, Deployment, Fleet, and Sandboxes), our open source frameworks (LangChain, LangGraph, and Deep Agents), and the newly launched LangSmith Engine for autonomous agent improvement. ...

Principal Engineering Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
teams build, validate and evolve experiences Raise the engineering bar across teams by turning standards into practical approaches for performance, reliability, code quality, testing, observability and operational readiness Coach and develop Engineering Leads, senior engineers and other technical leaders, acting as a multiplier for decision quality and team effectiveness Support ...

Senior Engineering Manager, Global Bank

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
defining scope, aligning with product leadership, and driving delivery across squads and tribes Establish and uplift tribe-wide engineering practices across areas such as observability, incident response, security, or AI workflows, setting standards that go beyond a single squad Act as a senior escalation point for production incidents and complex ...

Principal Product Manager, Data Platform

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
access via self‐service UIs and APIs. Build and evolve a data trust and certification framework, enabling users to assess dataset quality, ownership, observability, and SLAs with confidence. Embed AI‐driven discovery features such as semantic search, natural language query, and recommendations to improve data discoverability and reduce time ...

Regional Vice President

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI. What is The Role: Elastic, the Search AI company, is looking for a high-energy Regional Vice ...

Platform Engineer

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
large-scale enterprise environment. An exciting opportunity working on a greenfield Kubernetes platform built using modern engineering practices across Azure, GitOps, service mesh, observability and event-driven architecture. The Role You will be responsible for building, operating and improving a shared Kubernetes platform used by application, AI and integration engineering … teams. Hands-on role covering infrastructure as code, Kubernetes operations, CI/CD, networking, observability and platform reliability. Working closely with architects and engineering teams shaping the future of the platform while helping maintain high standards across automation, security, scalability and operational excellence. Key Responsibilities Build and operate Azure Kubernetes ...

Platform Engineer

Hiring Organisation
itecopeople
Location
London, England, United Kingdom
enterprise environment. This is an exciting opportunity to work on a greenfield Kubernetes platform built using modern engineering practices across Azure, GitOps, service mesh, observability and event-driven architecture. The Role As Platform Engineer, you will be responsible for building, operating and improving a shared Kubernetes platform used by application … integration engineering teams. This is a hands-on role covering infrastructure as code, Kubernetes operations, CI/CD, networking, observability and platform reliability. You'll work closely with architects and engineering teams to shape the future of the platform while helping maintain high standards across automation, security, scalability and operational ...

Senior DevOps Engineer

Hiring Organisation
Prism Digital
Location
London Area, United Kingdom
team of three DevOps Engineers and a Head of DevOps, you'll be responsible for maintaining and improving Kubernetes infrastructure, managing a self-hosted observability stack, owning CI/CD pipelines, and contributing to architecture and R&D initiatives. Upcoming projects include platform personalisation, edge computing, and preparing the infrastructure … operated Kubernetes in production at genuine scale within a fast-moving startup or scale-up environment. Non-Negotiables Kubernetes (K8s) AWS Terraform Open-source observability (Grafana, Prometheus, Loki, or equivalent) at scale Startup, scale-up or fast-growth background What You'll Work With AWS Kubernetes Terraform & Terragrunt Open-source ...

Senior DevOps Engineer

Hiring Organisation
Norton Blake
Location
London Area, United Kingdom
/CD pipelines and automation workflows Develop and maintain reusable infrastructure and application templates Evaluate and integrate new technologies to enhance platform capabilities Improve observability through monitoring, logging, and alerting solutions Troubleshoot infrastructure and deployment issues, ensuring rapid resolution Ideal Candidate Infrastructure as Code (IaC): Strong experience with Terraform … troubleshooting, Helm, and GitOps workflows Security: Hands-on experience with secrets management tools (e.g., Vault, Azure Key Vault) and authentication systems (e.g., SSO, Okta) Observability: Experience with tools such as Datadog, Grafana, or Azure Monitor Networking: Solid understanding of networking concepts, DNS, and related technologies CI/CD: Experience building ...

Data Engineer (All Levels, Analytics & Platform) - UK Wide

Hiring Organisation
describe.me
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£40,000 - £120,000 per annum
science and ML all stand on. You'll work across the full data engineering lifecycle—from ingestion and modelling through to transformation, orchestration, quality, observability and platform operation. The role suits someone who pairs strong software engineering discipline with genuine interest in data modelling and a pragmatic view of when … equivalent) and the workflows around it Manage cloud data warehouses and lakehouses (Snowflake, BigQuery, Redshift, Synapse, Databricks) Implement data quality, testing, monitoring and observability across pipelines and models Build streaming pipelines where the use case warrants it (Kafka, Kinesis, Pub/Sub, Flink) Partner with analysts, scientists, BI developers ...

AI Platform/ DevOps Engineer

Hiring Organisation
The Portfolio Group
Location
City of London, London, Castle Baynard, United Kingdom
Employment Type
Permanent
Salary
£70,000 - £80,000/annum + Benefits
Bedrock Knowledge Bases) and embedding pipelines Build and maintain CI/CD pipelines for inference services, retrievers, ingestion workflows, and RAG components Implement observability across AI workloads using CloudWatch, MLflow, and OpenTelemetry - covering latency, throughput, cost, and system health Apply secure-by-design principles including IAM, encryption, network controls … Terraform experience for infrastructure-as-code, provisioning and managing cloud infrastructure at scale Experience operating containerised services, managing CI/CD pipelines, and owning observability and reliability Familiarity with vector databases or search infrastructure (OpenSearch, Algolia) is a strong advantage Python proficiency for scripting, automation, and deploying production services Solid ...

Senior Azure DevOps Engineer

Hiring Organisation
ReVybe IT Recruitment Limited
Location
London, United Kingdom
Employment Type
Permanent
Salary
£85,000 - £95,000/annum
using PowerShell and scripting best practices Working closely with development teams to improve deployment efficiency, platform reliability, and developer experience Implementing monitoring, logging, and observability solutions to improve platform performance and availability Driving cloud governance, security, and operational best practices across the Azure estate What We're Looking For Proven … with Terraform and Infrastructure as Code Strong Azure DevOps experience, including CI/CD pipeline automation Experience scripting with PowerShell Knowledge of monitoring and observability tools Strong understanding of cloud security, networking, and automation principles Excellent communication skills and a collaborative mindset Why Join? Join a fast-growing fintech with ...

Go Full Stack Developer

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
event-driven services Contribute to CI/CD pipelines and cloud-native deployments Review code and champion engineering best practices Improve application performance, observability and reliability Collaborate within Agile delivery teams across multiple projects Support technical decision-making and continuous improvement Skills & Experience We are looking for candidates with strong … reviews, testing and engineering governance Experience with any of the following would be highly advantageous: Microsoft Azure Python GitOps tooling (Argo CD/Flux) Observability tooling (Prometheus, Grafana, OpenTelemetry) AI/LLM-enabled applications Event-driven architectures and messaging platforms What's on Offer Opportunity to work on cutting-edge ...

Principal Artificial Intelligence (AI) Platform Engineer/Architect

Hiring Organisation
WTW
Location
Greater London, United Kingdom
Employment Type
Full Time
engagement—building credibility and driving adoption across the organization Provide escalation pathways for architecture questions and unblock teams on complex integration challenges Implement monitoring, observability, and governance systems that provide transparency without creating bottlenecks Collaborate with security, compliance, and data teams to embed safety guardrails into platform capabilities Participate … experience) Proven ability to design systems that abstract complexity and enable teams to self-serve at scale Strong software engineering fundamentals (system design, testing, observability, operational excellence, SDLC practices) Experience building or maintaining developer-facing platforms, SDKs, or internal tools Comfortable articulating technical architecture, vision, and strategy to both technical ...

Site Reliability Engineer

Hiring Organisation
CGI
Location
Greater London, United Kingdom
Employment Type
Full Time
Site Reliability Engineer (SRE) to join a team supporting multiple data product and platform groups. This role is focused on improving the reliability, scalability, observability, and operational performance of critical data-driven platforms and services across complex production environments. The successful candidate will work closely with engineering, platform, and support … across cloud and containerised environments. - Manage and support Kubernetes clusters and Helm-based deployments across multiple environments. - Implement and enhance monitoring, alerting, logging, and observability solutions to improve platform reliability and operational visibility. - Investigate incidents, analyse logs, identify root causes, and drive timely resolution of production issues. - Participate in incident ...

ML Infrastructure Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
versioning, reproducibility, experimentation, feature management and release management Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency Build or optimise … pipelines, infrastructure-as-code and workflow orchestration Experience with tools such as Airflow or similar platform and orchestration technologies Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving Strong appreciation of reliability ...

Observability Engineer - Bigpanda

Hiring Organisation
VIQU IT Recruitment
Location
Westminster, Greater London, UK
Observability Engineer AIOps & Observability | Permanent | Remote-first (UK) | Confidential Client Placed exclusively by Morela - Specialist AI & Engineering Recruitment THE OPPORTUNITY Morela is working exclusively with one of the fastest … boutique that is changing how enterprise organisations deal with alert noise and incident response. They are scaling their delivery team and looking for an Observability Engineer who wants to do some of the most interesting work in the space right now. This is a hands-on, client-facing role. ...

AI Engineer

Hiring Organisation
McCabe & Barton
Location
City, London, United Kingdom
Employment Type
Contract
Contract Rate
GBP 800 Daily
ROLE Design and build core AI platform components for a leading buy-side investor. You'll own the LLM gateway, MCP connector layer, observability tooling, and privacy proxy, translating business use cases into governed, production-ready AI systems ...

SRE Lead - Automate, Observe & Scale Reliability (Banking)

Hiring Organisation
Jobleads-UK
Location
Bromley, England, United Kingdom
Huxley is looking for an experienced SRE Lead in Bromley. This role involves leading SRE strategy and focusing on automation, observability, and reliability within a banking context. The ideal candidate will have over 8 years' experience in SRE, strong resilience engineering capabilities, and the ability to manage complex operations. This ...

Zabbix Engineer

Hiring Organisation
Opus Recruitment Solutions Ltd
Location
London, South East, England, United Kingdom
Employment Type
Contractor
Contract Rate
£450 - £500 per day
engineering teams supporting a 24/7 operation Scripting and automation capability (e.g. Python, Bash or similar) Exposure to service-aware monitoring or observability approaches (desirable) CCNA certification Zabbix Engineer | 6 Month Initial contract | Remote | £450-500 Outside IR35. Please apply with your most up to date CV. ...