Observability Jobs in the UK

576 to 600 of 892 Observability Jobs in the UK

Software Engineer

London Area, United Kingdom
Zenith Search
UIs: time series views, blotters, RFQ/axe views, portfolio & risk dashboards. End-to-end systems: you design it, build it, productionise it, support it. CI/CD, testing, observability, the lot. What you're gaining: Direct line to PMs/traders/researchers. Zero “just keep the lights on” work: everything is about edge, speed, and better decisions. Top …

Artificial Intelligence Engineer

United Kingdom
Cubiq Recruitment
pipelines: hybrid retrieval, re-ranking, grounding, and context validation. Integrate real-time voice interfaces (STT/TTS, WebRTC, LiveKit) into intelligent conversational flows. Instrument and evaluate system performance using observability and model-faithfulness metrics. What we’re looking for: Proven ability to build and ship agentic or multi-agent frameworks into production. Expert Python, FastAPI, and asyncio developer. Practical experience …

Lead Software Engineer

Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Hybrid/Remote Options
Connells Group HQ
regarding best practices in software development and deployment. Implement best practice coding in relation to development coding standards. Provide direction for more Junior Software Engineers. Foster a culture of observability across the team. Help teams use operational data to improve the stability and performance of their applications. Maintain documentation and release notes. Have awareness of application security considerations. Lead incident …
Employment Type: Full-Time
Salary: Competitive salary

Founding AI Engineer

London Area, United Kingdom
Higher - AI recruitment
that create real-world value. Integrate AI models into operational workflows. Ensure reliability through fail-safes, self-healing, and fallback mechanisms. Monitor & improve AI performance with feedback loops and observability tools. Collaborate with Data Engineers to ensure AI has accurate, real-time data. Implement human-in-the-loop systems where needed. Skills and Qualifications: Exceptional Python skills and experience with …

Founding AI Engineer

City of London, London, United Kingdom
Higher - AI recruitment
that create real-world value. Integrate AI models into operational workflows. Ensure reliability through fail-safes, self-healing, and fallback mechanisms. Monitor & improve AI performance with feedback loops and observability tools. Collaborate with Data Engineers to ensure AI has accurate, real-time data. Implement human-in-the-loop systems where needed. Skills and Qualifications: Exceptional Python skills and experience with …

Integration Developer

Maidenhead, England, United Kingdom
MCS Rental Software
our suite of telematics integrations, including features like event tracking and CO2 reporting. Security & Compliance: Build with robust security, encryption, and data protection standards in mind. Operational Excellence: Promote observability with structured logging, monitoring, and clear documentation for all integrations. Knowledge Sharing: Contribute to internal knowledge bases and explore automation tools like Power Automate and Zapier to enhance both internal …

Full Stack Engineer

United Kingdom
Oracle
We are looking for hands-on engineers with expertise and passion in solving difficult problems in all areas of software engineering: distributed systems, identity, security, observability, and user experience. This is a greenfield opportunity to design and build new cloud-centric applications from the ground up. We are growing fast, still at an early stage, and working on ambitious new …

Principal Member of Technical Staff

United Kingdom
Oracle
patients and healthcare providers. We are looking for hands-on engineers with expertise and passion in solving difficult problems in all areas of software engineering: distributed systems, identity, security, observability, and user experience. This is a greenfield opportunity to design and build new cloud-centric applications from the ground up. We are growing fast, still at an early stage, and …

Central Data Steward FTC

Luton, England, United Kingdom
Hybrid/Remote Options
easyJet
and platforms to automate and optimise data management steps and gateways into data and analytical pipelines. Expertise in implementing and managing statistical process controls for data quality measurement, continuous observability, and data quality remediation. Strong SQL background – comfortable writing efficient SQL (Transact-SQL, Hive-HQL) to meet the requirement, having had exposure to working with large datasets on a distributed …

Central Data Steward

Luton, Bedfordshire, East Anglia, United Kingdom
Hybrid/Remote Options
easyJet
and platforms to automate and optimise data management steps and gateways into data and analytical pipelines. Expertise in implementing and managing statistical process controls for data quality measurement, continuous observability, and data quality remediation. Strong SQL background – comfortable writing efficient SQL (Transact-SQL, Hive-HQL) to meet the requirement, having had exposure to working with large datasets on a distributed …

Lead IT Architect - Platinion - Insurance

London, United Kingdom
Boston Consulting Group
mesh, API gateways, and commercial vs. open source software. Approaches to managing Architectural debt, Architecture governance and evolution in practice. Microservices topologies, including operational concerns such as resiliency, observability, discovery and routing, security etc. Have experience with, and understand how to lead, legacy integration and remediation (facades, strangler approaches, et al.). Deep understanding of different integration patterns and …
Employment Type: Permanent
Salary: GBP Annual

DevOps Manager

London Area, United Kingdom
Hybrid/Remote Options
Queen Square Recruitment
DevOps Lead/Architect | Contract | London | Hybrid (3 days onsite) | Inside IR35 | 6 months | FS sector. We are looking for a DevOps Lead/Architect to drive observability, automation, and GitOps best practices within a global financial services environment. What you'll be doing: Architect and scale observability platforms using Datadog + Geneos. Lead infrastructure automation using Terraform/IaC …

DevOps Manager

City of London, London, United Kingdom
Hybrid/Remote Options
Queen Square Recruitment
DevOps Lead/Architect | Contract | London | Hybrid (3 days onsite) | Inside IR35 | 6 months | FS sector. We are looking for a DevOps Lead/Architect to drive observability, automation, and GitOps best practices within a global financial services environment. What you'll be doing: Architect and scale observability platforms using Datadog + Geneos. Lead infrastructure automation using Terraform/IaC …

Senior Python Software Engineer

London Area, United Kingdom
Creo Recruitment
hosted on AWS. Architect and optimise systems: Define service boundaries, data ownership, and failure-recovery patterns for scalable, high-availability systems. Raise engineering quality: Champion best practices for testing, observability, and security. Review critical PRs and guide technical decisions across the team. Operate and improve production systems: Monitor performance, reliability, and cost efficiency. Lead incident response and drive continuous improvement. … Django). Cloud: AWS (Lambda, ECS/Fargate, S3, DynamoDB, CloudWatch, API Gateway). Data & Messaging: PostgreSQL, Redis, Kafka or SQS. CI/CD & Infrastructure: Docker, Terraform, GitHub Actions, CloudFormation. Monitoring & Observability: Prometheus, Grafana, OpenTelemetry. Testing: Pytest, integration and load testing frameworks. Key Skills & Expertise: Proven experience designing and delivering production systems using Python on AWS. Strong understanding of distributed systems … API design, and event-driven architectures. Deep knowledge of system observability, logging, and performance optimisation. Familiarity with modern security and data-privacy best practices. Excellent communicator who can document and articulate technical trade-offs clearly. Behaviours & Attributes. Ownership: Takes full responsibility for systems from design to operation. Pragmatism: Balances long-term architecture with delivery velocity. Influence: Raises standards and mentors …

Senior Python Software Engineer

City of London, London, United Kingdom
Creo Recruitment
hosted on AWS. Architect and optimise systems: Define service boundaries, data ownership, and failure-recovery patterns for scalable, high-availability systems. Raise engineering quality: Champion best practices for testing, observability, and security. Review critical PRs and guide technical decisions across the team. Operate and improve production systems: Monitor performance, reliability, and cost efficiency. Lead incident response and drive continuous improvement. … Django). Cloud: AWS (Lambda, ECS/Fargate, S3, DynamoDB, CloudWatch, API Gateway). Data & Messaging: PostgreSQL, Redis, Kafka or SQS. CI/CD & Infrastructure: Docker, Terraform, GitHub Actions, CloudFormation. Monitoring & Observability: Prometheus, Grafana, OpenTelemetry. Testing: Pytest, integration and load testing frameworks. Key Skills & Expertise: Proven experience designing and delivering production systems using Python on AWS. Strong understanding of distributed systems … API design, and event-driven architectures. Deep knowledge of system observability, logging, and performance optimisation. Familiarity with modern security and data-privacy best practices. Excellent communicator who can document and articulate technical trade-offs clearly. Behaviours & Attributes. Ownership: Takes full responsibility for systems from design to operation. Pragmatism: Balances long-term architecture with delivery velocity. Influence: Raises standards and mentors …

SRE Lead

United Kingdom
Hybrid/Remote Options
Oliver Bernard
practices across product teams. Ideally from a technical/SWE background. Own system design for reliability, scalability and performance. Lead platform reliability, availability and incident management. Drive automation, IaC, observability and continuous improvement. Guide root cause analysis and implement resilience strategies. Mentor and technically lead SRE/Platform engineers. Support large-scale re-architecture, capacity planning and FinOps alignment. Core … Technical Environment. Cloud: AWS (high-throughput systems: 1,000–6,000+ req/sec). IaC: Terraform, configuration management. Containers: Kubernetes, Docker (ECS beneficial). Languages: Python, Go or similar. Observability: Prometheus, Datadog or equivalents. CI/CD: Modern automated pipelines. Systems: Distributed systems, microservices, resilience engineering. Lead Site Reliability Engineer | Fully Remote | AWS, Kubernetes, Terraform | High-Scale SaaS | £90K …

Cloud Engineer: £70k + Bonus/benefits (Fintech Trading)

London Area, United Kingdom
Hunter Bond
healing architecture with GKE (Kubernetes) at its core. Supporting key platforms: Airflow, BigQuery, PostgreSQL clusters. Enhancing developer experience through GitLab CI/CD, Coder remote environments, and a modern observability stack (Prometheus, Grafana, Mimir). Driving automation and reliability across infrastructure and pipelines. What we’re looking for: 2–4 years’ experience in a Cloud, Platform, or DevOps role. Solid hands … and optimise, with a pragmatic, problem-solving mindset. Great communication skills and a collaborative, customer-focused approach. Familiarity with CI/CD (GitLab), and an interest in data or observability tools is a plus. A STEM degree (2:1 or higher) or equivalent hands-on experience. Why join: Work with a modern, cloud-native stack at scale. Be part of …

Cloud Engineer: £70k + Bonus/benefits (Fintech Trading)

City of London, London, United Kingdom
Hunter Bond
healing architecture with GKE (Kubernetes) at its core. Supporting key platforms: Airflow, BigQuery, PostgreSQL clusters. Enhancing developer experience through GitLab CI/CD, Coder remote environments, and a modern observability stack (Prometheus, Grafana, Mimir). Driving automation and reliability across infrastructure and pipelines. What we’re looking for: 2–4 years’ experience in a Cloud, Platform, or DevOps role. Solid hands … and optimise, with a pragmatic, problem-solving mindset. Great communication skills and a collaborative, customer-focused approach. Familiarity with CI/CD (GitLab), and an interest in data or observability tools is a plus. A STEM degree (2:1 or higher) or equivalent hands-on experience. Why join: Work with a modern, cloud-native stack at scale. Be part of …

Cloud Platform Engineer

London Area, United Kingdom
Hybrid/Remote Options
Harrington Starr
Manage and optimise key platforms such as Airflow, BigQuery, and PostgreSQL clusters. Developer Experience: Enhance internal developer productivity through Coder remote dev environments, GitLab CI/CD pipelines, and observability tooling. Collaboration: Partner closely with Data Engineering, Trading Technology, and Platform teams to deliver robust, scalable cloud solutions. Required Skills and Experience. Experience: 2-4 years in a Cloud, Platform … and continuous integration concepts. Mindset: Pragmatic, customer-focused, and driven by efficiency and automation. Education: Minimum 2:1 degree in a STEM subject or equivalent experience. Desirable: Exposure to observability tooling (Grafana, Prometheus, Mimir). Interest in data platforms or AI-enabled development workflows. Learn More: For more information, contact George Harris at Harrington Starr for a confidential conversation, or …

Cloud Platform Engineer

City of London, London, United Kingdom
Hybrid/Remote Options
Harrington Starr
Manage and optimise key platforms such as Airflow, BigQuery, and PostgreSQL clusters. Developer Experience: Enhance internal developer productivity through Coder remote dev environments, GitLab CI/CD pipelines, and observability tooling. Collaboration: Partner closely with Data Engineering, Trading Technology, and Platform teams to deliver robust, scalable cloud solutions. Required Skills and Experience. Experience: 2-4 years in a Cloud, Platform … and continuous integration concepts. Mindset: Pragmatic, customer-focused, and driven by efficiency and automation. Education: Minimum 2:1 degree in a STEM subject or equivalent experience. Desirable: Exposure to observability tooling (Grafana, Prometheus, Mimir). Interest in data platforms or AI-enabled development workflows. Learn More: For more information, contact George Harris at Harrington Starr for a confidential conversation, or …

Senior Full-Stack AI Engineer - Contract and Permanent roles available

Harwell, Oxfordshire, UK
Oxford Dynamics
service meshes, and container registries. - Implement GitHub Actions/Argo CD pipelines for automated, zero-touch deployments. - Lead security hardening efforts using GuardDuty, CloudWatch, IAM best practices. - Set up observability stacks for proactive monitoring and performance tuning. - Own backup and disaster recovery for services that you’ve created. Cross-Functional & Process - Collaborate closely with other engineers, product managers and CTO. - Mentor engineers … experience (SageMaker, Kubeflow, ZenML). - Experience building RESTful services around AI pipelines. - ISO 27001, NIST SSDF, OWASP SAMM, or GDPR compliance literacy. - Experience with AWS Karpenter, Prometheus, or similar observability stacks. Soft Skills: Research-driven mindset, eager to experiment and iterate. Able to bridge the gap between cutting-edge AI research and practical deployment. Strong communicator with the ability to …

Senior Full-Stack AI Engineer - Contract and Permanent roles available

Harwell, England, United Kingdom
Oxford Dynamics
service meshes, and container registries. - Implement GitHub Actions/Argo CD pipelines for automated, zero-touch deployments. - Lead security hardening efforts using GuardDuty, CloudWatch, IAM best practices. - Set up observability stacks for proactive monitoring and performance tuning. - Own backup and disaster recovery for services that you’ve created. Cross-Functional & Process - Collaborate closely with other engineers, product managers and CTO. - Mentor engineers … experience (SageMaker, Kubeflow, ZenML). - Experience building RESTful services around AI pipelines. - ISO 27001, NIST SSDF, OWASP SAMM, or GDPR compliance literacy. - Experience with AWS Karpenter, Prometheus, or similar observability stacks. Soft Skills: Research-driven mindset, eager to experiment and iterate. Able to bridge the gap between cutting-edge AI research and practical deployment. Strong communicator with the ability to …

Site Reliability Engineering (SRE) Manager

England, United Kingdom
SS&C
and postmortem processes, driving root cause analysis and long-term fixes. Automation & Tooling: Champion automation to reduce toil and improve system reliability. Oversee the development and maintenance of internal observability, tools and platforms. Collaborate with engineering and DevOps teams to embed reliability into the software development lifecycle. Collaboration & Strategy: Partner with product, engineering, DevOps and Customer Support teams to align … on priorities and roadmaps. Contribute to the strategic direction of infrastructure and reliability initiatives. Advocate for best practices in observability, CI/CD, and infrastructure as code. What You Will Bring: Proven experience managing or leading SRE, DevOps, or infrastructure teams. Strong background in systems engineering, cloud platforms (AWS, Azure), and container orchestration (Kubernetes). Excellent leadership, communication, and problem-solving …
Employment Type: Permanent
Salary: GBP Annual

Senior Engineer - Developer Experience (DevEx)

United Kingdom
Complexio
/CD systems (GitHub Actions, runners, caching, artifact storage). Maintain stability, scalability, and cost-effectiveness of pipelines. Build and maintain systems for our monorepo. Ensure CI/CD observability, with metrics flowing into Datadog/Slack. Pipeline Instrumentation & Optimisation: Analyse pipelines for inefficiencies (e.g., flaky tests, redundant steps, lack of caching). Recommend and implement optimisations (parallelisation, test selection … CircleCI). Strong background in SDLC practices and developer productivity tooling. Hands-on experience with infrastructure automation (e.g., Docker, Kubernetes, IaC with Terraform, Ansible or Pulumi). Familiarity with observability & monitoring (Datadog, Prometheus, or similar). Experience managing or improving monorepo build systems. Strong ability to measure developer productivity gaps and define KPIs. Experience in driving adoption of change in …

🌳 Full-Stack Software Engineers SC/DV Cleared — Multiple Openings 🌳

City of London, London, United Kingdom
Hybrid/Remote Options
Areti Group | B Corp™
Own CI/CD pipelines and Docker-based runtime on AWS; Infrastructure-as-Code via CDK/Terraform (CDKTF). Apply secure-by-design and TDD; instrument apps for observability and performance. Collaborate with product, platform, and security teams to meet operational and compliance requirements. The toolkit you’ll use. Frontend: TypeScript, React.js, Vite, Material-UI, HTML5, CSS. Backend … Docker, CI/CD. Building and consuming RESTful APIs; JSON schemas; integration testing. Comfortable in AWS and modern Infrastructure-as-Code approaches. Strong engineering fundamentals: code reviews, testing, observability, performance tuning. Security Clearance: Active SC or DV (must be current). Nice-to-haves: Military background (RAF/Army/Navy) or delivery in defence, aerospace, or government …

Observability
10th Percentile: £56,718
25th Percentile: £67,500
Median: £80,000
75th Percentile: £105,000
90th Percentile: £139,750