Observability Jobs in England

476 to 500 of 611 Observability Jobs in England

Data Platform Engineer

London, United Kingdom
Hybrid / WFH Options
Lyst Ltd
able to: Contribute to every part of our system, ranging from code and tests to infrastructure changes. Ensure the stability of our system by implementing and improving monitoring and observability tools. Write resilient code that is well tested. Be curious - not just the code, but the architecture of our platforms and everything that enables the business to thrive. Gain expertise … the rest of the organisation, and almost all of Lyst engineering engages with us on a regular basis. We care about robustness and integrity in our pipelines and use observability tools to monitor. Experience in developing robust and secure software solutions and data pipelines. Effective communication skills, comfortable working with technical and non-technical individuals and teams. Proficiency in developing … within public cloud technologies and architecture (preferably AWS experience). Experience with containers (Docker) and container orchestration. Experience with Infrastructure as Code (we use Terraform). Experience utilising monitoring, observability and logging tools. Experience with Git, GitOps, and GitHub Actions. Exposure or experience with cloud data warehouse/data platforms (we use Snowflake). Things that matter to us: You have a …
Employment Type: Permanent
Salary: GBP Annual

Cloud Engineer

London, United Kingdom
Hybrid / WFH Options
Picture More Ltd
project ownership, this is a greenfield role with space to shape and lead • Work with modern Azure technologies in a mature, enterprise setting • Exposure to CI/CD, security, observability, and containerised environments • Be a mentor and thought leader, influence others and grow professionally • Enjoy a collaborative, diverse, and inclusive team culture • A chance to work with global stakeholders in … to AWS. You'll: • Architect and deploy scalable infrastructure using Terraform and IaC • Design and enhance CI/CD pipelines (e.g. Azure DevOps, Jenkins) • Implement robust monitoring, logging and observability tools (Azure Monitor, SolarWinds) • Work with DevOps and Security teams to enforce cloud governance and zero-trust models • Stay close to the technology - supporting the business, guiding junior engineers, and …
Employment Type: Permanent
Salary: GBP Annual

Cloud Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
Picture More
project ownership, this is a greenfield role with space to shape and lead • Work with modern Azure technologies in a mature, enterprise setting • Exposure to CI/CD, security, observability, and containerised environments • Be a mentor and thought leader, influence others and grow professionally • Enjoy a collaborative, diverse, and inclusive team culture • A chance to work with global stakeholders in … to AWS. You'll: • Architect and deploy scalable infrastructure using Terraform and IaC • Design and enhance CI/CD pipelines (e.g. Azure DevOps, Jenkins) • Implement robust monitoring, logging and observability tools (Azure Monitor, SolarWinds) • Work with DevOps and Security teams to enforce cloud governance and zero-trust models • Stay close to the technology – supporting the business, guiding junior engineers, and …
Employment Type: Full-Time
Salary: £82,000 - £86,000 per annum

AWS Senior Platform Engineer

Bristol, Gloucestershire, United Kingdom
CACI Limited
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Why Work For Us? 25 days holiday + bank holidays. Up to 5% employer pension …
Employment Type: Permanent
Salary: GBP Annual

Director of Platform Engineering

Manchester, Lancashire, United Kingdom
dunnhumby
fosters innovation, and delivers exceptional user interactions delivering robust internal developer platform (IDP) capabilities, strengthening CI/CD pipelines, enabling on-demand environments, and scaling platform foundations such as observability, security, and FinOps - while adhering to best practices in DevOps and modern software delivery. What we expect from you: Drive the development of a comprehensive IDP (e.g., based on Backstage … on-demand environments for development, QA, and staging through Infrastructure-as-Code and container orchestration. Support multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience … tools. Proven success in building and operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire …
Employment Type: Permanent
Salary: GBP Annual

Machine Learning Ops Engineer (London)

Highbury, Greater London, UK
DigitalGenius
for those with strong Engineering and DevOps capabilities and a deep interest in operationalising AI solutions. We are looking for someone with complementary skills that extend into infrastructure and observability, preferably with experience in E-Commerce. The AI team owns all ML-related research, implementation and maintenance. In practice, this means keeping up to date with best practices in production … ownership of the deployment and monitoring pipelines within your expertise. Contribute to the ongoing innovation R&D projects by enabling production readiness. Maintain and implement CI/CD pipelines, observability, and infrastructure for ML services. Requirements: Degree in relevant field with 3+ years of industry experience. Strong Technical Skills: Python, AWS, Docker, Terraform. Experience deploying and maintaining machine learning models …
Employment Type: Full-time

Operations Site Reliability Engineer

London, United Kingdom
eBay Inc
major incidents and the overall health of our services, making sure they are both resilient and high-performing. You'll create strategies for availability and reliability, enhance domain ecosystem observability, and support a shift toward a more engineering-focused culture. Your contributions will ensure that eBay's technology remains cutting-edge and reliable for our global community. What you will … JVM configurations, and a deep understanding of UNIX, Linux, networking (TCP/IP), and databases (both relational and NoSQL). Experience in Android and iOS application debugging. Experience with observability tools such as Grafana and Prometheus, and skills in documenting procedures for knowledge management. Strong interpersonal and communication skills to thrive in fast-paced, dynamic environments. NOTE: As part of …
Employment Type: Permanent
Salary: GBP Annual

Platform Engineer

London, England, United Kingdom
Allegis Global Solutions
disease together. Position Summary: GSK are on the lookout for an experienced Platform Engineer to join the team. This person will be instrumental in helping drive automation, reliability, and observability on Google Cloud Platform (GCP). This role will involve working closely with development, platform engineering, and security teams to implement DevOps best practices, define and enforce service-level objectives … native tools and technologies. Develop capabilities which allow Platform Engineering teams in Onyx to operate with a DevOps ethos. Collaborate with development teams to optimise application performance, reliability, and observability on GCP. Implement and enforce Service Level Objectives (SLOs) and Error Budgets to ensure a balance between reliability and feature development. Develop and maintain a comprehensive monitoring and alerting platform …

Head of Data Engineering (London)

London, UK
Hybrid / WFH Options
Zego
About Zego: At Zego, we understand that traditional motor insurance holds good drivers back. It's too complicated, too expensive, and it doesn't reflect how well you actually drive. Since 2016, we have been on a mission to change …
Employment Type: Full-time

Senior Data Engineer I

Oxford, Oxfordshire, United Kingdom
Hybrid / WFH Options
Elsevier
Locations: London, UK - Cambridge (BioData Innovation Centre), Amsterdam, Oxford, Aalborg. Time type: Full time. Posted: 5 Days Ago. Job requisition ID: R98188. About the Team: The Academic Information Systems (AIS) DataOps team is a shared technology group …
Employment Type: Permanent
Salary: GBP Annual

Head of Data Engineering

London, United Kingdom
Hybrid / WFH Options
Zego
About Zego: At Zego, we understand that traditional motor insurance holds good drivers back. It's too complicated, too expensive, and it doesn't reflect how well you actually drive. Since 2016, we have been on a mission to change …
Employment Type: Permanent
Salary: GBP Annual

BOMS Monitoring and Observability Engineer

Telford, Shropshire, West Midlands, United Kingdom
LA International Computer Consultants Ltd
insight, and proactive incident management. Key Responsibilities * Translate high-level monitoring non-functional requirements (NFRs) into actionable configurations across tools such as Splunk, Dynatrace, and AppDynamics. * Deliver full-stack observability solutions, including application-aware network performance monitoring (NPM), synthetics, log analytics, and infrastructure metrics. * Provide live support for monitoring technologies and assist with live service support, including key business events …
Employment Type: Contract
Rate: £500 - £550 per day

BOMS Monitoring and Observability Engineer

Newport, Midlands, United Kingdom
LA International Computer Consultants Ltd
insight, and proactive incident management. Key Responsibilities * Translate high-level monitoring non-functional requirements (NFRs) into actionable configurations across tools such as Splunk, Dynatrace, and AppDynamics. * Deliver full-stack observability solutions, including application-aware network performance monitoring (NPM), synthetics, log analytics, and infrastructure metrics. * Provide live support for monitoring technologies and assist with live service support, including key business events …

Software Engineer, Observability New York

London, United Kingdom
Hybrid / WFH Options
vercel.com
building on our platform, supporting our customers, or shaping our story: You can just ship things. About the Role: We are looking for a Software Engineer to join our Observability team. Vercel users rely on Observability to monitor and understand their applications' health and behavior. In this role, you will design, implement, and maintain Observability products that meet high-quality … What You Will Do: Handle large-scale data ingestion, storage, and processing from distributed systems. Develop cutting-edge visualization tools to provide insights into application behavior and performance. Integrate observability features with popular frontend tools, frameworks, and build systems to enhance developer experience. Write clean, efficient, and well-documented code, ensuring platform reliability through thorough testing. Collaborate with product managers … community by contributing to projects and participating in discussions, supporting Vercel's developer-focused mission. Collect feedback from developers and users to drive continuous innovation and improvement in the observability product area. About You: You have 5+ years of relevant work experience. Strong proficiency in JavaScript/TypeScript/Go and experience with modern frontend development tools. Solid …
Employment Type: Permanent
Salary: GBP Annual

Senior React Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment
performance, and deployment of React-based user interfaces used by clients around the world. You'll work very closely with SRE, DevOps, and Product Teams, and drive front-end observability, automate deployments, and play a key part in optimising UX at scale. The ideal candidate will be someone who combines deep React expertise with infrastructure awareness. Candidates must be comfortable … play a leading role in building robust, scalable, and reliable front-end systems in a company that invests in both people and technology. The Role: * Owning React application reliability, observability, and performance in production environments * Designing and managing CI/CD Pipelines for automation testing, canary deployments, and rollbacks * Optimising front-end delivery via CDN configuration, caching strategies, and real … Next.js or SSR experience) * Comfortable managing deployments using CI/CD pipelines (GitHub Actions, Jenkins, etc.) * Solid understanding of cloud infrastructure including AWS, Kubernetes, and content delivery * Exposure to observability tooling (Datadog, Sentry, Grafana) and performance tuning best practice. Reference Number: BBBH(phone number removed). To apply for this role or to be considered for further roles, please click …
Employment Type: Permanent
Salary: £80000 - £90000/annum 38 Days Holiday, Healthcare, Pension

Senior React Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment Limited
performance, and deployment of React-based user interfaces used by clients around the world. You'll work very closely with SRE, DevOps, and Product Teams, and drive front-end observability, automate deployments, and play a key part in optimising UX at scale. The ideal candidate will be someone who combines deep React expertise with infrastructure awareness. Candidates must be comfortable working … play a leading role in building robust, scalable, and reliable front-end systems in a company that invests in both people and technology. The Role: * Owning React application reliability, observability, and performance in production environments * Designing and managing CI/CD Pipelines for automation testing, canary deployments, and rollbacks * Optimising front-end delivery via CDN configuration, caching strategies, and real … Next.js or SSR experience) * Comfortable managing deployments using CI/CD pipelines (GitHub Actions, Jenkins, etc.) * Solid understanding of cloud infrastructure including AWS, Kubernetes, and content delivery * Exposure to observability tooling (Datadog, Sentry, Grafana) and performance tuning best practice. Reference Number: BBBH259301. To apply for this role or to be considered for further roles, please click "Apply Now" or …
Employment Type: Full-Time
Salary: £80,000 - £90,000 per annum, Inc benefits

Staff Machine Learning Engineer (London)

Whetstone, Greater London, UK
Hybrid / WFH Options
Compare the Market
systems that power real-time and batch predictions at scale. Design production pipelines for training, deployment, and monitoring using modern MLOps tooling. Take ownership of technical quality, resilience, and observability of critical ML services. Build reusable tools and frameworks to enable fast, safe experimentation and deployment. Platform, Standards & MLOps Foundations: Define and build the core MLOps capabilities for the organisation … including training pipelines, deployment frameworks, and observability tooling. Establish standardised patterns and best practices to accelerate model development, testing, and deployment. Lead the evolution of our ML platform, working with engineering partners to improve scalability, governance, and developer experience. Contribute to responsible ML practices, supporting auditability, explainability, and model health monitoring. Technical Leadership & Collaboration: Partner with data scientists to take models … GitHub Actions, ArgoCD). Proven ability to build reusable tooling, scalable services, and resilient pipelines for real-time and batch inference. Strong understanding of ML system lifecycle: testing, monitoring, governance, observability. Excellent collaboration and communication skills; able to influence cross-functional teams and lead complex technical work. A background in software engineering, computer science, or a quantitative field, or equivalent experience leading …
Employment Type: Full-time

Staff Machine Learning Engineer (London)

London, UK
Hybrid / WFH Options
Compare the Market
systems that power real-time and batch predictions at scale. Design production pipelines for training, deployment, and monitoring using modern MLOps tooling. Take ownership of technical quality, resilience, and observability of critical ML services. Build reusable tools and frameworks to enable fast, safe experimentation and deployment. Platform, Standards & MLOps Foundations: Define and build the core MLOps capabilities for the organisation … including training pipelines, deployment frameworks, and observability tooling. Establish standardised patterns and best practices to accelerate model development, testing, and deployment. Lead the evolution of our ML platform, working with engineering partners to improve scalability, governance, and developer experience. Contribute to responsible ML practices, supporting auditability, explainability, and model health monitoring. Technical Leadership & Collaboration: Partner with data scientists to take models … GitHub Actions, ArgoCD). Proven ability to build reusable tooling, scalable services, and resilient pipelines for real-time and batch inference. Strong understanding of ML system lifecycle: testing, monitoring, governance, observability. Excellent collaboration and communication skills; able to influence cross-functional teams and lead complex technical work. A background in software engineering, computer science, or a quantitative field, or equivalent experience leading …
Employment Type: Full-time

Lead Operations Engineer / Senior Operations Specialist DevOps London

London, United Kingdom
Hybrid / WFH Options
TOYOTA Connected
incident tooling (e.g., PagerDuty, Datadog). Technical Expertise required for this engagement: Guide operational practices across services built using Java (Spring Boot), Kafka, MongoDB and related technologies. Oversee monitoring, observability, and performance tuning using Datadog, ELK, Prometheus, or similar tooling. Problem Management & Root Cause Elimination required: Lead proactive and reactive problem management efforts. Identify recurring production issues and collaborate with … rapid change practices including canary releases, feature flags, and progressive delivery. Continuous Improvement & DevOps Practices: Drive automation and self-service initiatives to reduce manual intervention and operational burden. Champion observability best practices (metrics, traces, logs) and error budget tracking. Promote DevOps culture and continuous feedback loops between engineering and operations. Governance, Risk & Compliance: Ensure operational processes comply with security, privacy …
Employment Type: Permanent
Salary: GBP Annual

Production Software Engineer - Forecasting

London, United Kingdom
Barlowe LLP
Research Lab. The role: Ensuring resilience, uptime and operational efficiency is mission-critical to its success. As a Production Software Engineer, you will play a key role in driving observability, reliability, change safety and runtime optimisation across a complex, federated engineering environment. You will design and implement the systems, tooling and workflows that ensure the distributed platform is robust, observable … with infrastructure teams to own and evolve domain-specific metrics, alerting and diagnostics infrastructure used to operate and monitor the platform. Building and maintaining core systems for deployment automation, observability, runtime environment management and release readiness. Promoting runtime engineering best practices, working with federated teams to align on standards, service ownership and fault tolerance. Participating in a shared production support … are we looking for? Strong background in software engineering, ideally in distributed, real-time systems. Experience with containerisation and orchestration technologies, such as Kubernetes, in production environments. Familiarity with observability tooling and practices, such as VictoriaMetrics, Prometheus, Grafana, OpenTelemetry and SLOs. Well-developed debugging skills with the ability to navigate unfamiliar systems, identify root causes and deliver effective solutions …
Employment Type: Permanent
Salary: GBP Annual

Machine Learning Ops Engineer

London, United Kingdom
SLAMcore
for those with strong Engineering and DevOps capabilities and a deep interest in operationalising AI solutions. We are looking for someone with complementary skills that extend into infrastructure and observability, preferably with experience in E-Commerce. The AI team owns all ML-related research, implementation and maintenance. In practice, this means keeping up to date with best practices in production … ownership of the deployment and monitoring pipelines within your expertise. Contribute to the ongoing innovation R&D projects by enabling production readiness. Maintain and implement CI/CD pipelines, observability, and infrastructure for ML services. Requirements: Degree in relevant field with 3+ years of industry experience. Strong Technical Skills: Python, AWS, Docker, Terraform. Experience deploying and maintaining machine learning models …
Employment Type: Permanent
Salary: GBP Annual

Machine Learning Ops Engineer (London)

London, UK
DigitalGenius
for those with strong Engineering and DevOps capabilities and a deep interest in operationalising AI solutions. We are looking for someone with complementary skills that extend into infrastructure and observability, preferably with experience in E-Commerce. The AI team owns all ML-related research, implementation and maintenance. In practice, this means keeping up to date with best practices in production … ownership of the deployment and monitoring pipelines within your expertise. Contribute to the ongoing innovation R&D projects by enabling production readiness. Maintain and implement CI/CD pipelines, observability, and infrastructure for ML services. Requirements: Degree in relevant field with 3+ years of industry experience. Strong Technical Skills: Python, AWS, Docker, Terraform. Experience deploying and maintaining machine learning models …
Employment Type: Full-time

Principal AWS Platform Engineer

London, United Kingdom
CACI Limited
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Experience leading/managing junior engineers. Significant experience with Control Tower and deploying landing zones. …
Employment Type: Permanent
Salary: GBP Annual

Engineering Manager

London, United Kingdom
Mindera
implementation, testing, and rollout - ensuring operational stability of delivered solutions. Provide hands-on technical guidance, support team planning, and facilitate delivery ceremonies. Champion engineering best practices across development, testing, observability, and operational support. Raise the team's maturity and drive progress towards or maintain Elite DORA standards. Build, mentor, and manage a high-performing software engineering team. Foster a culture … technical proficiency in: Languages: Java 17+ (Java 21 preferred). Frameworks: Micronaut (preferred), Spring Boot. Testing: JUnit, Mockito. Build Tools: Gradle. Data & Messaging: Kafka, MongoDB. APIs: GraphQL Federation, REST. Infrastructure & Observability: Terraform, OpenTelemetry, Dynatrace. Soft Skills & Leadership: Exceptional communication skills - able to distill and present engineering decisions to executives and business teams. Experienced in managing relationships with third-party vendors and …
Employment Type: Permanent
Salary: GBP Annual

Engineering Excellence Lead

London, United Kingdom
Hybrid / WFH Options
Trili
Collaborate with People/HR and engineering leadership on career pathing, training, and coaching for engineering staff. Technology Enablement: Evaluate and deploy tools - especially AI - that support engineering productivity, observability, and collaboration. Work closely with DevOps, QA, and SRE teams to align infrastructure and operational excellence with engineering needs. Own key vendor relationships, evaluation of partnerships and represent technology on … scaling engineering orgs across multiple geographies or domains (e.g., front-end, back-end, infrastructure). Familiarity with tools like Linear, Asana, GitHub, Datadog, DORA metrics, or similar performance/observability platforms. Background in organisational change management or engineering program management. What you can expect from us: Competitive salary with substantial incentive schemes. Generous long-term incentive plan (LTIP) tez token …
Employment Type: Permanent
Salary: GBP Annual

Observability, England
10th Percentile: £57,500
25th Percentile: £70,000
Median: £80,000
75th Percentile: £99,500
90th Percentile: £120,000