Observability Job Vacancies

351 to 375 of 1,037 Observability Jobs

Infrastructure Engineer

London Area, United Kingdom
Coram AI
provisioning and management across hundreds of thousands of connected IoT devices deployed in the field Building CI and CD and automation pipelines for various parts of the stack Building observability and telemetry Helping maintain compliance with various security standards (SOC2, HIPAA ...) Maximising developer productivity by streamlining development workflows This is an onsite position based in London, UK Requirements and … Kubernetes (particularly EKS) 3+ years of experience with either Python or Go Building CI/CD pipelines and automation of various parts of the stack Self-hosting and maintaining observability tools such as Grafana/Prometheus It would be great if you also have experience with one or more Edge/IoT infrastructure (Yocto, IoT devices provisioning, over-the-air

Director of Platform Engineering

London, United Kingdom
dunnhumby
fosters innovation, and delivers exceptional user interactions delivering robust internal developer platform (IDP) capabilities, strengthening CI/CD pipelines, enabling on-demand environments, and scaling platform foundations such as observability, security, and FinOps - while adhering to best practices in DevOps and modern software delivery. What we expect from you Drive the development of a comprehensive IDP (e.g., based on Backstage … on-demand environments for development, QA, and staging through Infrastructure-as-Code and container orchestration. Support multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience … tools. Proven success in building and operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire
Employment Type: Permanent
Salary: GBP Annual

AWS Senior Platform Engineer

Bristol, Gloucestershire, United Kingdom
CACI Limited
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Why Work For Us? 25 days holiday + bank holidays Up to 5% employer pension
Employment Type: Permanent
Salary: GBP Annual

Director of Platform Engineering

Manchester, Lancashire, United Kingdom
dunnhumby
fosters innovation, and delivers exceptional user interactions delivering robust internal developer platform (IDP) capabilities, strengthening CI/CD pipelines, enabling on-demand environments, and scaling platform foundations such as observability, security, and FinOps - while adhering to best practices in DevOps and modern software delivery. What we expect from you Drive the development of a comprehensive IDP (e.g., based on Backstage … on-demand environments for development, QA, and staging through Infrastructure-as-Code and container orchestration. Support multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience … tools. Proven success in building and operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire
Employment Type: Permanent
Salary: GBP Annual

Data Engineer London, Singapore

London, United Kingdom
GSR Markets Limited
Monitor, troubleshoot, and optimize data pipelines to ensure performance and cost efficiency. Implement data governance, access controls, and security measures in line with best practices and regulatory standards. Develop observability and anomaly detection tools to support Tier 1 systems. Work with engineers and business teams to gather requirements and translate them into technical solutions. Maintain documentation, follow coding standards, and … to work across technical and non-technical teams. Additional Strengths Experience with orchestration tools like Apache Airflow. Knowledge of real-time data processing and event-driven architectures. Familiarity with observability tools and anomaly detection for production systems. Exposure to data visualization platforms such as Tableau or Looker. Relevant cloud or data engineering certifications. What we offer: A collaborative and transparent … ELT workflows with Apache Airflow (or similar) and integrating them into containerised CI/CD pipelines (Docker, GitHub Actions, Jenkins, etc.)? Select Which option best describes your experience building observability and automated anomaly detection tooling for data pipelines? Select What best describes your current location and working rights status? Select By submitting your application, you confirm that you have read
Employment Type: Permanent
Salary: GBP Annual

Lead Infrastructure Architect - Fantastic Opportunity

City of London, London, United Kingdom
Hybrid / WFH Options
UST
domains. With 20+ years of proven expertise, the ideal candidate will shape the strategy, design, and transformation of complex infrastructure landscapes—including Wintel, Linux, Network, Voice, Collaboration, Mobility, Observability, End-User Computing, End-User Services, and Service Desk. You will lead and drive architecture review boards and provide strategic direction. This role acts as a key advisor to senior … domains: Wintel & Linux platforms Network (LAN/WAN/SD-WAN, Wireless, Firewalls) Unified Communication/Voice/Collaboration (Cisco, MS Teams) Mobility & Endpoint Management (Intune, MDM/UEM) Observability and Monitoring (ELK, Prometheus, AppDynamics, etc.) End-User Computing (VDI, physical endpoints, OS lifecycle) End-User Services and Service Desk (ITSM, automation, FCR, CSAT) Serve as a trusted advisor to

Lead Infrastructure Architect - Fantastic Opportunity

London Area, United Kingdom
Hybrid / WFH Options
UST
domains. With 20+ years of proven expertise, the ideal candidate will shape the strategy, design, and transformation of complex infrastructure landscapes—including Wintel, Linux, Network, Voice, Collaboration, Mobility, Observability, End-User Computing, End-User Services, and Service Desk. You will lead and drive architecture review boards and provide strategic direction. This role acts as a key advisor to senior … domains: Wintel & Linux platforms Network (LAN/WAN/SD-WAN, Wireless, Firewalls) Unified Communication/Voice/Collaboration (Cisco, MS Teams) Mobility & Endpoint Management (Intune, MDM/UEM) Observability and Monitoring (ELK, Prometheus, AppDynamics, etc.) End-User Computing (VDI, physical endpoints, OS lifecycle) End-User Services and Service Desk (ITSM, automation, FCR, CSAT) Serve as a trusted advisor to

Cloud Data Platform Engineer

England, United Kingdom
BMC Software, Inc
scale event-driven workflows using EventBridge and Lambda. Work with DynamoDB for fast, scalable key-value storage. Develop and maintain Java Spring Boot microservices deployed on EC2 instances. Ensure observability, monitoring, and fault-tolerance across the system. Collaborate with DevOps, Data Engineering, and Product teams to design scalable, cost-effective cloud solutions. Maintain security best practices in a cloud-native … performance tuning, and cost-optimization in cloud environments with Kafka for data streaming. Familiarity with CI/CD and infrastructure-as-code tools (e.g., Terraform, CloudFormation). Experience with observability tools (e.g., CloudWatch, OpenTelemetry). Experience working in a global enterprise software company. Our commitment to you! BMC's culture is built around its people. We have 6000+ brilliant minds
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

United Kingdom
Hybrid / WFH Options
Unitary
coming year and beyond! The role We are now looking for a Site Reliability Engineer to ensure our systems run smoothly and reliably at scale. Your expertise in monitoring, observability, and system automation will help maintain the high availability and performance our customers depend on. You will work at the intersection of development and operations, using your technical skills to … Design and implement comprehensive alerting systems that detect issues early and provide actionable insights to streamline the resolution of these issues. Collaborate with our development teams to ensure our observability stack provides clear visibility into system health and performance. Optimise on-call processes, including creating and maintaining detailed runbooks that enable efficient incident response and knowledge sharing across teams. Build

Engineering Excellence Lead

London, United Kingdom
Hybrid / WFH Options
Trili
Collaborate with People/HR and engineering leadership on career pathing, training, and coaching for engineering staff. Technology Enablement: Evaluate and deploy tools - especially AI - that support engineering productivity, observability, and collaboration. Work closely with DevOps, QA, and SRE teams to align infrastructure and operational excellence with engineering needs. Own key vendor relationships, evaluation of partnerships and represent technology on … scaling engineering orgs across multiple geographies or domains (e.g., front-end, back-end, infrastructure). Familiarity with tools like Linear, Asana, GitHub, Datadog, DORA metrics, or similar performance/observability platforms. Background in organisational change management or engineering program management. What you can expect from us Competitive salary with substantial incentive schemes Generous long-term incentive plan (LTIP) tez token
Employment Type: Permanent
Salary: GBP Annual

Restaurant Technology Problem Manager

London, United Kingdom
Hybrid / WFH Options
McDonald's Corporation
as follows: Own ITIL Problem & Change Management Take ownership of ITIL Problem Management activities, proactively identifying, addressing and fixing root causes of incidents and recurring issues within the system. Observability lead, promoting stability across the estate by collaborating with cross-functional teams to implement preventive measures. Actively take part in ITIL Change Management processes, ensuring that changes to the system … efficiently. Experience in implementing changes while following ITIL change management processes. Understanding of basic security principles and best practices for securing infrastructure. Optional but advantageous technical skills: Proficient using observability tools (New Relic and ThousandEyes), BI platform and data visualisation tools (such as Tableau and Power BI) and technology tools (Jira, Confluence). System Administration: Proficiency in Linux/Unix
Employment Type: Permanent
Salary: GBP Annual

DevOps Engineer

London Area, United Kingdom
Miller Maxwell Ltd
orchestration Deep familiarity with Windows Server environments, performance tuning, and patching best practices 🔹 Bonus Skills Understanding of Azure DevOps and CI/CD workflows Experience with monitoring tools and observability strategies Exposure to security best practices in cloud-native environments 🔹 Soft Skills & Approach Collaborative mindset with the ability to partner effectively with developers, engineers, and trading teams. Strong communication skills

DevOps Engineer

City of London, London, United Kingdom
Miller Maxwell Ltd
orchestration Deep familiarity with Windows Server environments, performance tuning, and patching best practices 🔹 Bonus Skills Understanding of Azure DevOps and CI/CD workflows Experience with monitoring tools and observability strategies Exposure to security best practices in cloud-native environments 🔹 Soft Skills & Approach Collaborative mindset with the ability to partner effectively with developers, engineers, and trading teams. Strong communication skills

Java Software Engineer

London Area, United Kingdom
Oliver Bernard
developers. Experience with cloud platforms (AWS, GCP, or Azure). A strong security mindset or a keen interest in cybersecurity. Bonus: experience with Kubernetes, CI/CD pipelines, and observability tools. The role requires 5 days a week onsite in London. Please apply for immediate consideration.

Java Software Engineer

City of London, London, United Kingdom
Oliver Bernard
developers. Experience with cloud platforms (AWS, GCP, or Azure). A strong security mindset or a keen interest in cybersecurity. Bonus: experience with Kubernetes, CI/CD pipelines, and observability tools. The role requires 5 days a week onsite in London. Please apply for immediate consideration.

Staff Engineer

London Area, United Kingdom
Hybrid / WFH Options
Arrows
architecture and development of backend services using C#, ASP.NET, .NET Core Automate infrastructure, CI/CD pipelines, and cloud operations (AWS/Azure) Promote engineering best practices, security, and observability Mentor engineers and foster a culture of continuous improvement Contribute to technology direction, including adoption of tools like Go and Python What We’re Looking For Deep expertise in C#

Staff Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Arrows
architecture and development of backend services using C#, ASP.NET, .NET Core Automate infrastructure, CI/CD pipelines, and cloud operations (AWS/Azure) Promote engineering best practices, security, and observability Mentor engineers and foster a culture of continuous improvement Contribute to technology direction, including adoption of tools like Go and Python What We’re Looking For Deep expertise in C#

Devops .Net Tech Lead

Kent, England, United Kingdom
Wipro
C#, ASP.Net, Web APIs Leading cloud-native and DevOps practices Desirable Skills Ideally, you’ll be familiar with: Containerisation and microservices PowerShell or scripting for automation Application monitoring and observability tools Agile delivery methodologies Cloud service cost optimisation Equal Opportunities Wipro is an advocate for positive change and conscious inclusion. As a global employer, we strive to create a diverse

Lead Platform Engineer

Croydon, Cambridgeshire, UK
WeDo
autonomy, clean code, and continuous delivery The technical landscape: Azure (AKS, Functions, App Services, Event Grid, etc.) Infrastructure as Code (Terraform) CI/CD using Azure DevOps Monitoring and Observability (Application Insights, Azure Monitor, Prometheus/Grafana) GitHub for version control, and a modern SDLC with automated testing and security baked in What we’re looking for: Someone who can

Lead Platform Engineer

Croydon, England, United Kingdom
WeDo
autonomy, clean code, and continuous delivery The technical landscape: Azure (AKS, Functions, App Services, Event Grid, etc.) Infrastructure as Code (Terraform) CI/CD using Azure DevOps Monitoring and Observability (Application Insights, Azure Monitor, Prometheus/Grafana) GitHub for version control, and a modern SDLC with automated testing and security baked in What we’re looking for: Someone who can

DevOps Specialist

Knutsford, Cheshire, United Kingdom
Experis - ManpowerGroup
is required to assist in upgrading the Elastic DP estate to Kubernetes, moving away from obsolete technology (Cloudera), upgrading to RHEL 8, and contributing to improving the stability and observability of the platform. The role also involves providing advanced analytics tooling and services for modeling analytics. Responsibilities include: Supporting production application support in AWS, with experience in incident and change
Employment Type: Permanent
Salary: GBP Annual

Senior Java Software Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Inara
and accelerate platform delivery Deploy and monitor services in AWS using Kubernetes Work in a high-frequency release environment — deploying multiple times per day Use Grafana (or similar) for observability and maintain production-grade reliability Work onsite 3 days/week in London for the first 4–6 weeks (hybrid flexibility beyond this) We’re Looking For: 5+ years of

Senior Java Software Engineer

London Area, United Kingdom
Hybrid / WFH Options
Inara
and accelerate platform delivery Deploy and monitor services in AWS using Kubernetes Work in a high-frequency release environment — deploying multiple times per day Use Grafana (or similar) for observability and maintain production-grade reliability Work onsite 3 days/week in London for the first 4–6 weeks (hybrid flexibility beyond this) We’re Looking For: 5+ years of

Platform Engineer

London Area, United Kingdom
Ascendion
build and scale service mesh architecture using Kong Mesh, Envoy, and Kubernetes. We're looking for someone with deep experience in service discovery, zero-trust networking, and microservices observability. If you have hands-on skills with Kuma, Istio, or similar, and enjoy owning mesh strategy end-to-end, we’d love to talk. Pay range and compensation package

Platform Engineer

City of London, London, United Kingdom
Ascendion
build and scale service mesh architecture using Kong Mesh, Envoy, and Kubernetes. We're looking for someone with deep experience in service discovery, zero-trust networking, and microservices observability. If you have hands-on skills with Kuma, Istio, or similar, and enjoy owning mesh strategy end-to-end, we’d love to talk. Pay range and compensation package

Observability
10th Percentile: £57,500
25th Percentile: £65,000
Median: £80,000
75th Percentile: £97,500
90th Percentile: £120,000