Observability Jobs in London

351 to 375 of 821 Observability Jobs in London

Senior ML Engineer

London, United Kingdom
Hybrid / WFH Options
Method Resourcing
teams to operationalize models and ship ML-powered features into production. Continuously assess and iterate on production models, balancing long-term ML strategy with tactical improvements. Champion code quality, observability, and resilience within their ML systems through reviews and hands-on contributions. Help shape their internal ML standards and practices, ensuring they stay ahead of industry advancements. Offer technical mentorship …
Employment Type: Permanent, Work From Home

Senior Infrastructure Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
Computer Futures
Be Doing Architect and manage Azure-based infrastructure using Infrastructure as Code (IaC) tools like ARM, Bicep, or Terraform. Optimize .NET application stacks for speed, reliability, and scalability. Implement observability and monitoring solutions to ensure system health and performance. Conduct performance reviews and troubleshooting across production environments. Drive automation, growth planning, and continuous improvement across the platform. What You Bring …
Employment Type: Full-Time
Salary: £75,000 - £100,000 per annum

Senior ML Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
Method Resourcing
teams to operationalize models and ship ML-powered features into production. Continuously assess and iterate on production models, balancing long-term ML strategy with tactical improvements. Champion code quality, observability, and resilience within their ML systems through reviews and hands-on contributions. Help shape their internal ML standards and practices, ensuring they stay ahead of industry advancements. Offer technical mentorship …
Employment Type: Full-Time
Salary: £150,000 - £160,000 per annum

Senior Software Engineer

London Area, United Kingdom
Hybrid / WFH Options
Kura
maintainable, and well-tested code across the full stack Mentor and support junior engineers through code reviews, knowledge sharing, and pair programming Continuously improve CI/CD workflows, system observability, and overall platform performance at scale What we’re seeking: 8+ years of experience as a full-stack engineer, building and maintaining complex web applications Experience touching across React, TypeScript …

Senior Software Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Kura
maintainable, and well-tested code across the full stack Mentor and support junior engineers through code reviews, knowledge sharing, and pair programming Continuously improve CI/CD workflows, system observability, and overall platform performance at scale What we’re seeking: 8+ years of experience as a full-stack engineer, building and maintaining complex web applications Experience touching across React, TypeScript …

ML Infrastructure Engineer

London Area, United Kingdom
Hybrid / WFH Options
Cubiq Recruitment
including work on caching, I/O, and data locality across compute and storage Benchmark, profile, and fix performance issues across compute, network, and orchestration layers Set up clear observability, resilience, and security controls for sensitive research environments Work with Research, Data, and Applied teams to plan GPU and storage capacity and support smoother ML experimentation Technical Skills: Strong experience …

ML Infrastructure Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Cubiq Recruitment
including work on caching, I/O, and data locality across compute and storage Benchmark, profile, and fix performance issues across compute, network, and orchestration layers Set up clear observability, resilience, and security controls for sensitive research environments Work with Research, Data, and Applied teams to plan GPU and storage capacity and support smoother ML experimentation Technical Skills: Strong experience …

ML Infrastructure Engineer

London, South East England, United Kingdom
Hybrid / WFH Options
Cubiq Recruitment
including work on caching, I/O, and data locality across compute and storage Benchmark, profile, and fix performance issues across compute, network, and orchestration layers Set up clear observability, resilience, and security controls for sensitive research environments Work with Research, Data, and Applied teams to plan GPU and storage capacity and support smoother ML experimentation Technical Skills: Strong experience …

ML Infrastructure Engineer

London (City of London), South East England, United Kingdom
Hybrid / WFH Options
Cubiq Recruitment
including work on caching, I/O, and data locality across compute and storage Benchmark, profile, and fix performance issues across compute, network, and orchestration layers Set up clear observability, resilience, and security controls for sensitive research environments Work with Research, Data, and Applied teams to plan GPU and storage capacity and support smoother ML experimentation Technical Skills: Strong experience …

Azure Cloud Engineer - SC CLEARED

City of London, London, United Kingdom
Zero Plus Ltd
and WAF. Support hub-and-spoke topologies and hybrid connectivity solutions. Monitor and optimise performance across hybrid and multi-site environments. Troubleshoot connectivity issues and resolve incidents quickly. Implement observability using Azure Monitor, Log Analytics, and Network Watcher. Automate provisioning and configuration using Terraform, Azure CLI, and PowerShell. Contribute to CI/CD integration for infrastructure as code. Ensure compliance …

Azure Cloud Engineer - SC CLEARED

London Area, United Kingdom
Zero Plus Ltd
and WAF. Support hub-and-spoke topologies and hybrid connectivity solutions. Monitor and optimise performance across hybrid and multi-site environments. Troubleshoot connectivity issues and resolve incidents quickly. Implement observability using Azure Monitor, Log Analytics, and Network Watcher. Automate provisioning and configuration using Terraform, Azure CLI, and PowerShell. Contribute to CI/CD integration for infrastructure as code. Ensure compliance …

Azure Cloud Engineer - SC CLEARED

London (City of London), South East England, United Kingdom
Zero Plus Ltd
and WAF. Support hub-and-spoke topologies and hybrid connectivity solutions. Monitor and optimise performance across hybrid and multi-site environments. Troubleshoot connectivity issues and resolve incidents quickly. Implement observability using Azure Monitor, Log Analytics, and Network Watcher. Automate provisioning and configuration using Terraform, Azure CLI, and PowerShell. Contribute to CI/CD integration for infrastructure as code. Ensure compliance …

Azure Cloud Engineer - SC CLEARED

London, South East England, United Kingdom
Zero Plus Ltd
and WAF. Support hub-and-spoke topologies and hybrid connectivity solutions. Monitor and optimise performance across hybrid and multi-site environments. Troubleshoot connectivity issues and resolve incidents quickly. Implement observability using Azure Monitor, Log Analytics, and Network Watcher. Automate provisioning and configuration using Terraform, Azure CLI, and PowerShell. Contribute to CI/CD integration for infrastructure as code. Ensure compliance …

Senior DevOps Platform Engineer

London, England, United Kingdom
CDW UK
including Salesforce-specific pipelines. Build and maintain Infrastructure as Code (IaC) using Terraform and Ansible. Design highly reliable, scalable, and secure infrastructure supporting performance-critical workloads. Build proactive monitoring, observability, and alerting with Prometheus, Grafana, Azure Monitor, DataDog, and Dynatrace. Troubleshoot complex system issues spanning applications, networks, and infrastructure. Define platform SLAs, SLOs, and governance standards for self-service use. … Infrastructure as Code with Terraform and Ansible, along with scripting in PowerShell, Python, or Bash Experience implementing GitOps workflows and managing platform SLAs, SLOs, and governance standards Familiarity with observability and monitoring tools including Prometheus, Grafana, Azure Monitor, DataDog, or Dynatrace Preferred experience supporting Salesforce DevOps pipelines and working with Java, .NET, or Node.js application environments Exposure to AI/ …

Senior DevOps Platform Engineer

London, South East England, United Kingdom
CDW UK
including Salesforce-specific pipelines. Build and maintain Infrastructure as Code (IaC) using Terraform and Ansible. Design highly reliable, scalable, and secure infrastructure supporting performance-critical workloads. Build proactive monitoring, observability, and alerting with Prometheus, Grafana, Azure Monitor, DataDog, and Dynatrace. Troubleshoot complex system issues spanning applications, networks, and infrastructure. Define platform SLAs, SLOs, and governance standards for self-service use. … Infrastructure as Code with Terraform and Ansible, along with scripting in PowerShell, Python, or Bash Experience implementing GitOps workflows and managing platform SLAs, SLOs, and governance standards Familiarity with observability and monitoring tools including Prometheus, Grafana, Azure Monitor, DataDog, or Dynatrace Preferred experience supporting Salesforce DevOps pipelines and working with Java, .NET, or Node.js application environments Exposure to AI/ …

Lead Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Sanderson
AWS services (SNS, SQS, Lambda, DynamoDB). Drive automation across CI/CD pipelines using tools like GitHub Actions, Terraform, and Argo CD for seamless and secure deployments. Enhance observability using Prometheus, Grafana, Datadog, and CloudWatch, enabling proactive incident prevention. Own incident management and post-mortem practices, guiding the team through challenges calmly and driving meaningful improvement. Collaborate with global … Terraform, Ansible) and CI/CD automation (GitHub Actions, Jenkins, Harness). Familiarity with messaging, caching, and database systems: Kafka, Redis, MongoDB, Cassandra, PostgreSQL. Hands-on experience in monitoring, observability, and incident response frameworks using modern tooling. Strong leadership, mentoring, and stakeholder management skills, able to scale teams, set OKRs, and foster engineering excellence. An ability to remain composed, analytical …

Lead Engineer

London Area, United Kingdom
Hybrid / WFH Options
Sanderson
AWS services (SNS, SQS, Lambda, DynamoDB). Drive automation across CI/CD pipelines using tools like GitHub Actions, Terraform, and Argo CD for seamless and secure deployments. Enhance observability using Prometheus, Grafana, Datadog, and CloudWatch, enabling proactive incident prevention. Own incident management and post-mortem practices, guiding the team through challenges calmly and driving meaningful improvement. Collaborate with global … Terraform, Ansible) and CI/CD automation (GitHub Actions, Jenkins, Harness). Familiarity with messaging, caching, and database systems: Kafka, Redis, MongoDB, Cassandra, PostgreSQL. Hands-on experience in monitoring, observability, and incident response frameworks using modern tooling. Strong leadership, mentoring, and stakeholder management skills, able to scale teams, set OKRs, and foster engineering excellence. An ability to remain composed, analytical …

Lead Engineer

London, South East England, United Kingdom
Hybrid / WFH Options
Sanderson
AWS services (SNS, SQS, Lambda, DynamoDB). Drive automation across CI/CD pipelines using tools like GitHub Actions, Terraform, and Argo CD for seamless and secure deployments. Enhance observability using Prometheus, Grafana, Datadog, and CloudWatch, enabling proactive incident prevention. Own incident management and post-mortem practices, guiding the team through challenges calmly and driving meaningful improvement. Collaborate with global … Terraform, Ansible) and CI/CD automation (GitHub Actions, Jenkins, Harness). Familiarity with messaging, caching, and database systems: Kafka, Redis, MongoDB, Cassandra, PostgreSQL. Hands-on experience in monitoring, observability, and incident response frameworks using modern tooling. Strong leadership, mentoring, and stakeholder management skills, able to scale teams, set OKRs, and foster engineering excellence. An ability to remain composed, analytical …

Lead Engineer

London (City of London), South East England, United Kingdom
Hybrid / WFH Options
Sanderson
AWS services (SNS, SQS, Lambda, DynamoDB). Drive automation across CI/CD pipelines using tools like GitHub Actions, Terraform, and Argo CD for seamless and secure deployments. Enhance observability using Prometheus, Grafana, Datadog, and CloudWatch, enabling proactive incident prevention. Own incident management and post-mortem practices, guiding the team through challenges calmly and driving meaningful improvement. Collaborate with global … Terraform, Ansible) and CI/CD automation (GitHub Actions, Jenkins, Harness). Familiarity with messaging, caching, and database systems: Kafka, Redis, MongoDB, Cassandra, PostgreSQL. Hands-on experience in monitoring, observability, and incident response frameworks using modern tooling. Strong leadership, mentoring, and stakeholder management skills, able to scale teams, set OKRs, and foster engineering excellence. An ability to remain composed, analytical …

Staff Software Engineer

London Area, United Kingdom
Hybrid / WFH Options
KE Technology
focusing on scale, performance, and reliability. Why You’ll Love It Build and optimise real-time distributed systems at a global scale Lead deep dives into latency, throughput, and observability Stay close to the code while shaping architecture and direction Be part of an engineering-led culture with standout benefits: Full private health insurance Extended maternity and paternity leave In …

Staff Software Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
KE Technology
focusing on scale, performance, and reliability. Why You’ll Love It Build and optimise real-time distributed systems at a global scale Lead deep dives into latency, throughput, and observability Stay close to the code while shaping architecture and direction Be part of an engineering-led culture with standout benefits: Full private health insurance Extended maternity and paternity leave In …

Staff Software Engineer

London, South East England, United Kingdom
Hybrid / WFH Options
KE Technology
focusing on scale, performance, and reliability. Why You’ll Love It Build and optimise real-time distributed systems at a global scale Lead deep dives into latency, throughput, and observability Stay close to the code while shaping architecture and direction Be part of an engineering-led culture with standout benefits: Full private health insurance Extended maternity and paternity leave In …

Staff Software Engineer

London (City of London), South East England, United Kingdom
Hybrid / WFH Options
KE Technology
focusing on scale, performance, and reliability. Why You’ll Love It Build and optimise real-time distributed systems at a global scale Lead deep dives into latency, throughput, and observability Stay close to the code while shaping architecture and direction Be part of an engineering-led culture with standout benefits: Full private health insurance Extended maternity and paternity leave In …

AWS Solution Architect

City of London, London, United Kingdom
La Fosse
adoption of infrastructure-as-code and GitOps principles for consistent, automated delivery. Lead design forums and provide architectural governance across multiple projects. Develop cloud roadmaps covering network segmentation, identity, observability, and resilience. Embed security, compliance, and resilience into all architectural designs. Manage cost optimisation, including RI/SP planning and right-sizing. Mentor engineers and architects on AWS best practices …

AWS Solution Architect

London Area, United Kingdom
La Fosse
adoption of infrastructure-as-code and GitOps principles for consistent, automated delivery. Lead design forums and provide architectural governance across multiple projects. Develop cloud roadmaps covering network segmentation, identity, observability, and resilience. Embed security, compliance, and resilience into all architectural designs. Manage cost optimisation, including RI/SP planning and right-sizing. Mentor engineers and architects on AWS best practices …
Observability, London
10th Percentile: £64,500
25th Percentile: £73,750
Median: £90,000
75th Percentile: £115,000
90th Percentile: £158,500