Observability Jobs in London

726 to 750 of 818 Observability Jobs in London

System Engineer: £120k + Bonus/benefits (AI Trading)

City of London, London, United Kingdom
Hunter Bond
storage environments that power a global trading platform. The successful candidate will be involved in every layer of the technology stack, from hardware and operating systems to automation and observability, while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities: Manage a distributed compute environment and several petabyte-scale storage systems; Install, configure, and … software development practices (version control, agile methodologies); Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible); Exposure to distributed storage systems and related protocols; Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana); Strong written and verbal communication skills; Demonstrated ability to learn quickly and adapt to evolving technologies; Ability to work effectively in …

System Engineer: £120k + Bonus/benefits (AI Trading)

london, south east england, united kingdom
Hunter Bond
storage environments that power a global trading platform. The successful candidate will be involved in every layer of the technology stack, from hardware and operating systems to automation and observability, while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities: Manage a distributed compute environment and several petabyte-scale storage systems; Install, configure, and … software development practices (version control, agile methodologies); Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible); Exposure to distributed storage systems and related protocols; Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana); Strong written and verbal communication skills; Demonstrated ability to learn quickly and adapt to evolving technologies; Ability to work effectively in …

System Engineer: £120k + Bonus/benefits (AI Trading)

london (city of london), south east england, united kingdom
Hunter Bond
storage environments that power a global trading platform. The successful candidate will be involved in every layer of the technology stack, from hardware and operating systems to automation and observability, while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities: Manage a distributed compute environment and several petabyte-scale storage systems; Install, configure, and … software development practices (version control, agile methodologies); Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible); Exposure to distributed storage systems and related protocols; Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana); Strong written and verbal communication skills; Demonstrated ability to learn quickly and adapt to evolving technologies; Ability to work effectively in …

Linux Production Engineer

London Area, United Kingdom
Autonomai Recruitment
experience building technology 0→1, owning systems end-to-end, and working close to the metal. They will operate across everything from bare-metal Linux to modern build and observability stacks. Linux Platform Engineer – Trading Infrastructure. Overview: The firm is seeking a Linux Platform Engineer to join a small, high-impact engineering group supporting ML/AI-driven trading. … latency. Contribute to kernel-level debugging and system improvements. Automate Linux fleet builds, creating consistent, reproducible systems. Manage Kubernetes cluster infrastructure, networking, and container orchestration. Enhance observability. Analyze and optimize networking across the full TCP/IP stack. Investigate core dumps, memory bottlenecks, and CPU performance issues across distributed systems. Develop Python tooling for internal automation …

Linux Production Engineer

City of London, London, United Kingdom
Autonomai Recruitment
experience building technology 0→1, owning systems end-to-end, and working close to the metal. They will operate across everything from bare-metal Linux to modern build and observability stacks. Linux Platform Engineer – Trading Infrastructure. Overview: The firm is seeking a Linux Platform Engineer to join a small, high-impact engineering group supporting ML/AI-driven trading. … latency. Contribute to kernel-level debugging and system improvements. Automate Linux fleet builds, creating consistent, reproducible systems. Manage Kubernetes cluster infrastructure, networking, and container orchestration. Enhance observability. Analyze and optimize networking across the full TCP/IP stack. Investigate core dumps, memory bottlenecks, and CPU performance issues across distributed systems. Develop Python tooling for internal automation …

Staff Data Engineer

london, south east england, united kingdom
Hybrid / WFH Options
Fruition Group
Job Title: Staff Data Engineer. Location: London, Hybrid. Salary: c.£140,000 + bonus + share options. Why Apply? This is a unique opportunity to take a leading role in shaping the data strategy of a fast-growing Insurtech scale …

Staff Data Engineer

London, United Kingdom
Hybrid / WFH Options
Fruition Group
Job Title: Staff Data Engineer. Location: London, Hybrid. Salary: c.£140,000 + bonus + share options. Why Apply? This is a unique opportunity to take a leading role in shaping the data strategy of a fast-growing Insurtech scale …
Employment Type: Permanent

Staff Site Reliability Engineer - Observability

City of London, London, United Kingdom
Hybrid / WFH Options
Motive Group
Senior/Staff Site Reliability Engineer - Observability | London (Hybrid) If you care deeply about building and operating world-class infrastructure for AI at scale, this one’s worth your time. We’re working with a company that builds the backbone powering some of the most demanding AI workloads on the planet. Think large-scale GPU clusters, global telemetry systems, and … distributed training environments used by leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on … Designing and scaling observability for globally distributed GPU infrastructure; Building automation that cuts operational toil and improves reliability; Partnering with platform and infrastructure teams to deliver true visibility across complex AI systems. If you’ve built or operated telemetry stacks for large-scale, GPU-heavy, or multi-tenant environments - and want to work on cutting-edge problems in a business …

Staff Site Reliability Engineer - Observability

London Area, United Kingdom
Hybrid / WFH Options
Motive Group
Senior/Staff Site Reliability Engineer - Observability | London (Hybrid) If you care deeply about building and operating world-class infrastructure for AI at scale, this one’s worth your time. We’re working with a company that builds the backbone powering some of the most demanding AI workloads on the planet. Think large-scale GPU clusters, global telemetry systems, and … distributed training environments used by leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on … Designing and scaling observability for globally distributed GPU infrastructure; Building automation that cuts operational toil and improves reliability; Partnering with platform and infrastructure teams to deliver true visibility across complex AI systems. If you’ve built or operated telemetry stacks for large-scale, GPU-heavy, or multi-tenant environments - and want to work on cutting-edge problems in a business …

Staff Site Reliability Engineer - Observability

london, south east england, united kingdom
Hybrid / WFH Options
Motive Group
Senior/Staff Site Reliability Engineer - Observability | London (Hybrid) If you care deeply about building and operating world-class infrastructure for AI at scale, this one’s worth your time. We’re working with a company that builds the backbone powering some of the most demanding AI workloads on the planet. Think large-scale GPU clusters, global telemetry systems, and … distributed training environments used by leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on … Designing and scaling observability for globally distributed GPU infrastructure; Building automation that cuts operational toil and improves reliability; Partnering with platform and infrastructure teams to deliver true visibility across complex AI systems. If you’ve built or operated telemetry stacks for large-scale, GPU-heavy, or multi-tenant environments - and want to work on cutting-edge problems in a business …

Staff Site Reliability Engineer - Observability

london (city of london), south east england, united kingdom
Hybrid / WFH Options
Motive Group
Senior/Staff Site Reliability Engineer - Observability | London (Hybrid) If you care deeply about building and operating world-class infrastructure for AI at scale, this one’s worth your time. We’re working with a company that builds the backbone powering some of the most demanding AI workloads on the planet. Think large-scale GPU clusters, global telemetry systems, and … distributed training environments used by leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on … Designing and scaling observability for globally distributed GPU infrastructure; Building automation that cuts operational toil and improves reliability; Partnering with platform and infrastructure teams to deliver true visibility across complex AI systems. If you’ve built or operated telemetry stacks for large-scale, GPU-heavy, or multi-tenant environments - and want to work on cutting-edge problems in a business …

Site Reliability Engineer - AWS - Grafana - CloudWatch - ELK - UK Remote

London, United Kingdom
Hybrid / WFH Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability. Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana, the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe, their multi-tenanted AWS environments require someone who is able to reverse engineer product faults, or post-incident audits to ensure future … you'd like to hear more, send over a CV or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer - AWS - Grafana - CloudWatch - ELK - UK Remote

City of London, London, United Kingdom
Hybrid / WFH Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability. Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana, the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe, their multi-tenanted AWS environments require someone who is able to reverse engineer product faults, or post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability

Site Reliability Engineer - AWS - Grafana - CloudWatch - ELK - UK Remote

East London, London, United Kingdom
Hybrid / WFH Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability. Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana, the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe, their multi-tenanted AWS environments require someone who is able to reverse engineer product faults, or post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability

Site Reliability Engineer - AWS - Grafana - CloudWatch - ELK - UK Remote

Central London / West End, London, United Kingdom
Hybrid / WFH Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability. Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana, the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe, their multi-tenanted AWS environments require someone who is able to reverse engineer product faults, or post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability

Senior Data Engineer

London Area, United Kingdom
Hybrid / WFH Options
Identify Solutions
the past year and aggressive expansion across the UK, US, and EU, the company is scaling at pace. Data is the backbone: from APIs and pipelines to governance and observability, their data platform directly powers customer-facing products and AI-driven insights. They’re now hiring a Senior Data Engineer to own and shape this platform, building scalable, production-grade … systems that become the foundation for global brands. Why join? ✨ Greenfield impact – inherit a live but early platform, define best practice across structure, testing, observability, and governance. ✨ Direct product impact – your APIs, pipelines, and orchestration power the platform that 1,000+ brands rely on every day. ✨ AI at the core – work on infrastructure that enables machine learning and intelligent decision … doing: API strategy & development – own and scale FastAPI endpoints that deliver real-time access to platform data. Data pipeline development – build ingestion and replication pipelines with best-in-class observability, latency, and resilience. Platform technical vision – influence architecture and orchestration, shaping how the business handles data at scale. Data quality & governance – embed testing, freshness, lineage, and monitoring to ensure reliability …

Senior Data Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Identify Solutions
the past year and aggressive expansion across the UK, US, and EU, the company is scaling at pace. Data is the backbone: from APIs and pipelines to governance and observability, their data platform directly powers customer-facing products and AI-driven insights. They’re now hiring a Senior Data Engineer to own and shape this platform, building scalable, production-grade … systems that become the foundation for global brands. Why join? ✨ Greenfield impact – inherit a live but early platform, define best practice across structure, testing, observability, and governance. ✨ Direct product impact – your APIs, pipelines, and orchestration power the platform that 1,000+ brands rely on every day. ✨ AI at the core – work on infrastructure that enables machine learning and intelligent decision … doing: API strategy & development – own and scale FastAPI endpoints that deliver real-time access to platform data. Data pipeline development – build ingestion and replication pipelines with best-in-class observability, latency, and resilience. Platform technical vision – influence architecture and orchestration, shaping how the business handles data at scale. Data quality & governance – embed testing, freshness, lineage, and monitoring to ensure reliability …

Senior Data Engineer

london, south east england, united kingdom
Hybrid / WFH Options
Identify Solutions
the past year and aggressive expansion across the UK, US, and EU, the company is scaling at pace. Data is the backbone: from APIs and pipelines to governance and observability, their data platform directly powers customer-facing products and AI-driven insights. They’re now hiring a Senior Data Engineer to own and shape this platform, building scalable, production-grade … systems that become the foundation for global brands. Why join? ✨ Greenfield impact – inherit a live but early platform, define best practice across structure, testing, observability, and governance. ✨ Direct product impact – your APIs, pipelines, and orchestration power the platform that 1,000+ brands rely on every day. ✨ AI at the core – work on infrastructure that enables machine learning and intelligent decision … doing: API strategy & development – own and scale FastAPI endpoints that deliver real-time access to platform data. Data pipeline development – build ingestion and replication pipelines with best-in-class observability, latency, and resilience. Platform technical vision – influence architecture and orchestration, shaping how the business handles data at scale. Data quality & governance – embed testing, freshness, lineage, and monitoring to ensure reliability …

Senior Data Engineer

london (city of london), south east england, united kingdom
Hybrid / WFH Options
Identify Solutions
the past year and aggressive expansion across the UK, US, and EU, the company is scaling at pace. Data is the backbone: from APIs and pipelines to governance and observability, their data platform directly powers customer-facing products and AI-driven insights. They’re now hiring a Senior Data Engineer to own and shape this platform, building scalable, production-grade … systems that become the foundation for global brands. Why join? ✨ Greenfield impact – inherit a live but early platform, define best practice across structure, testing, observability, and governance. ✨ Direct product impact – your APIs, pipelines, and orchestration power the platform that 1,000+ brands rely on every day. ✨ AI at the core – work on infrastructure that enables machine learning and intelligent decision … doing: API strategy & development – own and scale FastAPI endpoints that deliver real-time access to platform data. Data pipeline development – build ingestion and replication pipelines with best-in-class observability, latency, and resilience. Platform technical vision – influence architecture and orchestration, shaping how the business handles data at scale. Data quality & governance – embed testing, freshness, lineage, and monitoring to ensure reliability …

DevOps Engineer

London Area, United Kingdom
Tribus
frameworks that support thousands of real-time processes across global markets. This isn’t a maintenance role - it’s an opportunity to modernise the firm’s CI/CD, observability, and runtime environments from the ground up. What you’ll be doing: Engineering and optimising CI/CD pipelines and container orchestration at scale; Modernising Linux-based deployment and runtime … low-latency environment. What we’re looking for: 5+ years’ experience in DevOps, Systems, or Platform Engineering; Deep knowledge of Linux, Python, and shell scripting; Proven experience with Kubernetes, observability tooling, and CI/CD frameworks; Strong grasp of distributed systems and performance tuning; Trading, Crypto or Hedge Fund Experience; Experience working in low-latency, high-frequency trading environments. Why …

DevOps Engineer

City of London, London, United Kingdom
Tribus
frameworks that support thousands of real-time processes across global markets. This isn’t a maintenance role - it’s an opportunity to modernise the firm’s CI/CD, observability, and runtime environments from the ground up. What you’ll be doing: Engineering and optimising CI/CD pipelines and container orchestration at scale; Modernising Linux-based deployment and runtime … low-latency environment. What we’re looking for: 5+ years’ experience in DevOps, Systems, or Platform Engineering; Deep knowledge of Linux, Python, and shell scripting; Proven experience with Kubernetes, observability tooling, and CI/CD frameworks; Strong grasp of distributed systems and performance tuning; Trading, Crypto or Hedge Fund Experience; Experience working in low-latency, high-frequency trading environments. Why …

DevOps Engineer

london, south east england, united kingdom
Tribus
frameworks that support thousands of real-time processes across global markets. This isn’t a maintenance role - it’s an opportunity to modernise the firm’s CI/CD, observability, and runtime environments from the ground up. What you’ll be doing: Engineering and optimising CI/CD pipelines and container orchestration at scale; Modernising Linux-based deployment and runtime … low-latency environment. What we’re looking for: 5+ years’ experience in DevOps, Systems, or Platform Engineering; Deep knowledge of Linux, Python, and shell scripting; Proven experience with Kubernetes, observability tooling, and CI/CD frameworks; Strong grasp of distributed systems and performance tuning; Trading, Crypto or Hedge Fund Experience; Experience working in low-latency, high-frequency trading environments. Why …

DevOps Engineer

london (city of london), south east england, united kingdom
Tribus
frameworks that support thousands of real-time processes across global markets. This isn’t a maintenance role - it’s an opportunity to modernise the firm’s CI/CD, observability, and runtime environments from the ground up. What you’ll be doing: Engineering and optimising CI/CD pipelines and container orchestration at scale; Modernising Linux-based deployment and runtime … low-latency environment. What we’re looking for: 5+ years’ experience in DevOps, Systems, or Platform Engineering; Deep knowledge of Linux, Python, and shell scripting; Proven experience with Kubernetes, observability tooling, and CI/CD frameworks; Strong grasp of distributed systems and performance tuning; Trading, Crypto or Hedge Fund Experience; Experience working in low-latency, high-frequency trading environments. Why …

Staff Software Engineer

City of London, London, United Kingdom
Burns Sheehan
to solve complex challenges. Drive innovation around cloud-native technologies and platform automation. Balance strategic vision with ~30% hands-on coding and design work. Promote best practice in reliability, observability, and scalability. The Ideal Staff Software Engineer: Proven experience operating at Staff+ level within a fast-paced engineering organisation. Strong background in cloud platforms (AWS or GCP) and deep knowledge … ability to build operators. Strong coding skills in Golang, Java, or C#, with experience in distributed systems. Demonstrated leadership across multiple squads and technical roadmaps. Expertise in operational excellence: observability, reliability, automation. This is an outstanding opportunity for a Staff Software Engineer to join a rapidly scaling company where you’ll play a pivotal role in shaping the technical foundations of …

Staff Software Engineer

London Area, United Kingdom
Burns Sheehan
to solve complex challenges. Drive innovation around cloud-native technologies and platform automation. Balance strategic vision with ~30% hands-on coding and design work. Promote best practice in reliability, observability, and scalability. The Ideal Staff Software Engineer: Proven experience operating at Staff+ level within a fast-paced engineering organisation. Strong background in cloud platforms (AWS or GCP) and deep knowledge … ability to build operators. Strong coding skills in Golang, Java, or C#, with experience in distributed systems. Demonstrated leadership across multiple squads and technical roadmaps. Expertise in operational excellence: observability, reliability, automation. This is an outstanding opportunity for a Staff Software Engineer to join a rapidly scaling company where you’ll play a pivotal role in shaping the technical foundations of …

Observability, London
10th Percentile: £64,500
25th Percentile: £73,750
Median: £90,000
75th Percentile: £115,000
90th Percentile: £158,500