Observability Jobs in the City of London

Senior TypeScript Back-End Engineer

london (city of london), south east england, united kingdom

Wave Talent

design, build, and support extensible, low-maintenance back-end services. Partner with Product, Design, Operations, and Growth to prioritise customer-facing and internal problems that drive value. Champion security, observability, and reliability best practices. Mentor teammates and help cultivate a healthy, innovative engineering culture. Tech Stack: Core: TypeScript/JavaScript (server-side) Cloud & Infra: AWS (cloud-based architectures), Docker, CI …/CD; IaC such as Terraform or CloudFormation Quality & Ops: Security and observability tooling/best practices Bonus exposure: React on the front end (nice to have) What we’re looking for We value skill and impact over strict year counts. You should have: Strong TypeScript/JavaScript fundamentals and experience building high-traffic server-side web applications. Solid understanding … of cloud-based application architecture (preferably AWS). Hands-on experience with Docker and CI/CD tooling. Practical grasp of security and observability best practices. Clear, collaborative communication and leadership skills. Why join? Work on meaningful problems in a fun, healthy, productive environment. Competitive package: £80k–£100k + Bonus, up to 10% employer pension, 28 days holiday (plus bank …

Posted: 3 days ago

AWS Cloud Engineer

City of London, London, United Kingdom
Hybrid / WFH Options

Advanced Resource Managers

manage and support a customer’s AWS and Data platform To be technical hands on Provide Incident and problem management on the AWS IaaS and PaaS Platform Monitoring and observability of system and platform performance Collaboration with development and build teams on application and platform deployments and changes Involvement in the resolution of Incidents and problems in an efficient and … timely manner Actively monitor an AWS platform and components for technical issues Implement and improve on existing monitoring and observability solution To be involved in the resolution of technical incidents tickets Assist in the root cause analysis of incidents Assist with improving efficiency and processes within the team Examining traces and logs Escalate incidents and problems to the appropriate teams …

Posted: Yesterday

AWS Cloud Engineer

london (city of london), south east england, united kingdom
Hybrid / WFH Options

Advanced Resource Managers

Posted: Yesterday

GenAI Engineer

City of London, London, United Kingdom

Clarity (formerly Anecdote)

up and harden RAG pipelines (indexing, retrieval policies, grounding, guardrails) and agent frameworks. Take basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost tuning. Participate in on‐call for your area and drive root‐cause analysis with crisp follow‐ups. 15% Collaborate Pair with back‐end & front‐end to wire extractors … evals; hands‐on with time‐series analysis (forecasting, change‐point, drift). Cloud & ops: Basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost control. Communication: You explain results clearly, align stakeholders, and write crisp docs. Bonus points DevOps wizardry; GPU/accelerator experience. Multimodal pipelines (text + voice + screenshots).

Posted: 2 days ago

GenAI Engineer

london (city of london), south east england, united kingdom

Clarity (formerly Anecdote)

Posted: 2 days ago

Staff Site Reliability Engineer - Observability

City of London, London, United Kingdom
Hybrid / WFH Options

Motive Group

Senior/Staff Site Reliability Engineer - Observability | London (Hybrid) If you care deeply about building and operating world-class infrastructure for AI at scale, this one’s worth your time. We’re working with a company that builds the backbone powering some of the most demanding AI workloads on the planet. Think large-scale GPU clusters, global telemetry systems, and … distributed training environments used by leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on … Designing and scaling observability for globally distributed GPU infrastructure Building automation that cuts operational toil and improves reliability Partnering with platform and infrastructure teams to deliver true visibility across complex AI systems If you’ve built or operated telemetry stacks for large-scale, GPU-heavy, or multi-tenant environments - and want to work on cutting-edge problems in a business …

Posted: 2 days ago

Staff Site Reliability Engineer - Observability

london (city of london), south east england, united kingdom
Hybrid / WFH Options

Motive Group

Posted: 2 days ago

Site Reliability Engineer - AWS - Grafana - Cloudwatch - ELK - UK Remote

City of London, London, United Kingdom
Hybrid / WFH Options

Opus Recruitment Solutions

Posted: Yesterday

Senior Data Engineer

City of London, London, United Kingdom
Hybrid / WFH Options

Identify Solutions

the past year and aggressive expansion across the UK, US, and EU, the company is scaling at pace. Data is the backbone: from APIs and pipelines to governance and observability, their data platform directly powers customer-facing products and AI-driven insights. They’re now hiring a Senior Data Engineer to own and shape this platform, building scalable, production-grade … systems that become the foundation for global brands. Why join? ✨ Greenfield impact – inherit a live but early platform, define best practice across structure, testing, observability, and governance. ✨ Direct product impact – your APIs, pipelines, and orchestration power the platform that 1,000+ brands rely on every day. ✨ AI at the core – work on infrastructure that enables machine learning and intelligent decision … doing: API strategy & development – own and scale FastAPI endpoints that deliver real-time access to platform data. Data pipeline development – build ingestion and replication pipelines with best-in-class observability, latency, and resilience. Platform technical vision – influence architecture and orchestration, shaping how the business handles data at scale. Data quality & governance – embed testing, freshness, lineage, and monitoring to ensure reliability …

Posted: 2 days ago

Senior Data Engineer

london (city of london), south east england, united kingdom
Hybrid / WFH Options

Identify Solutions

Posted: 2 days ago

DevOps Engineer

City of London, London, United Kingdom

Tribus

frameworks that support thousands of real-time processes across global markets. This isn’t a maintenance role - it’s an opportunity to modernise the firm’s CI/CD, observability, and runtime environments from the ground up. What you’ll be doing: Engineering and optimising CI/CD pipelines and container orchestration at scale Modernising Linux-based deployment and runtime … low-latency environment What we’re looking for: 5+ years’ experience in DevOps, Systems, or Platform Engineering Deep knowledge of Linux, Python, and shell scripting Proven experience with Kubernetes, observability tooling, and CI/CD frameworks Strong grasp of distributed systems and performance tuning Trading, Crypto or Hedge Fund Experience Experience working in low-latency, high-frequency trading environments Why …

Posted: Yesterday

DevOps Engineer

london (city of london), south east england, united kingdom

Tribus

Posted: 6 days ago

Staff Software Engineer

City of London, London, United Kingdom

Burns Sheehan

to solve complex challenges. Drive innovation around cloud-native technologies and platform automation. Balance strategic vision with ~30% hands-on coding and design work. Promote best practice in reliability, observability, and scalability. The Ideal Staff Software Engineer Proven experience operating at Staff+ level within a fast-paced engineering organisation. Strong background in cloud platforms (AWS or GCP) and deep knowledge … ability to build operators. Strong coding skills in Golang, Java, or C#, with experience in distributed systems. Demonstrated leadership across multiple squads and technical roadmaps. Expertise in operational excellence: observability, reliability, automation. This is an outstanding opportunity for a Staff Software Engineer to join a rapidly scaling company where you’ll play a pivotal role in shaping the technical foundations of …

Posted: 2 days ago

Staff Software Engineer

london (city of london), south east england, united kingdom

Burns Sheehan

Posted: 2 days ago

Principal Platform Engineer | Fintech | London | Up to £180k + Equity

City of London, London, United Kingdom

Maze

at Tier 1 banks. We're looking for a Principal Platform Engineer to drive the infrastructure behind mission-critical systems: think active-active, five-nines uptime, and real-time observability at global scale. What You'll Do: Own platform architecture for our next-gen ledger infrastructure Scale multi-region Kubernetes environments across cloud & on-prem Harden distributed systems (Kafka, Redis … CockroachDB) for global banking workloads Lead our AI-powered SRE approach: observability, remediation, and auto-response Enforce zero-trust, multi-tenant security and compliance (SOC2, ISO 27001) Define IaC foundations (Terraform, GitOps, Helm) What We're Looking For: Expert with Kubernetes and Distributed Systems Experience building production infrastructure at scale (multi-region, high-availability) Extensive experience building both on-prem …

Posted: 2 days ago

Principal Site Reliability Engineer | Stealth Fintech | London | Up to 180k + Equity

City of London, Greater London, UK

Maze

Employment Type: Part-time

Posted: 2 days ago

Principal Platform Engineer | Fintech | London | Up to £180k + Equity

london (city of london), south east england, united kingdom

Maze

Posted: 7 days ago

Data Engineer

City of London, London, United Kingdom

83zero Ltd

to translate complex business requirements into data-driven solutions. Write production-grade SQL and ensure data quality through testing, documentation, and version control. Promote best practices around data reliability, observability, and maintainability. (Optional but valued) Contribute to Infrastructure as Code and CI/CD pipelines (e.g., Terraform, GitHub Actions). Skills & Experience 5+ years of experience in data-focused roles … other data visualisation tools. Familiarity with orchestration tools such as Airflow, Prefect, or Dagster. Understanding of CI/CD practices in data and analytics engineering. Knowledge of data governance, observability, and security best practices in cloud environments.

Employment Type: Permanent

Salary: £90000 - £100000/annum

Posted: Yesterday

Data Engineer

london (city of london), south east england, united kingdom

83data

Posted: 3 days ago

Reliability & Operations Engineer

City of London, London, United Kingdom

Heart Mind Talent

Heart Mind Talent are partnering with Verified Global to hire a Reliability & Operations Engineer based in Central London. Verified Global builds cutting-edge algorithms to flip the odds in sports betting. Every hour, millions of fans place sub-optimal bets.

Posted: 3 days ago

Reliability & Operations Engineer

london (city of london), south east england, united kingdom

Heart Mind Talent

Posted: 3 days ago

Staff Backend Engineer (AI Lab | £170,000)

City of London, London, United Kingdom
Hybrid / WFH Options

Paradigm Talent

Role: Staff Software Engineer (Python | Backend | Infrastructure) Location: Hybrid - 2-3 days in London Office Compensation: Up to £170,000 + equity We’re working with a frontier AI lab pushing the boundaries of computational biology, combining machine learning, cloud …

Posted: 4 days ago

Staff Backend Engineer (AI Lab | £170,000)

london (city of london), south east england, united kingdom
Hybrid / WFH Options

Paradigm Talent

Posted: 14 days ago

Staff Software Engineer

City of London, London, United Kingdom

SR2 | Socially Responsible Recruitment | Certified B Corporation™

Rust is a bonus) and experience designing distributed systems, REST APIs, and microservices Data & Infrastructure: Experience with Kafka or similar streaming technologies, PostgreSQL, and time-series data management DevOps & Observability: Familiar with Docker, monitoring, and observability tools Bonus Points For: Prior experience with energy markets, trading systems, industrial control systems, or IoT environments If you’re passionate about applying your …

Posted: 2 days ago

Staff Software Engineer

london (city of london), south east england, united kingdom

SR2 | Socially Responsible Recruitment | Certified B Corporation™

Posted: 2 days ago

Salary Guide

Observability in the City of London

10th Percentile: £73,125
25th Percentile: £73,750
Median: £85,000
75th Percentile: £105,000

251 to 275 of 289 Observability Jobs in the City of London