201 to 225 of 243 Observability Jobs in London

Contract Software Engineer (Observability & Telemetry)

Hiring Organisation
Xpertise
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
GBP Annual
Real Time Kernel Observability Engineer | Contract | Inside IR35 | Hybrid London We're working with a global Real Time data platform operating in ultra-low latency, high-throughput environments across distributed systems. This role sits in a specialist engineering team building Kernel-level observability and telemetry infrastructure used to monitor … understand system behaviour in Real Time. What you'll be doing Build Kernel-level observability and instrumentation systems for distributed Real Time infrastructure Develop telemetry pipelines using eBPF-based tooling (metrics, logs, traces at Kernel level) Design and implement system-wide visibility across latency-critical services Work with hotspot detection ...

Performance and Monitoring Engineer

Hiring Organisation
Solus Accident Repair Centres
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 50,000 Annual
talented Performance and Monitoring Engineer to help us strengthen the stability, reliability and performance of our systems. If you're passionate about monitoring, observability and using data to proactively improve service health, this is a great opportunity to make a real impact across a large ...

DevOps Engineer ( Azure )

Hiring Organisation
Experis
Location
Wembley, England, United Kingdom
Responsibilities Observability & Monitoring Platform Design, implement, and own an Azure observability playbook, delivering comprehensive dashboards, alerting rules, and operational runbooks using Application Insights, Log Analytics, and Kusto Query Language (KQL). AIOps & Intelligent Automation Develop AI‐driven alerting and detection mechanisms to surface early‐warning signals, including IP reputation degradation … scale. Infrastructure as Code Expertise Deep proficiency in Terraform, including module design, remote state management, workspace strategies, and multi‐environment deployment patterns. Monitoring & Observability Expertise Advanced experience with KQL for Azure Log Analytics, with the ability to design and build custom Azure Monitor Workbooks for operational insight and reporting. Security ...

Head of Infrastructure

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
platform and infrastructure strategy Design and evolve cloud architecture to support scale, resilience, and performance Set standards for infrastructure, CI/CD, environments, and observability Make architectural decisions and trade‐offs Developer Experience (DevEx) Provide infrastructure for the development team to code, test and deploy efficiently Advise during design sessions … growing company Ability to operate production systems under pressure Deep hands‐on experience with the AWS cloud platform Strong background in reliability, observability, and incident management Experience leading or mentoring engineers What we offer in return 💰 Competitive salary depending on experience 🏝️ 27 days of annual leave (including 3 days Christmas ...

Senior Site Reliability Engineer

Hiring Organisation
Realm
Location
City of London, London, United Kingdom
High-growth infrastructure company focused on delivering large-scale compute, data centre capacity, and power solutions for advanced machine learning workloads. Platforms support leading research and industry teams requiring high-performance computing at significant scale. ...

SRE Observability Engineer

Hiring Organisation
Access Computer Consulting
Location
City of London, London, United Kingdom
Employment Type
Contract
Contract Rate
£350 - £450/day
recruiting for an SRE Observability Engineer to work in London 2-3 days a week, remaining time remote. The role falls inside IR35 so you will be required to work through an umbrella company for the duration of the contract. This is a 6 month contract which will transfer … permanent role after the initial contract term. You will be responsible for collaborating across various organisations within the client to understand and develop observability solutions for enterprise-wide deployment at scale. You will also manage the legacy monitoring stack across the Production Management organisation within the client. You must have ...

DevOps Engineer

Hiring Organisation
Autonomai Recruitment
Location
London Area, United Kingdom
performance and resilience. Build and extend network automation workflows to configure and manage trading infrastructure (routers, switches, security, and connectivity). Define and implement observability for services and infrastructure using metrics, logging, and alerting (e.g., Prometheus, Grafana, and related tooling). Key requirements Strong backend development experience with Python, including … experience building APIs (e.g., FastAPI or similar frameworks). Experience with Prometheus‐style observability: metrics, alerting, and dashboards; familiarity with Grafana is a plus. Hands‐on experience with ClickHouse or similar high‐performance data stores is a strong advantage. Practical experience with network automation; Ansible or similar configuration‐management tools ...

Senior Platform Engineer (Fully Remote) - GKE, GCP, Terraform

Hiring Organisation
Sanderson Recruitment
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
manage workloads using Helm with strong isolation and configuration practices Own and improve CI/CD pipelines using Azure DevOps and GitOps Embed observability across the platform (monitoring, logging, alerting, tracing) Define and enforce platform standards, patterns and best practices Produce and maintain high-quality documentation, diagrams and runbooks Lead … expertise, particularly Azure DevOps Git-based workflows, GitOps and tools such as Argo CD Experience with service mesh technologies (e.g. Istio) Exposure to observability/APM tooling Confident technical leader with experience setting standards and mentoring others Comfortable working in shared platform environments Reasonable Adjustments: Respect and equality are core ...

Site Reliability Engineer

Hiring Organisation
EQUALS
Location
Greater London, England, United Kingdom
recommendation engine that matches people by musical taste. THE ROLE We're looking for a Site Reliability Engineer to own the infrastructure, observability, and operational health of the Equals platform. You'll be the person who monitors systems needs and health to provide a seamless user experience while providing traceability … 1B+ rows) - Manage Cloudflare (WAF, bot management, DNS, firewall rules) - Make cost-conscious infrastructure decisions - right-sizing instances, storage tiering, optimizing spend Monitoring & Observability - Own the Datadog APM setup: tracing, alerting, dashboards, log management - Maintain and tune alert channels integrated with Slack - Reduce alert fatigue by tuning thresholds, suppressing false ...

Data Reliability Engineer

Hiring Organisation
Ashdown Group
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You'll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. You'll take ownership … style roles, with strong SQL and Python skills and experience working in modern cloud-based data environments. Hands-on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within the Azure ...

Forward Deployed Engineer

Hiring Organisation
Novatus
Location
London Area, United Kingdom
Novatus Global is a Series B scale-up RegTech SaaS provider and boutique advisory firm, helping financial institutions manage their most complex regulatory requirements. We combine deep consulting expertise with cutting-edge SaaS solutions, enabling ...

Senior Software Engineer – AI / Agentic Systems

Hiring Organisation
MA (Montreal Associates)
Location
City of London, London, United Kingdom
grade AI platform. You’ll operate at the core of the product engineering function—designing systems that power autonomous agents, orchestrate workflows, and enable observability at scale. This is not just another backend role. You’ll influence architecture, mentor engineers, and help define the technical direction of a rapidly growing … Lead design and code reviews, ensuring high standards of quality and security Collaborate closely with AI research, product, and infrastructure teams Improve system reliability, observability, and scalability Mentor engineers and act as a technical multiplier across teams Champion best practices, tooling, and engineering excellence Proactively identify and resolve technical debt ...

Director - Principal Engineer (Java/Angular/AI)

Hiring Organisation
Robert Walters
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£140,000 - £170,000 per annum
volumes of financial and transactional data Contribute directly to architecture, system design, and hands-on software development Drive engineering best practices across automation, testing, observability, and performance Build resilient, production-grade systems with a strong focus on reliability and scalability Work across the full software development lifecycle from design through … scalability, and high-availability systems Experience building automated, production-grade platforms with minimal manual intervention Familiarity with cloud-native technologies, CI/CD, and observability tooling Strong engineering mindset with a hands-on approach to development Interest in modern engineering tooling, including AI-assisted development workflows Robert Walters Operations Limited ...

Platform Engineer: £120k + Bonus/benefits (AI Trading)

Hiring Organisation
Hunter Bond
Location
City of London, London, United Kingdom
global trading platform. The successful candidate will be involved in every layer of the technology stack—from hardware and operating systems to automation and observability—while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities Manage a distributed compute environment and several petabyte-scale … agile methodologies) Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible) Exposure to distributed storage systems and related protocols Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana) Strong written and verbal communication skills Demonstrated ability to learn quickly and adapt to evolving technologies ...

Lead Software Engineer

Hiring Organisation
5V Video
Location
City of London, London, United Kingdom
+ AWS (Lambda, API Gateway, S3, DynamoDB) Handling event-driven architectures (Kafka, SNS/SQS, etc.) Driving system design decisions across distributed systems Improving observability, reliability, and performance in production Debugging complex issues and leading resolution across teams Staying hands-on while setting technical direction and standards Tech Stack Python … Lambda, API Gateway, S3, DynamoDB, IAM) Event-driven systems (Kafka, SNS/SQS) CI/CD (Concourse, Git workflows) Databases (Postgres, DynamoDB, Couchbase) Observability (Prometheus, Grafana, CloudWatch) What You’ll Bring Strong backend engineering experience (Python preferred) Proven experience building distributed systems at scale Deep understanding of microservices + event ...

Head of Infrastructure

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
cloud architecture, operational resilience, developer experience and infrastructure team leadership. You will be responsible for shaping the long term infrastructure roadmap, improving reliability and observability, strengthening incident response and ensuring the platform can support a growing customer base and increasingly critical product suite. This is a role for someone … platform strategy Design and evolve the AWS cloud architecture to support scale, resilience and performance Set standards across infrastructure, CI/CD, environments and observability Lead production reliability, uptime, incident response and post incident reviews Improve monitoring, alerting and on call practices to ensure they are effective and sustainable Partner ...

Remote Network Monitoring Specialist - Streaming Telemetry

Hiring Organisation
Akkodis
Location
London, United Kingdom
Employment Type
Permanent
Salary
£70000 - £75000/annum
ensure the environment is fully visible, measurable and supportable from day one. The role would suit someone with strong experience across network observability, alerting, telemetry, dashboards, service health, performance baselining and operational handover. The client is open to different monitoring backgrounds, particularly where candidates have worked with tools such … solutions across newly delivered network infrastructure. Build monitoring capability that provides clear visibility of network health, performance and service availability. Work with monitoring and observability platforms such as VictoriaMetrics, Prometheus, Grafana, Nagios, Zabbix, InfluxDB, SolarWinds, PRTG, Datadog, Elastic or similar. Support metrics ingestion, retention, alerting, dashboarding and performance visibility. Build ...

Remote Network Monitoring Engineer - VictoriaMetrics

Hiring Organisation
Akkodis
Location
London, United Kingdom
Employment Type
Permanent
Salary
£70000 - £75000/annum
VictoriaMetrics in a production environment, including configuration, optimisation, ingestion, retention and performance tuning. You will also work across streaming telemetry, Nagios, Grafana and wider observability tooling. This would suit someone with strong network monitoring experience who is comfortable taking ownership of a critical technical workstream in a project-led environment. … Looking for: Strong hands-on experience with VictoriaMetrics in a production environment. Previous experience in a senior network monitoring, network engineering or observability-focused role. Experience working in a telecoms, ISP, managed network or large-scale infrastructure environment. Good understanding of time-series monitoring, metrics ingestion, retention and performance tuning. ...

Site Reliability Engineer (Bare Metal Infrastructure)

Hiring Organisation
Hunter Bond
Location
London Area, United Kingdom
multi-petabyte infrastructure Writing Python to automate anything manual or repetitive Working closely with engineers across the business to improve reliability and performance Enhancing observability, monitoring and system transparency Driving automation across config management and container environments What they’re looking for (or a willingness to learn the below … years’ experience working deeply with Linux Familiarity with monitoring/observability tooling (ELK, OpenTelemetry, VictoriaMetrics) Strong Python skills (automation/scripting) Experience with CI/CD tooling is a plus Exposure to Docker and container ecosystems is a plus Experience with Ansible, Chef, Puppet or similar Experience working with large ...

AI Architect

Hiring Organisation
Stackstudio Digital Ltd
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
From £450 to £500 per day
into high value solutions Enforce IAM least privilege with IAM Conditions, organisation policies, and scoped service accounts; integrate BeyondCorp for zero trust access Operationalise observability using Cloud Logging, Cloud Monitoring, Error Reporting, Trace, and Profiler; build model/LLM telemetry dashboards and alerts Identify the right AI/ML frameworks … patterns, vector databases, embeddings, and prompt/guardrail engineering Desirable Skills/Knowledge/Experience Knowledge of MLOps/AgentOps, CI/CD, and observability Strong understanding of regulated financial services environments Proven experience implementing AI risk controls, model governance, and auditability Ensure alignment with FCA, PRA, data privacy, model ...

SRE Consultant

Hiring Organisation
Akkodis
Location
City of London, London, United Kingdom
Employment Type
Permanent
Salary
£90000 - £100000/annum
include: Define and embed SRE engagement models aligned to modern engineering and traditional ITSM/ITIL practices Establish SLIs, SLOs, and Error Budgets Shape observability strategies using metrics, logs, and traces Design incident response models and post-incident learning loops Reduce toil through automation and engineering excellence Deliver SRE capability … Looking For Extensive experience in SRE, cloud operations, or DevOps Proven consulting or advisory background Experience with AWS, Azure, or GCP Strong observability and incident management expertise Ability to obtain UK SC clearance Modis International Ltd acts as an employment agency for permanent recruitment and an employment business ...

Azure Integration Engineer

Hiring Organisation
McCabe & Barton
Location
London Area, United Kingdom
Duration until 31/12/2026 3 days in office in London Daily rate £600 inside or match your expectation Job Description Our client are looking for an experienced Azure Integration Developer with SRE ...

Staff Software Engineer I

Hiring Organisation
Stepstone UK
Location
South East London, London, United Kingdom
Employment Type
Permanent
Company Description At The Stepstone Group, we have a simple yet very important mission: The right job for everyone. Using our data, platform, and technology, we create opportunities for jobseekers and companies around the world to find ...

Telemetry and Observability Engineer

Hiring Organisation
Oscar Associates (UK) Limited
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
GBP 475 - 515 Daily
Telemetry & Observability Engineer Contract £475-£515 p/d Inside IR35 London 3 days on site 6 month contract We are seeking a highly skilled Telemetry & Observability Engineer to join a large-scale enterprise engineering environment focused on improving system reliability, visibility, and operational intelligence across complex distributed platforms ...

Vice President - Full Stack Engineer (Java/Angular/AI)

Hiring Organisation
Robert Walters
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£85,000 - £100,000 per annum
APIs using modern engineering practices Contribute to system design, architecture discussions, and technical decision-making Build resilient, automated systems with strong focus on reliability, observability, and performance Work closely with engineering and product teams to deliver production-grade solutions Contribute to CI/CD, testing, monitoring, and operational improvements across … working with high-volume, scalable systems Familiarity with event-driven architecture and messaging systems Experience with cloud-native technologies, CI/CD pipelines, and observability tooling Strong hands-on engineering mindset and interest in modern development tooling, including AI-assisted workflows Robert Walters Operations Limited is an employment business ...