1 to 25 of 186 Observability Jobs in Central London
City of London, Greater London, UK TEKsystems
Storage, Compute, gcloud CLI, VPC, IAM, GCE, GCS, GKE, Pub/Sub, Cloud Run, Cloud SQL, BigQuery, Dataflow, Bigtable, Firestore. GCP: Networking, Security tools/Best Practices. Observability: Operations suite, Logging, Monitoring, Alerting. Additional Skills: Good understanding of Linux OS. Bash, Scripting, Automation, Ansible, Networking, Security. Hands-on experience with DevOps Principles and Tools. Hands-on with Terraform More ❯
City of London, London, United Kingdom Hybrid / WFH Options Amber Labs
working in Agile teams using tools like Git, Jira, and Confluence Eligible for SC and NPPV3 clearance Desirable: Container orchestration with Kubernetes HashiCorp tools: Vault, Consul, Packer Monitoring and observability with Grafana, Prometheus, or similar Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology Strong problem-solving mindset More ❯
City of London, London, United Kingdom BGC Group
built on Solace PubSub+, ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging More ❯
City of London, London, United Kingdom Hybrid / WFH Options Explore Group
financial institutions. What You'll Do Maintain and improve our AWS-based infrastructure using Infrastructure-as-Code (Terraform) Support and scale Kubernetes clusters hosting critical microservices Design and enhance observability, alerting, and incident response processes Collaborate closely with engineers to ensure systems are reliable, secure, and performant Lead root cause analysis for production incidents and help prevent recurrence Build tooling More ❯
City of London, London, United Kingdom Hybrid / WFH Options Vertus Partners
and scalability of a real-time trading environment used by both internal and external clients. While production support remains an important aspect, this position is heavily weighted toward improving observability, driving proactive engineering practices, and developing tooling to eliminate repetitive manual tasks. You'll collaborate closely with developers, traders, and global colleagues to make meaningful changes to how the environment … is monitored, managed, and scaled. Key Responsibilities: Lead the development of automation and monitoring solutions to improve system resilience and eliminate recurring manual work Own and evolve observability practices using tools like Prometheus, Grafana, Splunk, Geneos, Corvil, etc. Engage directly with senior traders and engineers to troubleshoot complex trading system issues and improve end-to-end workflows Drive post-incident More ❯
City of London, London, United Kingdom Radley James
services environment Strong technical skills in Linux/Unix systems, SQL, and scripting Strong experience with a programming language such as Python, Java, etc. Strong experience with monitoring and observability tools (Prometheus, Grafana, Splunk, Geneos, OpenTelemetry, Corvil) Familiarity with cloud platforms, containerization (e.g., Kubernetes, Docker), and CI (continuous integration)/CD (continuous delivery) pipelines Strong understanding of the trade lifecycle More ❯
City of London, London, United Kingdom Ultralytics
architectures, as described by thought leaders like Martin Fowler. Hands-on experience building and maintaining complex CI/CD pipelines, preferably with GitHub Actions. Familiarity with monitoring and observability tools (e.g., Prometheus, Grafana, Google Cloud's operations suite). A solid understanding of networking principles and cloud security best practices. Experience with other cloud platforms like Amazon Web Services More ❯
City of London, London, United Kingdom Oliver Bernard
build and deployment infrastructure. The company builds high-performance, secure web trading platforms and delivers real-time financial data to global clients. This role is key to driving automation, observability, and infrastructure efficiency across hybrid environments, ensuring compliance with secure engineering and ISO-aligned practices. Key Responsibilities Enhance and maintain a hybrid build infrastructure across on-prem and AWS Administer … and optimise Kubernetes clusters and containerised pipelines Implement and maintain Infrastructure as Code using Terraform Improve observability and resilience using tools like Prometheus Manage and monitor GitLab CI/CD pipelines for multi-platform builds (Linux, Windows, macOS) Collaborate with engineering teams to optimise developer workflows and apply DevOps best practices Set clear policies and self-healing automation standards across More ❯
City of London, England, United Kingdom Whitehall Resources Ltd
in managing cloud infrastructure, ensuring the reliability of production systems, and improving end-to-end deployment pipelines. This role combines deep operational responsibilities with a strong focus on automation, observability, and continuous improvement. You will be responsible for maintaining high system availability, enabling rapid delivery through CI/CD, and supporting development teams with robust infrastructure and tooling. A key … incidents with root cause analysis and preventive measures. 3. Handle change requests, track recurring issues, and work on long-term fixes to improve system stability. 4. Implement and maintain observability solutions using Prometheus, Grafana, and Splunk. 5. Write PromQL queries for custom monitoring dashboards, alerting, and diagnostics. 6. Manage and optimize CI/CD pipelines for automated testing, deployment, and … at the DevOps Engineer level 2. Incident, change & problem management experience. This role is heavily operation-oriented, including on-call requirements 3. Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL 4. Proficient in one or more languages of Python, Go, Bash, SQL 5. Familiar with GitHub/GitOps/container More ❯
City of London, Greater London, UK Hybrid / WFH Options Arcus Search
practices with a strong emphasis on automation, self-service, and operational excellence. Tech You'll Use: Azure & AWS (production experience) Kubernetes (EKS preferred) Terraform & GitHub Actions CI/CD, observability tooling (Grafana, Prometheus), containerisation (Docker) What You'll Be Doing: Designing and implementing secure, resilient AWS infrastructure Building CI/CD pipelines and reusable deployment patterns Advising on cloud-native More ❯
City of London, London, United Kingdom Marlin Selection Recruitment
For: 3+ years’ hands-on experience with Solace PubSub+ in a production environment Strong knowledge of WAN-based distributed systems and networking fundamentals Experience with Prometheus and Grafana for observability and alerting Confident in Linux/Unix systems and scripting (Bash, Python, etc.) Excellent problem-solving instincts and attention to detail Strong communicator who works well across technical teams Bonus More ❯
City of London, England, United Kingdom JAM Recruitment
Site Reliability/DevOps Engineer London - 5 Days Onsite Up to £550 per day (Umbrella, Inside IR35) 12-Month Contract Must hold live and transferable DV Clearance Are you passionate about reliability, automation, and supporting mission-critical systems? Join this More ❯
City of London, London, United Kingdom H&P Executive Search
on Solace PubSub+, ensuring high availability, optimal performance, and reliability across production and non-production environments. You will be working on incident response, capacity planning, WAN optimization, and system observability, so you should have experience with tools such as Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers Provide production support for messaging-related incidents More ❯
City of London, London, United Kingdom AI71
multi-tenant SaaS or large enterprise application. Certifications: AWS Certified Solutions Architect, Google Professional Cloud Architect, Azure Solutions Architect Expert. Experience in data architecture, AI/ML integration, and observability frameworks. More ❯
City of London, London, United Kingdom Ownera
optimize data flow, connectivity, and interoperability Help to implement best practices and process improvements to enhance delivery efficiency and team performance Work with various internal teams to continuously improve observability and supportability capabilities of the company platform Key Requirements A highly motivated, technical and detail-oriented support engineer, able to work autonomously with minimal direction, passionate about learning new things More ❯
City of London, England, United Kingdom The Boston Consulting Group GmbH
Locations : Canary Wharf | Boston Who We Are Boston Consulting Group partners with leaders in business and society to tackle their most important challenges and capture their greatest opportunities. BCG was the pioneer in business strategy when it was founded in More ❯
City of London, London, United Kingdom Levy Global
We’re seeking an experienced contractor to support the delivery of observability solutions for a new, large-scale infrastructure environment. This role focuses on developing insightful and automated Grafana dashboards, with a strong emphasis on data integration and actionable telemetry. Required Skills Excellent, concise communication skills - essential for collaborating with technical teams to shape observability outputs. Deep experience with Grafana … dashboard creation, templating, and performance optimization. Strong understanding of PromQL, VictoriaMetrics, or VictoriaLogs query languages. Ability to interpret and map RESTful API data into observability pipelines and dashboards. Familiarity with IaC outputs and tooling (e.g., Terraform) as data sources for observability. Solid programming ability in Golang (preferred) or Python for automation and integration. Strong collaboration skills to work with cross More ❯
City of London, London, United Kingdom Caspian One
throughput applications Develop and refine automation solutions using Ansible, Python, and Terraform Troubleshoot hardware, networking, and performance issues in various environments Deploy monitoring and log aggregation tools to improve observability Collaborate with teams to identify bottlenecks and deploy scalable, automated solutions What We're Looking For: 6+ years of Linux system administration and engineering experience in performance-critical environments Proficiency … in Python and Bash scripting, with hands-on Ansible experience Solid networking fundamentals: IP addressing, VLANs, etc. Familiarity with observability tools like Prometheus, Grafana, and ELK Infrastructure-as-code experience with Terraform and CI/CD pipelines Proven ability to resolve complex system-level issues and performance challenges Knowledge of container orchestration tools (Docker/containers, Kubernetes) Desirable: Experience with More ❯
City of London, London, United Kingdom Ascendion
service mesh solutions across our distributed systems. In this role, you will lead the design and operation of Kong Mesh (based on Kuma) for managing microservices communication, security, and observability at scale. You’ll play a crucial role in defining service-to-service architecture and ensuring platform reliability, scalability, and security. Key Responsibilities: • Lead the design and deployment of Kong … Mesh across our environments (on-prem and cloud). • Define and enforce best practices for service mesh architecture, traffic routing, zero-trust security, observability, and policy enforcement. • Collaborate with infrastructure, security, and development teams to integrate Kong Mesh with CI/CD, monitoring, and logging solutions. • Develop custom policies, plugins, and automation scripts to enhance Kong Mesh capabilities. • Monitor mesh More ❯
City of London, London, United Kingdom Infoplus Technologies UK Limited
like Argo CD and GitLab CI to enhance developer experience, alongside developing secure and cost-effective CI/CD pipelines. Good experience with monitoring tools and providing the right level of observability and monitoring for product engineering teams. Demonstrated ability to be cost-aware and experience in optimizing cloud costs. Ability to collaborate and work effectively as a team, providing … writers to complete development cycles Supports the building, testing, and integration of complex interfaces between different systems, working with the team on complex integration Provides guidance, monitoring and observability, improving developer experience and enabling teams to become autonomous. Troubleshoots customer environments and assists with escalations Supports efforts to remediate security gaps to help strengthen security posture More ❯
City of London, London, United Kingdom Hybrid / WFH Options Annapurna
or Azure). Proven experience with CI/CD pipelines and container technologies like Docker and Kubernetes. Deep understanding of networking, distributed systems, and databases. Expertise in monitoring and observability tools such as DataDog, Prometheus, Grafana, ELK stack, or Splunk. Excellent communication skills and a meticulous approach to problem-solving. Desirable Experience: Familiarity with Azure. Experience working in the autonomous More ❯
City of London, London, United Kingdom Hybrid / WFH Options developrec
large-scale .NET systems to Golang Own key architecture and platform decisions to improve system performance, reliability, and scalability. Champion DevOps best practices: CI/CD, automation, IaC (Terraform), observability and security. Collaborate across teams, build strong engineering practices, and foster a culture of continuous improvement. Mentor and guide engineers, shaping both tech strategy and team capability (70% hands on More ❯
City of London, London, United Kingdom Hybrid / WFH Options Zettafleet
Cloud-native technologies: Experience in architecting and deploying in cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Leadership: A track record of leading complex projects. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to More ❯
Central London / West End, London, United Kingdom Hybrid / WFH Options Zettafleet
Cloud-native technologies: Experience in architecting and deploying in cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Leadership: A track record of leading complex projects. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to More ❯
City of London, London, United Kingdom Zettafleet
Cloud-native technologies: Experience in architecting and deploying in cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Leadership: A track record of leading complex projects. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to More ❯
Salary Guide: Observability, Central London
- 10th Percentile: £65,000
- 25th Percentile: £71,250
- Median: £77,500
- 75th Percentile: £83,750
- 90th Percentile: £90,000