Observability Jobs in England

151 to 175 of 454 Observability Jobs in England

Principal Software Engineer (Cloud Applications)

Hemel Hempstead, Hertfordshire, South East, United Kingdom
Hybrid / WFH Options
Eckoh PLC
DynamoDB, SQS, and EventBridge Develop robust CI/CD pipelines for applications running in EKS and serverless environments Embrace microservices and event-driven architecture patterns Implement logging, tracing, and observability practices from day one Contribute to the design and development of cloud-native data platforms that support real-time and batch processing AI & LLM Enablement: Collaborate with data scientists and More ❯
Employment Type: Permanent, Work From Home

Director of Product Engineering (London)

London, UK
Hybrid / WFH Options
Morae Services India Private Limited
product planning, roadmap discussions, and strategic prioritization. Operational Excellence Own key engineering KPIs including system uptime, velocity, tech debt reduction, and deployment frequency. Drive cloud infrastructure cost-efficiency, system observability, and DevSecOps maturity. Lead incident management and escalation processes with customer sensitivity and transparency. Qualifications: 10+ years in software engineering, including 5+ years in engineering leadership roles. Proven experience building More ❯
Employment Type: Full-time

Site Reliability Engineer - London

London, United Kingdom
Hybrid / WFH Options
Valarian Technologies Limited
you thrive in a fast-paced environment where you can make a real difference, we want to hear from you! Required skills/expertise: Develop and implement a comprehensive observability strategy for self-hosted deployments, including infrastructure and tooling for monitoring, alerting, and troubleshooting. This will involve designing and implementing robust metrics and logging systems. Engineer the ACRA platform for More ❯
Employment Type: Permanent
Salary: GBP Annual

Lead Engineer - Personalisation and Recommendation Engine

London, United Kingdom
Hybrid / WFH Options
LEGO
building resilient, scalable data driven personalisation services end to end Setting up and maintaining robust CI/CD pipelines Running highly available systems exposing data via API with strong observability practices Collaborating effectively across teams Nice to have ML engineering experience Experience supporting data scientists with tooling, workflows and model optimisation Domain driven design experience Applications are reviewed on an More ❯
Employment Type: Permanent
Salary: GBP Annual

Platform Product & Delivery Owner - Data Layer

London, United Kingdom
Boston Consulting Group
proprietary data that will power AI products. Champion best practices for orchestrating multi-cloud environments (AWS, Azure, GCP) to enhance platform performance, scalability, and cost efficiency. Implement robust security, observability, monitoring frameworks, and data governance to ensure data reliability, minimize downtime, and maintain compliance. Manage budget, implement charge-back models for platforms and services you provide to your customers. Lead More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Machine Learning Engineer

London, United Kingdom
Hybrid / WFH Options
Ravelin Technology Ltd
teams to align on data architecture and ensure our ML systems meet overarching business objectives. Evolve our MLOps infrastructure, driving the strategy for model versioning, automated deployments, monitoring, and observability using modern tools like Prefect. Mentor and guide other members of the team, fostering a culture of technical excellence and continuous improvement through code reviews, design discussions, and knowledge sharing. More ❯
Employment Type: Permanent
Salary: GBP Annual

Data Engineering Lead

London, United Kingdom
Hybrid / WFH Options
Elsevier
enablement teams, to promote these through regular knowledge sharing sessions. Accountable for operational efficiency - drive improvements in efficiency, reliability, and scalability supported by logging, monitoring and observability as a foundational capability. Responsible for adoption - promote the platform capabilities through technical communities of practice leadership, high internal standards for documented processes and internal guides, and take steps More ❯
Employment Type: Permanent
Salary: GBP Annual

Staff Software Engineer

London, United Kingdom
Hybrid / WFH Options
eBay Inc
APIs Various testing methodologies System design at high scale and commercial experience with: SQL and NoSQL databases Async processing Cloud native applications Working in a Continuous Delivery environment Modern observability practices Nice to have Not vital, but you'll have the edge if you also have experience with: Grafana Prometheus Kotlin or at least the willingness to learn it Batch More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer (CI) (London)

London, UK
Hybrid / WFH Options
Object Splendor
to join our Client Impact Team. The Client Impact Team was established to provide fast turnaround for client requests, small features, and defect resolution. The team also owns the observability and health of our operational platform. The team has made enormous improvements in these areas by building tooling. The vision for the coming year is to build on this foundation More ❯
Employment Type: Full-time

Solutions Engineer/Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Zefr
California, with additional locations across the globe. What you'll do: As a Site Reliability Engineer at Zefr, you'll apply your expertise in cloud infrastructure, CI/CD, Observability, and core SRE concepts, to deliver high-quality, reliable, and scalable solutions. A significant aspect of this role involves working closely with Zefr's Engineering and Data Science teams ensuring … EKS expected), Helm, Kustomize Service Mesh: Istio CI/CD & Automation: CI/CD Pipelines: GitHub Actions GitOps/Continuous Delivery: Argo CD Primary Scripting/Automation Language: Python Observability & Monitoring: Monitoring & Alerting: Prometheus, Datadog, Pagerduty Telemetry Standards: OpenTelemetry Application & Data Ecosystem (Supporting): Application Languages/Frameworks: Python, FastAPI, Flask, Node.js, React Data Streaming: Apache Kafka Data Processing/Transformation … CircleCI, Argo CD, Flux) Knowledge of IaC and configuration management tools (Terraform, OpenTofu, Crossplane, Pulumi, Ansible, CloudFormation) Strong problem-solving experience, focusing on automation Production experience with Monitoring and Observability tools (Prometheus, Grafana, Datadog, Thanos, New Relic, Open Telemetry) Understanding of Cloud Networking concepts (Mesh Networking, NAT, Load Balancers, SSL Certificates and TLS termination, API Gateways, proxies, etc) Strong written More ❯
Employment Type: Permanent
Salary: GBP Annual

Platform Engineer

Oldham, Greater Manchester, North West, United Kingdom
Innovative Technology
consistency, repeatability, and auditability across environments Develop and maintain developer tooling and golden templates (CI/CD pipelines, scaffolds, environments) to standardize best practices across teams Design and implement observability frameworks (metrics, tracing, logging, alerting) that are easy to consume and part of the platform baseline Eliminate repetitive tasks through automation and opinionated defaults, so teams are not blocked by … and orchestration (Docker, Kubernetes) Familiarity with CI/CD systems (GitHub Actions, GitLab CI, Jenkins, etc.) Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) Knowledge of observability tools (Prometheus, Grafana, ELK stack, Datadog, etc.). Solid grasp of Linux systems and networking fundamentals Strong problem-solving and debugging skills Your Package & Perks: A competitive salary Flexible working More ❯
Employment Type: Contract

Platform Engineer

Oldham, Lancashire, England, United Kingdom
Innovative Technology
consistency, repeatability, and auditability across environments Develop and maintain developer tooling and golden templates (CI/CD pipelines, scaffolds, environments) to standardize best practices across teams Design and implement observability frameworks (metrics, tracing, logging, alerting) that are easy to consume and part of the platform baseline Eliminate repetitive tasks through automation and opinionated defaults, so teams are not blocked by … and orchestration (Docker, Kubernetes) Familiarity with CI/CD systems (GitHub Actions, GitLab CI, Jenkins, etc.) Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) Knowledge of observability tools (Prometheus, Grafana, ELK stack, Datadog, etc.). Solid grasp of Linux systems and networking fundamentals Strong problem-solving and debugging skills Your Package & Perks: A competitive salary Flexible working More ❯
Employment Type: Contractor
Rate: Competitive salary

Cloud Engineering Lead - Public Cloud Observability

London, United Kingdom
Hybrid / WFH Options
Citigroup Inc
efficiency, and innovation. We're providing our businesses with a competitive edge by leveraging public cloud scale and enabling new infrastructure economics. As the Cloud Engineering Lead - Public Cloud Observability - SVP you will play a pivotal role in shaping and executing our public cloud strategy. You will be part of a team that continues to deliver big! From building cloud … at scale, all the way to enabling payments solutions, this team is at the forefront of innovation. What You'll Do Lead the Charge: own the Public Cloud Foundations - Observability strategy and its execution, enabling Citi's secure and enterprise-scale adoption of public cloud. You will provide technical authority for all foundational services. Build and Inspire: lead and grow … and a passion for engineering best practices. You have: Cloud Engineering Expertise: A deep understanding of public cloud services adoption at scale. Expert-level understanding of AWS/GCP Observability across: Proficiency in working with cloud-native APIs from AWS (e.g. AWS Config, CloudWatch) and GCP (e.g. Cloud Asset Inventory, Cloud Monitoring) Experience with Python to automate API integrations and More ❯
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment Limited
and resolve application-level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles More ❯
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment
and resolve application-level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH(phone number removed) To apply for this role or to be considered More ❯
Employment Type: Permanent
Salary: £80000 - £90000/annum 38 Days Holiday, Healthcare, Pension

Site Reliability Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment Limited
and resolve application-level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles More ❯
Employment Type: Full-Time
Salary: £80,000 - £90,000 per annum, Inc benefits

Lead/Staff Backend Engineer

London, United Kingdom
Gorilla
and ability to drive clarity in ambiguous, complex technical situations. Leadership experience through mentoring, leading initiatives, or shaping engineering practices across teams. Experience in defining and improving DevOps pipelines, observability, and platform reliability. Strong communication skills and a collaborative mindset, able to build alignment across stakeholders. Proactive and pragmatic: able to balance technical excellence with delivery impact. More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer

Nottingham, Nottinghamshire, East Midlands, United Kingdom
Hybrid / WFH Options
Rebel Recruitment
resolve incidents, analysing logs, data, and reports from the service desk. Work closely with engineering leadership and product owners to prioritise incidents and drive preventative measures. Take ownership of observability strategies: monitoring standards, alerting practices, and visibility improvements. Engage in Agile ceremonies and collaborate across disciplines to support efficient delivery and operational excellence. Why This Role Stands Out No immediate on More ❯
Employment Type: Permanent
Salary: £70,000

Senior Machine Learning Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
Method Resourcing
teams to operationalize models and ship ML-powered features into production. Continuously assess and iterate on production models, balancing long-term ML strategy with tactical improvements. Champion code quality, observability, and resilience within their ML systems through reviews and hands-on contributions. Help shape their internal ML standards and practices, ensuring they stay ahead of industry advancements. Offer technical mentorship More ❯
Employment Type: Full-Time
Salary: £150,000 - £160,000 per annum

Low Latency Network Engineer

London, United Kingdom
Millennium Management LLC
optimization, anomaly detection, and predictive analytics. Understanding of AI frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn) and their application in network automation and monitoring. Experience with telemetry and observability frameworks (e.g., Prometheus, Grafana) for real-time network monitoring and troubleshooting. Experience: Minimum of 7 years of experience in network engineering, operations, and support. Proven ability to work hands-on More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer - Backend (GO)

London, United Kingdom
Hybrid / WFH Options
China-Britain Business Council
bug/incident resolution and technical documentation Mentor junior engineers and champion engineering best practices Support CI/CD pipelines and uphold high standards of code quality, security, and observability Location: This is a remote role. Occasionally, travel will be required to our London Hub and other H&B locations. The Person: Key Requirements: Experience building and maintaining backend services More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer - Backend GO

London, South East, England, United Kingdom
Hybrid / WFH Options
Holland & Barrett International Limited
bug/incident resolution and technical documentation Mentor junior engineers and champion engineering best practices Support CI/CD pipelines and uphold high standards of code quality, security, and observability Location: This is a remote role. Occasionally, travel will be required to our London Hub and other H&B locations. The Person: Key Requirements: Experience building and maintaining backend services More ❯
Employment Type: Contractor
Rate: Competitive salary

Technology Manager (Finova)

London, United Kingdom
InterQuest Group (UK) Limited
cloud-native, and scalable platforms that comply with financial regulations and industry standards. Drive DevOps maturity, CI/CD pipelines, and automation to support agile and resilient delivery. Establish observability, quality assurance, and performance metrics across the engineering lifecycle. Provide leadership in incident and problem management, ensuring resilient runbooks, root cause analysis, and strong operational readiness. Collaborate with Finova and More ❯
Employment Type: Permanent

Service Operations Manager

London, United Kingdom
Saab UK
concerns and driving service excellence. Communicate effectively with internal and external stakeholders, providing insights and updates on service health and operational performance. Continuous Improvement Lead initiatives to increase automation, observability, and operational resilience. Stay abreast of industry trends, emerging technologies, and best practices, fostering a culture of continuous learning within the team. Requirements Proven experience in IT Service Operations, ideally More ❯
Employment Type: Permanent

Software Engineer (TypeScript)

Manchester Area, United Kingdom
Accenture
services/message buses and other architectural elements Deploy these applications using features such as containers to cloud leveraging CI/CD to support this process backed with good observability when running these in production Ensure quality through the creation of documentation and use of unit/integration/contract testing with a consideration of security/performance requirements We More ❯

Observability Salary Percentiles in England
10th Percentile: £57,500
25th Percentile: £70,000
Median: £80,000
75th Percentile: £98,750
90th Percentile: £120,000