Service-Level Objective Jobs in London

26 to 50 of 69 Service-Level Objective Jobs in London

Director of Software Asset and Configuration Management

London, England, United Kingdom
Boston Consulting Group (BCG)
observability platforms to support real-time decision-making. Support incident prevention, root cause analysis, and continuous improvement through data-driven insights. Define and enforce service level objectives (SLOs) and key performance indicators (KPIs) for SACM health and value. Governance, Compliance & Asset Management: Ensure accurate, complete, and up-to-date asset and …

Sr Lead Infrastructure Engineer

London, England, United Kingdom
ZipRecruiter
demonstrated ability to implement site reliability within an application or platform. Advanced knowledge and experience in observability such as white and black box monitoring, service level objectives, alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc. Ability to communicate data-based solutions with complex reporting and …

Global IT Security Platform Senior Director

City of London, England, United Kingdom
The Boston Consulting Group GmbH
automated response. Apply SRE principles to improve reliability, performance, and maintainability of security services. Lead platform health, patching automation, and vulnerability remediation workflows. Define service level objectives (SLOs) and key performance indicators (KPIs) for all security services. Compliance, Governance & Risk Management: Ensure alignment with global compliance requirements such as ISO …

Head of Site Reliability Engineering

London, England, United Kingdom
Rewardgateway
closely with our SecOps teams to ensure timely vulnerability management. Educating teams in SRE practices and maintaining high standards of compliance. Implementing world-class observability standards utilising SLI/SLO/Error Budgets. Continually evolving our observability platforms for greater coverage. Liaising with Product & Engineering teams for constant evolution of metrics. Aligning SRE Sprints & Backlog with our roadmaps to meet …

DevSecOps and Site Reliability Engineering Lead

London, England, United Kingdom
Hybrid / WFH Options
NatWest Group
pipelines and automation to help manage our product and services. You’ll work closely with our feature team and other colleagues to meet defined service level objectives and continually improve systems and environments. You’ll define error budgets that support finding the right balance between risk and reliability. You’ll also …

Infrastructure Engineer (f/m/d)

London, England, United Kingdom
Contentful
debugging distributed systems issues across Edge, Network, Compute, and Storage layers. Experience with observability stacks (metrics, logs, tracing) and tools like Splunk and New Relic. Familiarity with SRE practices: SLO, SLA, etc. Excellent English communication skills, verbal and written (German not required). A collaborative mindset: you're helpful, respectful, and enjoy sharing knowledge. Ability to context switch, work through …

Infrastructure Engineer (f/m/d)

London, England, United Kingdom
Contentful
debugging distributed systems issues across Edge, Network, Compute, and Storage layers. Experience with observability stacks (metrics, logs, tracing) and tools like Splunk and New Relic. Familiarity with SRE practices: SLO, SLA, etc. Excellent English communication skills, verbal and written (German not required). A collaborative mindset: you’re helpful, respectful, and enjoy sharing knowledge. Ability to context switch, work through …

Senior Application Support Engineer

London, United Kingdom
Just Group plc
configurations across legacy and modern applications to ensure their continued performance and reliability. System Monitoring & Performance: Maintain and improve logging, monitoring, and alerting systems. Define service-level objectives and indicators for business applications. Continuously review performance metrics against SLO/SLIs and proactively address performance bottlenecks or underperforming systems. Manage system …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, England, United Kingdom
Thredd
performance. What You’ll Do: Define strategies for Application Performance Monitoring, Unit Cost, and Chaos Engineering. Continuously optimize production environments to enhance reliability and efficiency. Implement and apply MTTR, SLO, and SLI principles to ensure high service standards. Respond to incidents, analyze root causes, and drive long-term improvements. Maintain fault-tolerant, scalable, and cost-effective … role in shaping the core technology layers that drive our platform’s success. What You Need: Proven experience implementing SRE principles at scale, including deep knowledge of SLI/SLO/SLA differences. A product engineering background with strong coding skills in Python, C#, or similar. Experience with incident management frameworks and evolving them for efficiency. Expertise in cloud platforms …

Restaurant Technology Problem Manager

London, United Kingdom
Hybrid / WFH Options
McDonald's Corporation
issues. Experience managing and contributing to mid-large projects related to system reliability improvements. Knowledge of Site Reliability Engineering (SRE) Practices: including error budgeting, service level objectives (SLOs), and service level indicators (SLIs). Demonstrated ability to collaborate with cross-functional teams, including …
Employment Type: Permanent
Salary: GBP Annual

Software Engineering Manager - Financial Services

London, England, United Kingdom
MARKS&SPENCER
Quality, Stability & Standards: Establish quality standards to meet performance, reliability, and maintainability of the systems. With a strong production-first mindset, drive observability, maintain Service Level Objectives (SLOs), and ensure efficient incident resolution. Oversee the maintenance of existing systems, ensuring continuous improvements and prompt resolution of issues. Agile Delivery & Collaboration: Working …

Software Engineering Manager

City of London, London, United Kingdom
Marks and Spencer
Quality, Stability & Standards: Establish quality standards to meet performance, reliability, and maintainability of the systems. With a strong production-first mindset, drive observability, maintain Service Level Objectives (SLOs), and ensure efficient incident resolution. Oversee the maintenance of existing systems, ensuring continuous improvements and prompt resolution of issues. Agile Delivery & Collaboration: Working …

Software Engineering Manager

London Area, United Kingdom
Marks and Spencer
Quality, Stability & Standards: Establish quality standards to meet performance, reliability, and maintainability of the systems. With a strong production-first mindset, drive observability, maintain Service Level Objectives (SLOs), and ensure efficient incident resolution. Oversee the maintenance of existing systems, ensuring continuous improvements and prompt resolution of issues. Agile Delivery & Collaboration: Working …

Senior Network Software Engineer

London, England, United Kingdom
Cisco
tools, and workflows, integrating internal systems and third-party solutions. Network Health Management: Define and implement prediction pipelines for long-term network health, availability, and service-level objectives. Operations Automation: Lead initiatives to automate and optimize network operations focusing on scalability and reliability. Collaborative Development: Work closely with teams on requirements analysis …

Senior Network Software Engineer

London, United Kingdom
Cisco Systems, Inc
tools, and workflows, integrating internal systems and third-party solutions. Network Health Management: Define and implement prediction pipelines for long-term network health, availability, and service-level objectives. Operations Automation: Lead initiatives to automate and optimize network operations focusing on scalability and reliability. Collaborative Development: Work closely with teams on requirements analysis …
Employment Type: Permanent
Salary: GBP Annual

Elastic Observability Specialist

City of London, London, United Kingdom
GIOS Technology
graphs, service maps, and transaction breakdowns in APM UI. Dashboarding & Visualization: Develop Kibana dashboards, Canvas presentations, and Lens visualizations for SREs and Dev teams. Implement SLO/SLI monitoring and alerting using Kibana Alerting API and Watcher where needed. Performance Tuning & Scaling: Advise on shard sizing, index rollover policies, and hot-warm architecture for efficient storage. …

Elastic Observability Specialist

London Area, United Kingdom
GIOS Technology
graphs, service maps, and transaction breakdowns in APM UI. Dashboarding & Visualization: Develop Kibana dashboards, Canvas presentations, and Lens visualizations for SREs and Dev teams. Implement SLO/SLI monitoring and alerting using Kibana Alerting API and Watcher where needed. Performance Tuning & Scaling: Advise on shard sizing, index rollover policies, and hot-warm architecture for efficient storage. …

Site Reliability Engineer IOE: Cardano

London, England, United Kingdom
Devopshunt
ensure that solutions are designed with customer experience, scalability, and performance in mind. Analyze system performance and reliability, offering recommendations for enhancement. Develop and uphold service-level objectives (SLOs), service-level indicators (SLIs), and error budgets for our services. Participate in on-call …

Senior Software Engineer - Network Production Engineer

London, United Kingdom
Bloomberg L.P
network. Enhance existing monitoring and observability frameworks, integrating intelligent alerting and self-remediation capabilities to reduce manual intervention and improve incident response. Define and measure service-level objectives (SLOs) to track infrastructure performance and reliability. Write software utilizing orchestration systems to automate tasks and interact with other systems. Provide mentorship to …
Employment Type: Permanent
Salary: GBP Annual

Senior Infrastructure Engineer

London, England, United Kingdom
Rimes Technologies
in projects for new Infrastructure Services. Optimize data center Infrastructure to improve performance, utilisation, availability and security, whilst controlling costs. Maintain observability best practices to measure compliance with Rimes SLOs. Implement efficient monitoring systems to measure performance and reliability of production systems. Participate in Capacity and Performance planning. Who you are: Core requirements: Bachelor's degree in Computer Science …

Site Reliability Engineer

London, England, United Kingdom
Hybrid / WFH Options
Attio Ltd
will have the following attributes: Proven experience with Google Cloud and Kubernetes. Contribute across the stack, including TypeScript, Node.js, and Google Cloud Platform. Champion operational excellence and resilience (99.99% SLO). Manage CI/CD pipelines to improve deployment speed and reliability. Support backup, disaster recovery, and security. Experience with Google Spanner is a nice to have. Hiring Process: An introductory …

Head of Software Engineering

City of London, London, United Kingdom
Marks and Spencer
and pair programming. Drive DevOps practices to automate the Product development life cycle. Foster a culture of experimentation and innovation to drive solutions. Ensure products meet their SLI and SLO targets and are fully supported by teams both in and out of hours. Lead development of Product Group OKRs and Product health, and demonstrate responsibility for the entire Product Group …

Head of Software Engineering

London Area, United Kingdom
Marks and Spencer
and pair programming. Drive DevOps practices to automate the Product development life cycle. Foster a culture of experimentation and innovation to drive solutions. Ensure products meet their SLI and SLO targets and are fully supported by teams both in and out of hours. Lead development of Product Group OKRs and Product health, and demonstrate responsibility for the entire Product Group …

Staff Platform Engineer, Managed Operations

London, United Kingdom
Amazon
driving technical, business, and cultural change to improve the reliability, performance, and efficiency excite you? The AWS Managed Operations (MO) organization was founded in April 2023, with the objective to reduce operational load and toil through long-term engineering projects. MO is building the best-in-class engineering and operations team that will own the day-to-day … fixes yesterday, and designed a solution to that class of problem, seeking feedback from your team. On Wednesday you investigated a Service Level Objective (SLO) that recently became less than useful. You dove deep, talked with the partner team, and found out the thresholds no longer make sense, so you …
Employment Type: Permanent
Salary: GBP Annual

Senior Full Stack Engineer - Backstage

London, England, United Kingdom
Flatiron Health
cloud-hosted environments in Amazon Web Services with Terraform. You have experience programming React (or other JavaScript frameworks). You have experience setting and maintaining service level objectives and service level indicators around enterprise platforms. You have experience participating in incident response and engineering …

Service-Level Objective
London
10th Percentile: £56,004
25th Percentile: £64,509
Median: £69,384
75th Percentile: £92,188
90th Percentile: £98,750