Service-Level Objective Jobs in England

19 of 19 Service-Level Objective Jobs in England

Lead Site Reliability Engineer - Cloud

Bristol, Avon, England, United Kingdom
Hybrid / WFH Options
Robert Walters
to manage Kubernetes clusters in production environments. Competence in scripting and development using languages such as Python, Java, Go, Bash, or PowerShell. Strong understanding of service-level objectives (SLOs), indicators (SLIs), and monitoring practices. Hands-on experience with infrastructure as code (e.g., Terraform) and CI/CD tools (e.g., Jenkins, Azure …
Employment Type: Full-Time
Salary: £90,000 - £110,000 per annum

Lead Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Lloyds Bank plc
critical detail to your mentees. Production Kubernetes experience and debugging all services that run within the K8s ecosystem, including Istio service mesh. SRE mentality (SLI, SLO & SLA) using Observability, Logging, Monitoring & Alerting (Dynatrace). Ideally coming from a software engineering or exceptional scripting skill background and have moved into SRE/DevOps while gaining a wider understanding …
Employment Type: Permanent
Salary: GBP Annual

Engineer - Site Reliability Engineering

London, United Kingdom
London Stock Exchange Group
Main responsibilities: We are looking for people with a passion to learn, and who bring a continuous improvement mentality to our team! SREs maintain Service Level Objectives for the systems they own. Constantly measuring and improving availability, latency, and overall system health is at the core of our team's purpose.
Employment Type: Permanent
Salary: GBP Annual

Lead Site Reliability Engineer

London, United Kingdom
Lloyds Banking Group
critical detail to your mentees. Production Kubernetes experience and debugging all services that run within the K8s ecosystem, including Istio service mesh. SRE mentality (SLI, SLO & SLA) using Observability, Logging, Monitoring & Alerting (Dynatrace). Ideally coming from a software engineering or exceptional scripting skill background and have moved into SRE/DevOps while gaining a wider understanding …
Employment Type: Permanent
Salary: GBP Annual

Director of AWS Platforms

London, United Kingdom
Boston Consulting Group
observability platforms to support real-time decision-making. Support incident prevention, root cause analysis, and continuous improvement through data-driven insights. Define and enforce service level objectives (SLOs) and key performance indicators (KPIs) for SACM health and value. Governance, Compliance & Asset Management: Ensure accurate, complete, and up-to-date asset and …
Employment Type: Permanent
Salary: GBP Annual

Director of Software Asset and Configuration Management

London, United Kingdom
Boston Consulting Group
observability platforms to support real-time decision-making. Support incident prevention, root cause analysis, and continuous improvement through data-driven insights. Define and enforce service level objectives (SLOs) and key performance indicators (KPIs) for SACM health and value. Governance, Compliance & Asset Management: Ensure accurate, complete, and up-to-date asset and …
Employment Type: Permanent
Salary: GBP Annual

Global IT Security Platform Senior Director

London, United Kingdom
Boston Consulting Group
automated response. Apply SRE principles to improve reliability, performance, and maintainability of security services. Lead platform health, patching automation, and vulnerability remediation workflows. Define service level objectives (SLOs) and key performance indicators (KPIs) for all security services. Compliance, Governance & Risk Management: Ensure alignment with global compliance requirements such as ISO …
Employment Type: Permanent
Salary: GBP Annual

Platform Engineer

London, England, United Kingdom
Allegis Global Solutions
Cloud Platform (GCP). This role will involve working closely with development, platform engineering, and security teams to implement DevOps best practices, define and enforce service-level objectives, and build a scalable monitoring and alerting platform. Key Responsibilities: Automate deployment, monitoring, and incident response processes using GCP-native tools and technologies. … in Onyx to operate with a DevOps ethos. Collaborate with development teams to optimise application performance, reliability, and observability on GCP. Implement and enforce Service Level Objectives (SLOs) and Error Budgets to ensure a balance between reliability and feature development. Develop and maintain a comprehensive monitoring and alerting platform to detect …

Head of Network Operations Analytics and Automation

Chester, Cheshire, United Kingdom
Bank of America
culture of innovation, collaboration, and continuous improvement. Ensure network automation complies with relevant regulatory requirements, security requirements and industry standards. Establish Key Performance Indicators and Service-Level Objectives to measure operational effectiveness. Build relationships with CTO, Application Production Support & Engineering, CIO organizations and other stakeholders. Communicate effectively with technical and non …
Employment Type: Permanent
Salary: GBP Annual

SRE Engineer

London, South East, England, United Kingdom
Robert Walters
development lifecycle to ensure reliability, scalability, and operational stability are maintained across all supported platforms. Define, create, and monitor application analytics to support improved service level objectives and drive data-informed decision making. Ensure strict adherence to change management release processes while accelerating automation initiatives for these workflows. Lead resiliency management … e.g., RDS/Aurora) and non-relational databases equips you to support diverse data storage requirements. Previous exposure to site reliability engineering concepts, including service level objectives (SLOs), service level agreements (SLAs), service level indicators …
Employment Type: Contractor
Rate: £400 - £500 per day

Senior Site Reliability Engineer (SRE) / Unix

London, United Kingdom
Morgan Hunt UK Limited
OS/application deployments. Manage Oracle Database 19c on Oracle Linux (KVM). Disaster Recovery & Automation: Strengthen automation for disaster recovery (DR) activities. Work towards Recovery Time Objective (RTO) of 2hrs & Recovery Point Objective (RPO) of zero. Conduct DR testing (3 scheduled tests per financial year, potentially outside core hours). Maintain CommVault backup … Monitoring & Observability: Support logging & observability stacks (InfluxDB, Grafana, Prometheus, Nagios). Enhance monitoring via REST APIs, time-series databases, and full-stack tools (TICK, Elasticsearch, OpenSearch). Promote SLO/SLI measurement & tracking. Security & Compliance: Drive security improvements & vulnerability remediation. Perform regular RHEL/KVM patching & hardening. Manage Red Hat Satellite & Ansible Automation Platform. Support …
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer SRE / Unix

London, South East, England, United Kingdom
Morgan Hunt Recruitment
OS/application deployments. Manage Oracle Database 19c on Oracle Linux (KVM). Disaster Recovery & Automation: Strengthen automation for disaster recovery (DR) activities. Work towards Recovery Time Objective (RTO) of 2hrs & Recovery Point Objective (RPO) of zero. Conduct DR testing (3 scheduled tests per financial year, potentially outside core hours). Maintain CommVault backup … Monitoring & Observability: Support logging & observability stacks (InfluxDB, Grafana, Prometheus, Nagios). Enhance monitoring via REST APIs, time-series databases, and full-stack tools (TICK, Elasticsearch, OpenSearch). Promote SLO/SLI measurement & tracking. Security & Compliance: Drive security improvements & vulnerability remediation. Perform regular RHEL/KVM patching & hardening. Manage Red Hat Satellite & Ansible Automation Platform. Support …
Employment Type: Contractor
Rate: £550 per day

Software Engineering Manager - Financial Services

London, United Kingdom
Marks & Spencer Plc
Quality, Stability & Standards: Establish quality standards to meet performance, reliability, and maintainability of the systems. With a strong production-first mindset, drive observability, maintain Service Level Objectives (SLOs), and ensure efficient incident resolution. Oversee the maintenance of existing systems, ensuring continuous improvements and prompt resolution of issues. Agile Delivery & Collaboration: Working …
Employment Type: Permanent
Salary: GBP Annual

SRE Engineer

London, South East, England, United Kingdom
McGregor Boyall
a code concept is desirable. Experience with build automation, test-driven development, continuous integration and delivery. Experience with relational and non-relational databases. Previous SRE experience, including knowledge of SLO/SLA/SLI and error budgets, is advantageous. Experience working with, or familiarity with, one public cloud (AWS, Google or Azure). If this is of interest and you have the …
Employment Type: Contractor
Rate: £400 - £500 per day

Site Reliability Engineer

City of London, London, England, United Kingdom
Certain Advantage
Actively participate in the development life cycle, ensuring reliability, scalability and operational stability. Define, create and track application analytics in support of better service level objectives. Ensure adherence to change management release processes and accelerate automation of these processes. Run resiliency management planning, scheduling and execution of disaster recovery tests & seek … a code concept is desirable. Experience with build automation, test-driven development, continuous integration and delivery. Experience with relational and non-relational databases. Previous SRE experience, including knowledge of SLO/SLA/SLI and error budgets, is advantageous. Experience working with, or familiarity with, one public cloud (AWS, Google or Azure). Preferred skills – what’ll get you noticed! Experience in …
Employment Type: Temporary
Salary: Salary negotiable

Engineering Manager, SRE Hybrid - New York City

London, United Kingdom
Hybrid / WFH Options
vercel.com
teamwork. Build rapport with each member of the team and support them as they level up their skills. Define and maintain company-wide practices around SLO definition and management, incident management, postmortem analysis, and disaster testing and recovery. Generate informed insights regarding service quality and interface directly with executive leadership to communicate …
Employment Type: Permanent
Salary: GBP Annual

Staff Engineer

Newcastle Upon Tyne, Tyne and Wear, North East, United Kingdom
Hybrid / WFH Options
Develop
platform's core value streams. Key Responsibilities: Technical Leadership & Strategy: Champion engineering best practices, system reliability, and architectural integrity. Define and track progress toward Service Level Objectives (SLOs). Collaborate with product stakeholders to shape robust and scalable solutions. Take responsibility for non-functional areas such as performance, maintainability, and security. Provide …
Employment Type: Permanent
Salary: £80,000

Site Reliability Engineer - Met Office

London, United Kingdom
Microsoft Corporation
their full potential through the Microsoft Cloud. We are a fast-growing team, but we remain committed to staying agile. Customer first, nurturing trust, high responsiveness, automation, SLO/SLI/SLA, blameless post-mortem, observability, monitoring, alerting, and toil reduction form the foundations of our code and we work with teams across Microsoft and external customers to … Baseline Personnel Security Standards; UK Security Clearance. Responsibilities: Collaborating closely with the existing SRE teams on building and enhancing tooling and automation solutions for faster resolution of issues impacting SLOs and averting incidents altogether when possible. Collaborating with the customers to understand their pain points around Supportability and SLO attainment and formulate strategies for addressing recurring issues in a …
Employment Type: Permanent
Salary: GBP Annual

Staff Software Engineer, AI Reliability Engineering

London, United Kingdom
Hybrid / WFH Options
Menlo Ventures
of Anthropic's mission to bring the capabilities of groundbreaking AI technologies to benefit humanity in a safe and reliable way. Responsibilities: Develop appropriate Service Level Objectives for large language model serving and training systems, balancing availability/latency with development velocity. Design and implement monitoring systems including availability, latency and … distributed systems observability and monitoring at scale. Understand the unique challenges of operating AI infrastructure, including model serving, batch inference, and training pipelines. Have proven experience implementing and maintaining SLO/SLA frameworks for business-critical services. Are comfortable working with both traditional metrics (latency, availability) and AI-specific metrics (model performance, training convergence). Have experience with chaos engineering and …
Employment Type: Permanent
Salary: GBP Annual

Service-Level Objective, England
10th Percentile: £47,500
25th Percentile: £63,850
Median: £69,384
75th Percentile: £96,355
90th Percentile: £109,130