Site Reliability Engineering Jobs in London

1 to 25 of 134 Site Reliability Engineering Jobs in London

Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
AI Tech Suite
excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing Scrum processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the Scrum process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineering Lead (Global)

London, United Kingdom
AXIS Capital
our collective success? We seek a talented Engineering Lead to join our dynamic team and lead our Site Reliability Engineering (SRE) function. This role ensures our systems are reliable and scalable, directly impacting user satisfaction. By integrating SRE activities across teams, you'll foster collaboration and … alignment with senior management will keep us competitive and innovative, driving collective success. What will you do in this role? Oversee and manage the SRE engineering team to ensure continuous improvement in reliability and scalability while ensuring conformance to our security standards. Lead the integration of SRE activities across Application … Computer Science, or a related field. Proven experience in a leadership role within an engineering team. Strong technical background with expertise in DevSecOps, SRE, and Agile. Excellent technical and organizational skills. Strong problem-solving abilities and attention to detail. What we prefer you to have: Effective communication and interpersonal skills. …
Employment Type: Permanent
Salary: GBP Annual

Amazon Dedicated Cloud Engineer, Region Reliability Engineering & Automation

London, United Kingdom
Amazon
Amazon Dedicated Cloud Engineer, Region Reliability Engineering & Automation Job ID: Amazon Development Center U.S., Inc. Are you passionate about creating resilient cloud systems that power mission-critical operations? Do you want to apply leading-edge artificial intelligence technologies like Amazon Bedrock to challenging problems? Do you thrive on … engineering and maintaining the largest cloud infrastructure for some of the world's most complex environments? Amazon Web Services is seeking talented AWS Dedicated Cloud Engineers to join our Region Reliability Engineering & Automation (RRE&A) team. Our mission is to ensure the seamless operation of AWS's … dedicated cloud regions through proactive reliability engineering, automation, and leading-edge solutions. We seek individuals who bring a deep technical skill set in Development, Operations, Networking, and Systems Engineering, and who understand the Agile mindset and DevOps philosophies. We welcome engineers willing to think differently, redesigning systems …
Employment Type: Permanent
Salary: GBP Annual

Senior Director – Operations and Reliability Engineering

London, South East England, United Kingdom
Boston Consulting Group
thrive. What You'll Do: The Senior Director – Operations and Reliability Engineering is responsible for blending Site Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 … ensuring compliance with standardized frameworks and operational excellence. Key Responsibilities: Strategic Leadership & Transformation: Define and execute a modern Reliability Engineering strategy, integrating SRE, DevOps, and automation-first operational models. Drive end-to-end automation to eliminate toil, improve efficiency, and enhance operational resilience. Lead the transition from traditional … Operational Excellence: Mandate and assure the adoption of IT Service Management (ITSM) processes across all teams, ensuring standardized, efficient, and effective service delivery. Establish SRE-based operational metrics, including SLOs, SLIs, and error budgets. Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. Ensure high availability …

Site Reliability Engineer

London, United Kingdom
ECS Resource Group Ltd
Global Services Provider who are looking for a Site Reliability Engineer to join their team, where you will manage and guide the SRE team. As a Site Reliability Engineer, you will: Manage and guide the Site Reliability Engineering (SRE) team … while promoting SRE principles throughout FCA product groups. Serve as the SRE subject matter expert and strategic lead within the delivery organization. Oversee and advise on daily operations related to observability tools, including their upkeep and optimization. Provide hands-on support to engineering teams for delivering observability initiatives, as … the reliability of quality assurance results. Proven skills and experience to help you succeed in this role: Strong experience in a primary role as an SRE. Strong experience in DevOps tools (GitHub, GitHub Actions, Workflow, CodeQL, Jenkins, Nexus, CloudFormation/Terraform etc.). Strong experience in monitoring tools (Datadog …
Employment Type: Contract
Rate: £450 - £500/day

Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Eligo Recruitment
Are you a Site Reliability Engineer with experience in the iGaming and Gambling sector looking for an exciting new challenge? BENEFITS: Up to £95k depending on experience, fully remote, excellent benefits package. Join a rapidly growing company at the forefront of the iGaming industry, dedicated to delivering world … a leading brand, blending sports betting and online casino entertainment on a cutting-edge, custom-built platform. As a Site Reliability Engineer (SRE), you will play a crucial role in designing, implementing, and maintaining scalable and reliable infrastructure. Working closely with development teams, you'll apply SRE principles … and resolving performance and availability issues. Manage and optimise containerised environments with Kubernetes, ensuring scalability and high availability. Collaborate with development teams to implement SRE best practices. Implement strategies for Continuous Deployment to minimise release risks. Required Experience & Expertise: Previous experience within the iGaming and Gambling sector. Strong experience with …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Robert Walters Workforce Consultancy
Site Reliability Engineer. London - 3 days in office mandatory. Full-time permanent. Up to £90,000 + 5 annual performance-related bonus. We have an exciting new opportunity for a Site Reliability Engineer to join Robert Walters as a Consultant. As an Employed Consultant, you will … rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems! As an SRE, you will join a Global Investment Bank within the SRE Payments Technology Team in the Corporate & Investment Bank line of business. You will solve complex … and iteratively improve on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform. Job responsibilities: * Guides and assists others in the areas of building appropriate level designs and gaining consensus …
Employment Type: Permanent
Salary: £90,000

Senior Systems Engineer, Site Reliability Engineering

London, United Kingdom
Google
Preferred Qualifications: Master's degree in Computer Science or Engineering, or a related field. About the Job: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services - both … our systems capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding … algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a …
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer, Site Reliability Engineering, Google Cloud

London, United Kingdom
Google
automate routine tasks. Systematic problem-solving approach, coupled with effective verbal and written communication skills. About the Job: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services - both … our systems capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding … algorithms, complexity analysis, and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving, and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big, and take risks in a …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer (SRE) - Consultant - Digital Factory

London, United Kingdom
Hybrid / WFH Options
Capgemini
Site Reliability Engineer (SRE) - Consultant - Digital Factory. At Capgemini Invent, we believe difference drives change. As inventive transformation consultants, we blend our strategic, creative and scientific capabilities, collaborating closely with … science and data. Superpowered by creativity and design. All underpinned by technology created with purpose. YOUR ROLE: As a Site Reliability Engineer (SRE), you will play a key role in ensuring the reliability, scalability, and efficiency of our clients' platforms. Your focus will include building strong observability … practices, aligning with the SRE mindset & principles, and driving continuous improvement. This will involve: Defining and implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and maintain system and application performance, ensuring services meet agreed reliability targets. Instrumenting applications to collect key metrics, logs, and traces …
Employment Type: Permanent
Salary: GBP Annual

Oracle Site Reliability Engineer

London, United Kingdom
BENTLEY SYSTEMS, INC
We are seeking a highly skilled and proactive Oracle Site Reliability Engineer (SRE) to ensure the reliability, performance, and scalability of our critical Oracle-based applications and services supporting a global user base. The ideal candidate will possess … deep expertise in Oracle technologies and SRE methodologies. You will be responsible for ensuring the stability and efficiency of our Oracle systems, implementing automation, managing patching, and providing expert-level support to our global users. Strong cross-functional collaboration and a proactive approach to problem-solving are essential for success … troubleshoot IT systems. Leverage cloud platforms and automation tools to enhance scalability and efficiency. Ensure compliance with IT standards and regulations. Apply knowledge of SRE (Site Reliability Engineering) and/or DevOps practices to improve system reliability and performance. Maintain UK Security Clearance BPSS (Baseline Personnel …
Employment Type: Permanent
Salary: GBP Annual

Senior Director - Operations and Reliability Engineering

London, United Kingdom
TieTalent
clients to thrive. What You'll Do: The Senior Director - Operations and Reliability Engineering is responsible for blending Site Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high … workforce development programs for AI-driven operations, automation, and modern reliability practices. What You'll Bring: Required Qualifications: 15+ years of experience in IT operations, SRE, DevOps, or platform engineering. 5+ years in a senior leadership role, managing large-scale IT environments. Deep technical expertise in cloud computing (AWS, Azure, GCP), on-prem … security, regulatory compliance, and risk management. Excellent leadership, communication, and stakeholder management skills. Preferred Qualifications: Certifications: ITIL, AWS/Azure/GCP Solutions Architect, SRE Foundation, CISSP, or equivalent. Experience with Kubernetes, Terraform, Ansible, and AI-powered operations tools. Strong problem-solving abilities, with a data-driven approach to operational excellence. …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer - Paris/London

London, United Kingdom
Mistral AI
can make a meaningful impact. See more about our culture on . Role Summary: We are seeking highly experienced Site Reliability Engineers (SRE) to shape the reliability, scalability and performance of our platform and customer-facing applications. You will work closely with our software engineers and research … teams to ensure our systems meet and exceed our internal and external customers' expectations. What you will do: As a Site Reliability Engineer, you balance the day-to-day operations on production systems with long-term software engineering improvements to reduce operational toil and foster the reliability … and conferences. About you: Master's degree in Computer Science, Engineering or a related field. 7+ years of experience in a DevOps/SRE role. Strong experience with cloud computing and highly available distributed systems. Exposure to site reliability issues in critical environments (issue root cause analysis …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Nominet
Location: Hybrid - 20% in the office per month. Nominet is on the hunt for a skilled Site Reliability Engineer to be a part of our Reliability Engineering function. This team is dedicated to the creation and upkeep of our secure compute platforms, a foundational element of … unwavering commitment to strict security and compliance protocols, you can expect to encounter a myriad of challenging problems to address. The role of the Site Reliability Engineer is vital to Nominet; this role encompasses the design, rollout, and administration of scalable cloud infrastructure, primarily on AWS. The chosen … or tools to further enhance automation, orchestration, and developer experience. About you and your experience. Technical: A suitable candidate would ideally have experience in SRE, platform engineering, DevOps, or a cloud engineering role. AWS: Experience operating production systems on AWS. Holding relevant AWS certifications (like AWS Certified Solutions …
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer (SRE)

London, United Kingdom
Devopshunt
what matters. We are in it for the long term; come join us on this journey. As a Senior Site Reliability Engineer (SRE), you'll be joining a team whose mission is to ensure the availability, performance, security and reliability of our platform and core services, ensuring … monitoring of those systems, for building tooling and automation to reduce toil and for responding to incidents as part of our 24/7 SRE on-call team. Reliability Engineering at Board Intelligence. The SRE team: Strives to provide the highest standards of Availability, Scalability, Performance and Security … and responds to incidents as part of a 24/7 rota. Key responsibilities of the role: We're looking for a great Senior SRE to be a hands-on individual contributor to key technical projects and to help us build a first-class SRE function. This role will involve …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer with Python

London, United Kingdom
Jas Gujral
Site Reliability Engineer with Python. Our client is seeking a site reliability engineer to deploy, manage, troubleshoot, and enhance complex cloud-based internal tools and externally managed services for a diverse organization. You should have at least 7 to 10 years of hands-on experience as … a Site Reliability Engineer. You will collaborate with IT, product, and engineering teams to maintain and improve these tools and services, troubleshooting issues as they arise. The ideal candidate will proactively identify system weaknesses and resolve them before they cause production issues through monitoring and data analysis using … and services promptly after failures. Serve as the technical point of contact for two core platforms (mobile and web), engaging with IT support and engineering teams for problem-solving, issue resolution, and feature enhancements. Collaborate with internal teams and external vendors to ensure software quality, security, and performance standards …
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
NinjaOne, LLC
IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a … improve efficiency and reduce delivery time of applications and infrastructure. Other duties as needed. About You: 7+ years' experience in DevOps and/or Site Reliability Engineering roles. 3+ years' experience with an object-oriented language (preferably Java, .NET or C++). Expert+ level Linux administration, scripting, and …
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer London, United Kingdom

London, United Kingdom
Hybrid / WFH Options
NinjaOne, LLC
IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a … improve efficiency and reduce delivery time of applications and infrastructure. Other duties as needed. About You: 7+ years' experience in DevOps and/or Site Reliability Engineering roles. 3+ years' experience with an object-oriented language (preferably Java, .NET or C++). Expert+ level Linux administration, scripting, and …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer with Python

London
Nexus Jobs Limited
Job Description: Site Reliability Engineer with Python. Our client is looking to bring on a site reliability engineer to help deploy, manage, troubleshoot, and enhance our complex cloud-based set of internal tools and externally managed services for a variety of users across our wide-ranging organization. … You will have at least 7 to 10 years of hands-on expertise working as a Site Reliability Engineer. You will work closely with IT, product, and engineering to extend and maintain this set of tools and services and to help debug and resolve problems. In addition, the … Actively lead any critical issue post-mortem processes, including coordination of any meetings and further steps to take. Qualifications: • 7+ years' experience with software engineering, software development, and/or system operations • Experience debugging complex problems and implementing timely cost-effective solutions • Experience designing, building, and operating large-scale …
Employment Type: Permanent
Salary: £80,000 - £100,000

Site Reliability Engineer Remote - Canada, Americas / Engineering

London, United Kingdom
Hybrid / WFH Options
Tyk Technologies
Site Reliability Engineer Remote - Canada, Americas/Engineering. We offer: The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or … like an environment that you believe could work for you then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways to … be an advocate of continuous improvement. Reliability of our new global Tyk Cloud platform. Automation of operations and support. Writing and maintaining documentation on SRE processes and policies. Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service. Assisting in penetration testing …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Cloud Bridge
The Site Reliability Engineer (SRE) will play a key role in maintaining and scaling infrastructure, ensuring reliability, performance, and scalability. You will collaborate closely with development, operations, and security teams to improve the reliability and efficiency of applications, addressing incidents, automating processes, and managing infrastructure as … GCP, or Azure), automate provisioning using Terraform or CloudFormation, and manage resources for optimal performance. Monitor, troubleshoot, and resolve incidents, optimizing systems to ensure reliability and minimize downtime. Implement monitoring (Prometheus, Grafana, Datadog) and set up alerting systems to proactively address issues and ensure scalability. Work with DevOps, engineering … Certifications: AWS Certified DevOps Engineer, Google Professional Cloud Architect, or similar. Containerization & Orchestration: Experience with Docker, Kubernetes, or ECS/EKS for containerized applications. SRE Experience: Familiarity with SRE principles like SLAs, SLOs, and error budgets, and practical application of those in large-scale systems. Distributed Systems: Understanding of microservices …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Vitesse
scale up, we want like-minded humans to join us on this exciting journey. Are you ready? As a Site Reliability Engineer (SRE), you will play an important role in designing, building, and maintaining the infrastructure and tools necessary to support our software applications and services. You will … collaborate closely with the product engineering squads, technical operations, and security teams to ensure the reliability, scalability, and security of our platform. Your responsibilities will include automating infrastructure provisioning, configuration management, and deployment pipelines, utilizing best practices and modern technologies to streamline processes and improve efficiency. You will … also be responsible for monitoring system performance, identifying bottlenecks, and implementing solutions to enhance system reliability and performance. Key Responsibilities: Cloud Platform Management: Using Azure/AWS to manage and optimize infrastructure components, ensuring scalability, reliability, and cost management. Infrastructure Design and Implementation: Designing, building and maintaining the …
Employment Type: Permanent
Salary: GBP Annual

Principal Software Engineer

London
Hybrid / WFH Options
Sanderson plc
to build a market-leading digital offering with customer experience at its heart. This is an exciting and key role, partnering with business-aligned engineering and product teams, to ensure a collaborative team culture is at the heart of what we do. Our Team empowers innovation, underpinned by engineering … IAM solutions (PingGateway, PingAM, PingIDM, PingDS), including designing and implementing cloud-based, scalable and resilient IAM solutions for large corporate organisations. IAM engineering experience across authentication, authorisation, single sign-on, multi-factor authentication, identity lifecycle management, OAuth 2.0, OpenID Connect, SAML and policy management. Knowledge of Site Reliability Engineering, automation, observability, incident management, resilience, disaster recovery, high availability, documentation. IAM engineering experience, authentication, authorisation, single sign-on, multi-factor authentication, user lifecycle management, hands-on CI/CD approaches and technologies. Experience with Ping Identity/Okta/ForgeRock (product platform experience, system …
Employment Type: Permanent
Salary: £90,000 - £110,000

Site Reliability Engineer - SRE

City of London, London, United Kingdom
Hybrid / WFH Options
Sanderson Recruitment
Role: Site Reliability Engineer. Location: London (Hybrid). Salary: £80,000 - £105,000. As our Site Reliability Engineer, you'll work closely with our feature team and other colleagues to meet defined service level objectives and continually improve systems and environments. You'll define error budgets that … Very strong engineering skills in Java, JavaScript or Python. OpenTelemetry experience. Must have Core Java/Python. Must have experience as an SRE. Knowledge of Python data structures. Strong knowledge of deploy and release services, automation and troubleshooting. Experience of utilising tools and technology across the software development …
Employment Type: Permanent

Site Reliability Engineer (SRE) - Payments

London, United Kingdom
Apple Inc
Site Reliability Engineer (SRE) - Payments. London, England, United Kingdom. Software and Services. Description: SRE and Engineering Operations Engineers in the team take part in every aspect of the software development lifecycle. We work in a fast-paced environment and are responsible for hands-on coding of critical … We have constructive design discussions, learn from each other, and use our experience to guide and teach. We work closely with privacy and security engineering teams to ensure that the products we build go above and beyond on both fronts. We also partner closely with quality and testing teams … analytical thinking skills. Experience building systems both on-premises (data center) and on public cloud (AWS, GCP, or Azure welcome). Understanding of core SRE concepts - Monitoring, Alerting, Incident management. Preferred Qualifications: Expertise with container platforms (e.g. Docker, or similar). Experience in presenting complex technical concepts to both technical …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineering, London
10th Percentile: £68,375
25th Percentile: £86,250
Median: £110,000
75th Percentile: £138,750
90th Percentile: £139,125