Lead SiteReliabilityEngineer Are you ready to take your career to the next level in a role that’s critical to the reliability, scalability, and performance of cutting-edge systems? We’re on the lookout for a Lead SiteReliabilityEngineer to … Contribute to quality systems through deviation management, CAPA follow-up, and root cause investigations. What We’re Looking For: 5+ years of experience in SiteReliability Engineering or a related field. Hands-on experience with Biosafety and GMP environments. Strong foundation in Lean Six Sigma principles. Proven problem More ❯
Reading, England, United Kingdom Hybrid / WFH Options
People Source Consulting trading as Experis
SiteReliabilityEngineer - DevOps Engineer 18 Month Contract PAYE - Fully Remote/or Hybrid based in Midlands if preferred. The role We are working with one of the finest gaming studios in the industry and are on the lookout for an … exceptional SiteReliabilityEngineer who can bring their expertise and unique thinking to help make their team even stronger! As an SRE the main purpose is solving for scale through collaboration and automation, bringing engineering principles to infrastructure and operational problems. Work closely with the different teams More ❯
facilitate effective job matching and career development, not just for our users but also for our own team members. We are looking for a SiteReliabilityEngineer Lead to ensure our systems are reliable, scalable, and efficient. As the SiteReliabilityEngineer Lead, you … maintaining the health and performance of our platforms while also leading a talented team of engineers. You will champion and coach best practices in reliability and operational excellence to deliver an exceptional experience for our users. Key Responsibilities Minimising downtime to products & services and ensuring the platform is stable … availability and performance. Work with senior stakeholders to mature the concept of SiteReliability within the CVL organisation. Lead and mentor the SRE function, fostering a culture of collaboration, innovation, and excellence. Creating a bridge between Development and support teams by applying an ‘as-a-service' mindset to More ❯
Reigate, Surrey, United Kingdom Hybrid / WFH Options
Willis Towers Watson
Description Summary : We are seeking a SiteReliabilityEngineer to join our SRE team based in Reigate. The ideal candidate will have excellent communication skills, experience working with multiple stakeholders, and a track record in Azure and Observability platforms. You will be joining Insurance Consulting and Technology … delivery family to deliver core foundational functionality that will be used by multiple SaaS product offerings across the business. You will be with other SiteReliability and Response teams as well as with the core Applications Teams, whose responsibility is to deliver and manage business critical services that … working arrangements, with presence in the Reigate office up to two days per week. The Role: Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing services Maintain and configure observability platforms such as Datadog Proactive monitoring of production and other environments to More ❯
london, south east england, United Kingdom Hybrid / WFH Options
RP International
SiteReliabilityEngineer | Inside IR35 | Hybrid - 2 Days Onsite London | 6 Month Contract Our client a multinational and respected consultancy is hiring for a Lead SiteReliabilityEngineer with expertise in AWS and DevOps Tools for a new project in the Public Sector. Technical More ❯
A prestigious, technology-driven hedge fund is seeking a highly skilled SiteReliabilityEngineer (SRE) to join their global infrastructure team. This is a unique opportunity to work in a high-performance, low-latency trading environment where technology is at the heart of the firm’s competitive … critical role in ensuring the performance, reliability, and scalability of the systems that power the fund’s trading and research platforms. As an SRE, you will work closely with software engineers and investment teams to build automation-first solutions that support the firm’s most advanced strategies. Key Responsibilities … across the business. Design and implement automation to eliminate manual tasks and reduce operational risk. Collaborate with software and investment teams to embed the SRE mindset early in the development lifecycle. Ideal Candidate: SRE with experience working with data systems Ability to program (structured, OOP, and TDD) using one or More ❯
eDV SiteReliabilityEngineer Looking for an eDV SRE. Someone with a defence industry specialism with a passion … for creating efficient and secure cloud infrastructure. You will play a critical part in transforming and enhancing both internal and external operations through effective SRE practices. Core Responsibilities Infrastructure Excellence: Design, manage, and evolve our cloud-based infrastructure to support high-traffic applications and seamless service delivery. Secure Deployment: Develop More ❯
london, south east england, united kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
london (city of london), south east england, united kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
london (west end), south east england, united kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
london, south east england, united kingdom Hybrid / WFH Options
MarkJames Search
Job Title: SiteReliability Engineering (SRE) Lead – Observability Location: Stratford, London (Hybrid – 2 days per week onsite) Contract Length: 6 months Rate: £450–£500 per day (Inside IR35) Industry: Financial Services A leading Financial Services organisation in London is seeking a SiteReliability Engineering (SRE) Lead … a hybrid role requiring two days per week onsite at their Stratford, London offices. The role sits Inside IR35 . Key Responsibilities: Lead the SRE Observability team and champion observability practices across multiple product groups. Provide thought leadership from the Cognizant delivery team on all things SRE. Leverage hands-on … creation and QA of project-level Observability Plans. Input into and assure the quality of testing strategies and results. Requirements Proven experience in an SRE role with a strong focus on Observability. Expert-level proficiency with DevOps tools including GitHub, GitHub Actions, Jenkins, Nexus, CloudFormation/Terraform, and CodeQL. Extensive More ❯
The SRE Manager is responsible for leading the SiteReliability Engineering function across Europe, ensuring the reliability, scalability, and performance of critical infrastructure and services. This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader to … impact team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring a regional SRE team focused on continuous improvement and innovation. Key Responsibilities: Technical Leadership Develop deep expertise in the Titanium trading platform to lead and support critical business … ensuring priorities align with business goals and resource capacity. Operational Excellence Champion initiatives that enhance system availability, scalability, and performance. Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery). Cross-Functional Collaboration Partner with Software Engineering, Infrastructure, Operations, Security More ❯
Southampton, Hampshire, United Kingdom Hybrid / WFH Options
NICE
production environment by monitoring availability and taking a holistic view of system health Build software and systems to manage platform infrastructure and applications Improve reliability, quality, and time-to-market of our suite of software solutions Measure and optimize system performance, with an eye toward pushing our capabilities forward … Participate in system design consulting, platform management, and capacity planning Create sustainable systems and services through automation and uplifts Balance feature development speed and reliability with well-defined service level objectives Have you got what it takes? 3-6 years of working experience in a similar role, with a … Python, Go, Java, C#) and experience with scripting languages (e.g., Bash, PowerShell). Deep understanding of cloud computing platforms (e.g., AWS), the working and reliability constraints of some of the prominent services (e.g., EC2, ECS, Lambda, DynamoDB etc) Experience with infrastructure as code tools such as CloudFormation, Terraform. Deep More ❯
This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader to support platforms worldwide. We are looking for SRE talent with experience in an On-Prem/Datacenter environment. The ideal candidate will bring strong technical leadership, experience in … high-impact team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring a SRE team focused on continuous improvement and innovation. Key Responsibilities: Technical Leadership Develop deep expertise in the Titanium trading platform to lead and support critical business … ensuring priorities align with business goals and resource capacity. Operational Excellence Champion initiatives that enhance system availability, scalability, and performance. Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery). Cross-Functional Collaboration Partner with Software Engineering, Infrastructure, Operations, Security More ❯
Company Description It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including … infrastructure. Use your experience in software development, systems engineering and networking to proactively prevent repeatable issues. Drive initiatives with partner teams to improve the reliability and performance of the infrastructure through improved system design. Drive a culture of intolerance to manual activity which results in a highly automated environment More ❯