excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing SCRUM processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing SCRUM processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
systems while keeping levels of manual work low. SREs are expected to be experienced in software engineering principles, operational discipline, and automation. The SRE team works on a fully remote basis and in conjunction with their US and Australian teams. This company is a market leader … Collaborate with product engineering teams to design/build fit-for-purpose and observable software. Required Skills and Experience: Proven experience in an SRE/DevOps/Platform Engineering role, and previous experience in a Software Engineering role using .NET and C#. Proficiency in C# development … development opportunities. Working with a team of caring, high-performing, and passionate people who have fun supporting our vision, innovation, and continuous improvement. This SRE/DevOps Engineer role is with a market-leading global software company, and the job is part of a large program of change and …
Job Title: Site Reliability Engineering (SRE) Lead – Observability Location: Stratford, London (Hybrid – 2 days per week onsite) Contract Length: 6 months Rate: £450–£500 per day (Inside IR35) Industry: Financial Services A leading Financial Services organisation in London is seeking a Site Reliability Engineering (SRE) Lead – Observability to join their team on a 6-month contract. This is a hybrid role requiring two days per week onsite at their Stratford, London offices. The role sits Inside IR35. Key Responsibilities: Lead the SRE Observability team and champion observability practices across multiple product groups. … creation and QA of project-level Observability Plans. Input into and assure the quality of testing strategies and results. Requirements: Proven experience in an SRE role with a strong focus on Observability. Expert-level proficiency with DevOps tools including GitHub, GitHub Actions, Jenkins, Nexus, CloudFormation/Terraform, and CodeQL. Extensive …
London, South East England, United Kingdom Hybrid / WFH Options
MarkJames Search
Job Title: Site Reliability Engineering (SRE) Lead – Observability Location: Stratford, London (Hybrid – 2 days per week onsite) Contract Length: 6 months Rate: £450–£500 per day (Inside IR35) Industry: Financial Services A leading Financial Services organisation in London is seeking a Site Reliability Engineering (SRE) Lead – Observability to join their team on a 6-month contract. This is a hybrid role requiring two days per week onsite at their Stratford, London offices. The role sits Inside IR35. Key Responsibilities: Lead the SRE Observability team and champion observability practices across multiple product groups. … creation and QA of project-level Observability Plans. Input into and assure the quality of testing strategies and results. Requirements: Proven experience in an SRE role with a strong focus on Observability. Expert-level proficiency with DevOps tools including GitHub, GitHub Actions, Jenkins, Nexus, CloudFormation/Terraform, and CodeQL. Extensive …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
bet365 Group
A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and … availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. … Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user …
Who we are looking for: A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will … monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and … automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will …
Stoke-on-Trent, Staffordshire, UK Hybrid / WFH Options
bet365
Who we are looking for: A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will … monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and … automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will …
Manchester Area, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for: A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will … monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and … automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will …
Stoke-on-Trent, England, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for: A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will … monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and … automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will …
Site Reliability Engineer (SRE) - Consultant - Digital Factory At Capgemini Invent, we believe difference drives change. As inventive transformation consultants, we blend our strategic, creative and scientific capabilities, collaborating closely with … science and data. Superpowered by creativity and design. All underpinned by technology created with purpose. YOUR ROLE As a Site Reliability Engineer (SRE), you will play a key role in ensuring the reliability, scalability, and efficiency of our clients' platforms. Your focus will include building strong observability … practices, aligning with the SRE mindset & principles, and driving continuous improvement. This will involve: Defining and implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and maintain system and application performance, ensuring services meet agreed reliability targets. Instrumenting applications to collect key metrics, logs, and traces …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Embarcaderomediagroup
Site Reliability & Platform Engineer to help lead the way. You'll sit at the heart of our engineering operations, bringing together SRE principles and modern platform engineering practices. This includes combining principles of SRE - such as service-level reliability, observability, incident response - with platform engineering … ship faster, safer, and more cost-efficiently. What you'll be doing: Designing and operating highly reliable, scalable, and secure Azure-based platforms. Applying SRE principles like SLOs, observability, and incident management to drive service reliability. Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows. Enabling teams through … for someone passionate about building robust infrastructure and enabling others to move faster and more securely. You might come from a cloud engineering, SRE, or DevOps background - what matters most is your curiosity, systems thinking, and drive to improve operational efficiency. At Sorted, we are committed to fostering an …
Dundee, Angus, United Kingdom Hybrid / WFH Options
Ivanti
offerings. We are responsible for the reliability, deployment, and operation of the Ivanti Cloud product portfolio. We are seeking individuals eager to drive SRE maturity through the research and development of internal tooling, operational enhancements, and deployment pipelines. Ivanti SRE takes a holistic view of operational procedures, incident response … procedures, application and infrastructure monitoring, and process automation. Ivanti SRE is a blend of infrastructure, networking, automation, development, and application administration. This is a hands-on technical position. The ideal candidate will have a software engineering background and strong experience with continuous deployment, SaaS delivery, and production incident response. … the company's growth trajectory through continuous innovation and customer-centric solutions. What You Will Be Doing: Researching, maintaining, and contributing to automation of SRE tools and processes. Contributing to solutions toward reducing toil within SRE. Participating in code review and analysis with SRE peers. Composing and reviewing contributions to …
Site Reliability Engineer (SRE) Hybrid working Who are we? Toyota Connected Europe aims to create a better world through connected mobility for all. We are a new company focused on integrating big data and a customer-centric approach into all aspects of the mobility experience, making it more … foster a start-up culture where every member acts like an owner, with immediate impact and visibility of their work. About the role: Cloud Engineering is crucial for Toyota Connected Europe's growth, providing essential tools and processes for scalable and robust global expansion. We aim to enhance agility … effectiveness, and innovation by collaborating with product teams to align on technological and project goals. As a Site Reliability Engineer, you will manage complex cloud operations for the world's largest automotive company, applying your expertise in a dynamic, fast-paced environment to empower the development of next …
The successful candidate will play a key role by being an active and leading member of the cloud engineering team. You will support the running and maintenance of products and services within the GCP platform, meaning the next generation of services that form this Financial Services company's vision for … Role - Lead Site Reliability Engineer Salary - £90,440 - £106,400 Location - London – Hybrid/Flexible working. Essential Skills: · Experience working with GCP products (or extensive experience with Azure) and Cloud security and networking. · Working experience of building and administering Kubernetes clusters in a production environment and experience in … or alternatives such as Azure DevOps; You will partner with service teams to drive the adoption of Site Reliability Engineering (SRE) best practices, ensuring these principles are integrated effectively within our microservices. Collaborate with infrastructure engineers to guarantee the resilience, scalability, and overall performance of the …
City Of Bristol, England, United Kingdom Hybrid / WFH Options
Gravitas Recruitment Group (Global) Ltd
The successful candidate will play a key role by being an active and leading member of the cloud engineering team. You will support the running and maintenance of products and services within the GCP platform, meaning the next generation of services that form this Financial Services company's vision for … Role - Lead Site Reliability Engineer Salary - £90,440 - £106,400 Location - London – Hybrid/Flexible working. Essential Skills: · Experience working with GCP products (or extensive experience with Azure) and Cloud security and networking. · Working experience of building and administering Kubernetes clusters in a production environment and experience in … or alternatives such as Azure DevOps; You will partner with service teams to drive the adoption of Site Reliability Engineering (SRE) best practices, ensuring these principles are integrated effectively within our microservices. Collaborate with infrastructure engineers to guarantee the resilience, scalability, and overall performance of the …
Bradford, Yorkshire, United Kingdom Hybrid / WFH Options
Freemans Grattan Holdings (fgh)
our customer journey. Working collaboratively with a team of transformation experts you will have the flexibility to leverage your professional experience to solve computer engineering issues across a variety of technical areas, dependent on where your interests lie. Innovation is key as we look for new ideas which will … in a DevOps, or Site Reliability Engineering building high-traffic, high-availability systems. Experience with site reliability engineering (SRE) principles and monitoring tools, including New Relic. Experience in website performance monitoring and tuning using tools such as Lighthouse and the ability to troubleshoot performance …
Doing: Platform Strategy & Leadership: Develop and execute a platform engineering strategy aligned with business goals. Drive best practices in CI/CD, DevSecOps, SRE, and cloud-native technologies. Provide technical leadership, mentoring, and strategic direction to the team. Foster a data-driven culture focused on operational excellence and continuous … improvement. Site Reliability Engineering (SRE): Implement SRE methodologies to enhance system performance, reliability, and availability. Develop robust monitoring, observability, and incident response strategies. Lead automation efforts to reduce toil and enhance efficiency for development teams. Oversee 24/7 IT support operations, ensuring quick and effective … suppliers. Oversee vendor management to ensure seamless integration of third-party solutions. What We’re Looking For: Essential: Proven experience leading platform engineering, SRE, or DevOps teams in a fast-paced environment. Strong background in CI/CD pipeline development and automation tools. Expertise in AWS cloud services and …
Birmingham, England, United Kingdom Hybrid / WFH Options
Digital Waffle
Doing: Platform Strategy & Leadership: Develop and execute a platform engineering strategy aligned with business goals. Drive best practices in CI/CD, DevSecOps, SRE, and cloud-native technologies. Provide technical leadership, mentoring, and strategic direction to the team. Foster a data-driven culture focused on operational excellence and continuous … improvement. Site Reliability Engineering (SRE): Implement SRE methodologies to enhance system performance, reliability, and availability. Develop robust monitoring, observability, and incident response strategies. Lead automation efforts to reduce toil and enhance efficiency for development teams. Oversee 24/7 IT support operations, ensuring quick and effective … suppliers. Oversee vendor management to ensure seamless integration of third-party solutions. What We’re Looking For: Essential: Proven experience leading platform engineering, SRE, or DevOps teams in a fast-paced environment. Strong background in CI/CD pipeline development and automation tools. Expertise in AWS cloud services and …
Location: Hybrid - 20% in the office per month Nominet is on the hunt for a skilled Site Reliability Engineer to be a part of our Reliability Engineering function. This team is dedicated to the creation and upkeep of our secure compute platforms, a foundational element of … unwavering commitment to strict security and compliance protocols, you can expect to encounter a myriad of challenging problems to address. The role of the Site Reliability Engineer is vital to Nominet; this role encompasses the design, rollout, and administration of scalable cloud infrastructure, primarily on AWS. The chosen … or tools to further enhance automation, orchestration, and developer experience. About you and your experience Technical: A suitable candidate would ideally have experience in SRE, platform engineering, DevOps, or a cloud engineering role. AWS: Experience operating production systems on AWS. Holding relevant AWS certifications (like AWS Certified Solutions …
IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a … improve efficiency and reduce delivery time of applications and infrastructure. Other duties as needed. About You: 7+ years' experience in DevOps and/or Site Reliability Engineering roles. 3+ years' experience with an object-oriented language (preferably Java, .NET or C++). Expert+ level Linux administration, scripting, and …
Site Reliability Engineer Remote - Canada, Americas/Engineering We offer The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or … like an environment that you believe could work for you then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways to … be advocate of continuous improvement. Reliability of our new global Tyk Cloud platform. Automation of operations and support. Writing and maintaining documentation on SRE processes and policies. Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service. Assisting in penetration testing …
scale up, we want like-minded humans to join us on this exciting journey. Are you ready? As a Site Reliability Engineer (SRE), you will play an important role in designing, building, and maintaining the infrastructure and tools necessary to support our software applications and services. You will … collaborate closely with the product engineering squads, technical operations, and security teams to ensure the reliability, scalability, and security of our platform. Your responsibilities will include automating infrastructure provisioning, configuration management, and deployment pipelines, utilizing best practices and modern technologies to streamline processes and improve efficiency. You will … also be responsible for monitoring system performance, identifying bottlenecks, and implementing solutions to enhance system reliability and performance. Key Responsibilities: Cloud Platform Management: Using Azure/AWS to manage and optimize infrastructure components, ensuring scalability, reliability, and cost management. Infrastructure Design and Implementation: Designing, building and maintaining the …