Engineer - Site Reliability Engineering. Locations: USA-St. Louis-795 Office Pkwy. Time type: Full time. Posted 11 days ago. Job requisition ID: R. Our Team: We are evolving our Reliability Engineering team to move beyond support and operations. As a Senior Engineer … in Site Reliability, you will be part of a diverse and inclusive organization that has full ownership of the availability, performance, and scalability of one of the most critical shared services at LSEG. Main responsibilities: We are looking for people with a passion to learn, and who bring a continuous improvement mentality to our team! SREs maintain Service … core of our team's purpose. Write automation to scale systems sustainably, prevent service issues, or when they occur, quickly recover service. Partner with development teams to improve system reliability, observability, and release velocity. Participate in on-call rotations, incident response, postmortems, and root cause analysis and resolution. Be a vocal advocate of strong/sound engineering practices.
Salford, Manchester, United Kingdom Hybrid / WFH Options
Lloyds Bank plc
Site Reliability Engineer (SRE) for GCP Analytics Platform. Locations: Manchester. Time type: Full time. Posted 2 days ago. End date: August 14, 2025 (12 days left to apply). Job requisition ID: 139740. End Date: Wednesday 13 August … Range £47,790 - £53,100. We support flexible working. Flexible Working Options: Hybrid Working, Job Share. Job Description Summary: An SRE will focus on monitoring and improving the SLO of their cloud infrastructure services whilst working under the guidance of senior SRE colleagues. Job Description: JOB TITLE: Site Reliability … hybrid, which involves spending at least two days per week, or 40% of our time, at our Manchester office. About this opportunity: As a Site Reliability Engineer (SRE) within the Data & Platform Enablement Lab, you'll play a pivotal role in shaping and supporting a best-in-class analytics platform on Google Cloud within Lloyds Banking. Our mission …
they are already renowned as having game-changing technology within their industry, with exciting scope for expansion into further industries. This role is looking for a Graduate or experienced SRE professional to work within the SRE team responsible for incident response and issue resolution. Location: Cambridge. Salary: £32,000 - £70,000 per annum + excellent benefits including private healthcare (could … be more available for an experienced SRE). Requirements for Site Reliability Engineer - Graduate Considered: Excellent academics including 2.1 or 1st class honours degree from a leading international University in a STEM subject. A minimum of AAB at A-Level or international equivalent if applying at Graduate level. Any experience working in an incident response or technical support environment would … the knowledge this role will not lead to a role in the R&D/Software teams. Responsibilities for Site Reliability Engineer - Graduate Considered: Working within the SRE team you will be responsible for the architecture of a mission-critical cloud platform for an industry-leading software company. You will diagnose issues within complex systems, identify root causes …
Senior Site Reliability Engineer. Remote type: Remote. Locations: Remote - United Kingdom. Time type: Full time. Posted yesterday. Job requisition ID: JR-. Job Description: The … rapid adoption of advanced software in vehicles marks a new era for automakers and consumers, bringing both advantages and challenges. As part of Site Reliability Engineering (SRE) at General Motors, you'll join a dedicated team focused on enhancing the reliability, efficiency, and scalability of our distributed systems. We leverage engineering principles to manage operations … and systems engineering skills to keep our services resilient, robust, and scalable. This is a hands-on Individual Contributor (IC) role. As an SRE IC, you will focus on enhancing the reliability, efficiency, and performance of our services. You'll work closely with other engineers to develop automated solutions, respond to incidents, and …
Site Reliability Engineering Manager. Remote type: Remote. Locations: Remote - United Kingdom. Time type: Full time. Posted yesterday. Job requisition ID: JR-. Job Description: As an SRE Engineering Manager, you will be expected to not only lead your team in setting priorities and ensuring alignment with organizational goals but also to be deeply technical. We expect … details, solve problems hands-on, and support your team's technical decisions is crucial. You'll be a mentor, guide, and a partner, helping engineers grow, and ensuring the reliability and efficiency of the systems they are working on. We believe in setting a high bar for engineering managers who can lead by example in both technical expertise …
Senior Customer Experience Engineer (SRE). Microsoft's Azure Customer Experience (CXP) team is seeking a Senior Customer Experience Engineer (SRE) to drive reliability engineering excellence for their cloud platform. This role combines traditional SRE responsibilities with a strong focus on customer experience and satisfaction. The position involves working in a fast … s cloud services. The team operates with a "no dead-ends", "whatever it takes", and "biased for action" philosophy, focusing on customer success through the Microsoft Cloud. As an SRE, you'll be responsible for maintaining and improving service reliability, availability, and performance. Key responsibilities include building automation solutions, collaborating with customers to address pain points, implementing service telemetry … enterprise customers. The role offers both technical challenges and the opportunity to directly influence product development through customer feedback and operational insights. Last updated 5 days ago. Collaborate with SRE teams on building and enhancing tooling and automation solutions. Work with customers to understand pain points around Supportability and SLO attainment. Be the single point of contact for enterprise customer …
Nottingham, Nottinghamshire, United Kingdom Hybrid / WFH Options
Capital One (Europe) plc
Nottingham, Nottinghamshire. Senior Software Development Engineer - Site Reliability. About the Role: We're looking for a Senior Engineer to join our Site Reliability Engineering (SRE) team. This role is ideal for a skilled Java engineer with a passion for understanding how complex systems work, analysing performance, and applying engineering solutions to make them more … efficient, stable, and scalable. You'll lead on planning and implementing key SRE initiatives, optimise and automate how our systems operate, and improve observability through better monitoring and logging. You'll also work closely with your peers to drive consistency and high standards across SRE and the wider engineering community, so a real enthusiasm for influencing others and leading … to reduce operational overheads through observability and service automation. Drive engineering best practice (e.g., Operational Excellence, Security, Quality, Resilience etc.) and set standards across the team and wider SRE community. Innovate within your team and contribute within your technical domain. Deliver key pieces of intent from inception through to design and hands-on delivery, in collaboration with your SREM.
Milton Keynes, Buckinghamshire, England, United Kingdom
Noir
Site Reliability Engineer (SRE) - Market leading company - Milton Keynes (Tech stack: .Net, C#, ASP.Net Core, SQL Server, PowerShell, Azure CLI, Bash, Azure DevOps, Jenkins, GitHub Actions, Docker, Kubernetes). Help shape the tech future of a UK market leader! Backed by a major financial institution with soaring profits - my client is modernising platforms, embracing AI, and driving automation at scale. … We're hiring a Lead Site Reliability Engineer (SRE) to drive reliability, observability, and performance across our Azure cloud infrastructure. You'll work in a modern engineering environment where we live by "you build it, you run it", focused on automation, scale, and resilience. Tech stack you'll work with: .NET, C#, ASP.NET Core, SQL Server … PowerShell, Azure CLI, Bash, Azure DevOps, Jenkins, GitHub Actions, Docker, Kubernetes. We want to hear from you if: As a Site Reliability Engineer (SRE) you've delivered scalable systems using .NET, C#, and ASP.NET Core, with real-world experience managing production workloads. You've automated operations using PowerShell, Azure CLI, and Bash to reduce toil and boost efficiency …
this business-critical environment. This senior leadership role will report to the Head of FIC Production & Reliability Engineering. You will spearhead the Site Reliability Engineering (SRE) function across FIC Technology, defining and executing the strategic vision, while remaining embedded in managing and accountable for day-to-day production stability in the Rates & Credit business. You will … be responsible for the resilience, scalability, and performance of mission-critical trading and risk systems, particularly within the Rates & Credit business, while also influencing SRE practices across FX, Repo, Emerging Markets, Listed Derivatives and other Core Platforms. You will work closely with senior business and technology stakeholders to shape the future of Production Engineering, while remaining deeply engaged in … the option to purchase additional days. The opportunity to support a wide ranging CSR programme + 2 days' volunteering leave per year. Your key responsibilities: Define and drive the SRE strategy across FIC Technology, aligning reliability goals with business priorities and regulatory expectations. Lead the transformation of production support into a proactive, data-driven engineering discipline focused on …
Site Reliability Engineer - Data Infrastructure, AD/ADAS. London/Product & Technology - AD/ADAS/Employee/hybrid. Woven by Toyota is enabling Toyota's once-in-a-century transformation into a mobility company. Inspired by a legacy of innovating for the benefit of others, our mission is to challenge the current state of mobility through human … automotive software development. The right candidate will have excellent communication skills, solid coding skills, expertise in building scalable, reliable, highly available and fault-tolerant systems, broad knowledge of software engineering and site reliability engineering in areas such as Large-Scale Data and Compute Infrastructure, Stream Processing, Kubernetes, High-Performance Networking, Observability and Infrastructure Automation. RESPONSIBILITIES: Set … maintain, optimize and support large scale, multi-region, multi-cloud compute and storage infrastructure powering our data platform and mission critical services. Work with fellow Data Infrastructure engineers and Site Reliability engineers to ensure our systems are scalable, reliable, fault-tolerant, highly available, highly performant, and observable. Manage incidents, triage product or system issues and debug/track …
are seeking a foundational member for the Cloud Infrastructure team at Writer. This role involves contributing to the development and implementation of our Site Reliability Engineering (SRE) program. The ideal candidate will ensure the reliability, scalability, performance, and security of Writer's critical systems, proactively guaranteeing that our high-ROI products reach customers seamlessly. Your responsibilities … ensure cost efficiency. Ensure the security and compliance of our systems, adhering to industry standards and regulations. Provide mentorship and technical guidance to junior engineers, fostering a culture of reliability and continuous improvement. Stay current with emerging technologies and industry trends to improve our site reliability practices. Is this you? Proven expertise in Site Reliability Engineering with at least 7 years of hands-on experience. Deep understanding of system architecture and infrastructure design for high availability and performance. Bachelor's degree in Computer Science, Engineering, or a related field. Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring. Experience with cloud platforms like AWS, Azure, or …
Cambridge, Cambridgeshire, East Anglia, United Kingdom
RedTech Recruitment
are already renowned as having game-changing technology within their industry, with exciting scope for expansion into further industries. This role is looking for someone to work within the SRE team responsible for incident response and issue resolution. Location: Cambridge. Salary: £32,000 - £60,000 + excellent benefits (£32,000 for a new Graduate). Requirements for Site Reliability … of a role involving lots of problem solving, identifying the root causes of issues. Good logical reasoning. Responsibilities for Site Reliability Engineer - Graduate Considered: Working within the SRE team you will be responsible for the architecture of a mission-critical cloud platform for an industry-leading software company. You will be diagnosing issues within complex systems and identifying … emailing (if this email address has been removed by the job-board, full details for contact are available on our website). Keywords: Site Reliability Engineer/SRE/DevOps/Software Engineering/Software Development/Engineering/Physics/Astrophysics/Python/Computer science/Cloud/Mathematics/AWS/Azure/ …
JOB TITLE: Site Reliability Engineer (GCP Analytics Platform). SALARY: £70,929 - £78,810. LOCATION(S): Manchester. HOURS: Full-time - 35 hours per week. WORKING PATTERN: Our work style is … hybrid, which involves spending at least two days per week, or 40% of our time, at our Manchester office. About this opportunity: As a Site Reliability Engineer (SRE) within the Data & Platform Enablement Lab, you'll play a pivotal role in shaping and supporting a best-in-class analytics platform on Google Cloud within Lloyds Banking. Our mission … of hands-on experience working with Google Cloud products, particularly in the context of analytics platforms or large-scale infrastructure. Strong understanding of Site Reliability Engineering (SRE) principles, including SLIs/SLOs, error budgets, and incident response. Experience with infrastructure as code (e.g., Terraform, Deployment Manager) and CI/CD pipelines. Proficiency in monitoring, logging, and observability …
Edinburgh, Midlothian, Scotland, United Kingdom Hybrid / WFH Options
McGregor Boyall
Site Reliability Engineer | UK Remote | 6 months | £530 p/d outside IR35. One of our public sector clients is seeking a skilled Site Reliability Engineer (SRE) to support and enhance their modern digital platform as it transitions from on-premise to cloud-native environments. You'll work within a highly collaborative, agile Site Resilience team … focused on building reliable, secure, and scalable infrastructure and services. Site Reliability Engineer - Key Responsibilities: Administer and optimise RHEL 7/8/9 and Red Hat Satellite. Automate OS and application deployment using Ansible and Infrastructure as Code (IaC) principles. Support Oracle 19c on Oracle Linux with KVM and CommVault integration. Maintain observability stacks (Prometheus, Grafana, InfluxDB … Deliver and support infrastructure in AWS using VPC, EC2, S3, NLB, and automation via Terraform or CDK. Collaborate with development and assurance teams to improve resilience and reduce risk. Site Reliability Engineer - Essential Skills & Experience: Strong Unix/Linux (RHEL 7/8/9) system administration. Load balancer technologies (HAProxy, keepalived). Advanced scripting (Bash, Perl), configuration management …
Engineering Manager, Reliability. Because your new ideas are our new ways of working. Evolve, your way. Our Technology team is actively shaping the next wave of advancements. Engaged with innovative initiatives, your expertise will propel our business into the future. Collaborating with a creative team of tech enthusiasts, you'll contribute your unique skills to fuel our technological … advancements. The purpose of the Engineering Reliability Manager is to enable smooth operations and to … increase reliability of live products & services. This role will facilitate resolution of incidents that block customer outcomes and embed and advocate for Site Reliability Engineering (SRE) principles. This role may sit across a single product group or multiple product groups within the channels domain. What You'll Get: People are at the heart of what we …
Graduate DevOps Engineer/SRE. All top graduates with tech-related degrees should read this! If you have a passion for building things, love constantly solving interesting challenges and also enjoy some coding, then we would encourage you to explore a career in DevOps & Site Reliability Engineering (if you're not already!). The demand … for this skill set is high, the role is interesting and varied, and it is quite rare to see entry-level DevOps or SRE positions advertised. If you're already an experienced DevOps Engineer or Site Reliability Engineer we also really want to hear from you, as we are excited to be able to offer this role working … days a week in office). Salary: £35,000 - £70,000 per annum + excellent benefits (£35,000 for a new Graduate, more DOE). Requirements for Graduate DevOps Engineer/SRE: This company hires some of the very brightest engineers and is looking for a 2.1 or 1st class honours degree from a leading international University in a STEM subject. Minimum …
Bristol, Avon, England, United Kingdom Hybrid / WFH Options
Robert Walters
and operation of cloud infrastructure and applications on Google Cloud Platform. You will work collaboratively with engineering and infrastructure teams to implement site reliability engineering (SRE) principles, focusing on system reliability, observability, automation, and operational excellence. This role follows a hybrid working model, requiring attendance at the Bristol office for at least two days per … week or 40% of the working time. Key Responsibilities: Promote and embed SRE best practices within engineering teams and microservices environments. Partner with infrastructure and DevOps engineers to improve system resilience and performance. Troubleshoot complex incidents and implement long-term solutions through code and automation. Develop and improve automation pipelines to reduce manual operations and enhance system efficiency. Contribute … to multiple strategic digital initiatives and collaborate across engineering domains. Essential Skills and Experience: Background in software engineering or telemetry, with current focus on SRE. Extensive experience with public cloud platforms, particularly Google Cloud (or AWS/Azure). Proven ability to manage Kubernetes clusters in production environments. Competence in scripting and development using languages such as Python, Java …
Leeds, West Yorkshire, United Kingdom Hybrid / WFH Options
VIQU IT
Lead Site Reliability Engineer. Hybrid/Remote: once-a-month requirement in Leeds. Up to £80,000 per annum plus car allowance plus bonus. VIQU have partnered with a leading company within the supply chain industry who are seeking a Lead Site Reliability Engineer (AWS) to join and mentor their growing team. This position will lead … the organisation's cloud infrastructure. This role is mostly remote, with monthly travel required to Leeds. Responsibilities of the Lead Site Reliability Engineer: Lead a team of four SREs, helping to maintain the stability of cloud platforms. Take on hands-on technical responsibilities within AWS, utilising a range of cloud technologies (CI/CD, Container Orchestration, IaaS, Scripting … the Lead Site Reliability Engineer: Must have at least a year's experience in managing technical teams, and over five years of experience in a hands-on, technical SRE/DevOps Engineer role. Experience with CI/CD tools (Jenkins and Concourse CI ideally). Must have experience within AWS and hold relevant AWS certifications (SA1, DOP-C02 …
Site Reliability Engineer (SRE) Manager - Apple Services Engineering. London, England, United Kingdom. Software and Services. Description: Apple Service Engineering (ASE)'s Compute team is seeking a highly motivated individual with strong technical and communication skills to join us on our quest to build and enhance massive clusters hosting Virtual Machines, Containers and associated infrastructure that can … engage with the upstream community to drive Apple's requirements. Ultimately, you will help build the platform that delivers our applications at scale to our end users. As a Compute Site Reliability Engineering manager, you will be leading a team responsible for providing the platform for mission-critical cloud systems to maintain constant uptime, scale seamlessly, and allow … for new applications and services to flourish. Minimum Qualifications: Extensive Leadership in Cloud Computing: In-depth experience building and leading high-performing engineering teams, with a deep focus on cloud computing and hands-on experience across public and/or private cloud environments. Large-Scale Infrastructure Management: Proven ability to manage enterprise services in large-scale *nix environments and …
this business-critical environment. This senior leadership role will report to the Head of FIC Production & Reliability Engineering. You will spearhead the Site Reliability Engineering (SRE) function across FIC Technology, defining and executing the strategic vision, while remaining embedded in managing and accountable for day-to-day production stability in the Rates & Credit business. You will be … responsible for the resilience, scalability, and performance of mission-critical trading and risk systems, particularly within the Rates & Credit business, while also influencing SRE practices across FX, Repo, Emerging Markets, Listed Derivatives and other Core Platforms. You will work closely with senior business and technology stakeholders to shape the future of Production Engineering, while remaining deeply engaged in the technical architecture … the option to purchase additional days. The opportunity to support a wide ranging CSR programme + 2 days' volunteering leave per year. Your key responsibilities: Define and drive the SRE strategy across FIC Technology, aligning reliability goals with business priorities and regulatory expectations. Lead the transformation of production support into a proactive, data-driven engineering discipline focused on …
to £95,000 + Bonus + Shares. Watford (Hybrid). Method Resourcing are proud to be partnering with a fast-growing, international technology business delivering critical services across multiple high-reliability sectors. They're seeking a Head of Delivery Enablement who can … ensure cohesive, end-to-end delivery across architecture, DevOps, quality assurance, and project delivery. Role Overview: Acting as the Technical Product Owner for Site Reliability Engineering (SRE), you'll manage the technical backlog to balance future strategic initiatives with feedback from engineering teams. You will guide DevOps engineers through the full delivery lifecycle, lead the development … strategic work, align on tooling, and drive improvements in observability, automation, and testing. Ideal Experience & Skills: Demonstrated technical leadership across diverse skillsets, including Site Reliability Engineering (SRE), DevOps, and Quality Assurance (QA). Proven track record of aligning and integrating cross-functional technical teams and complex systems. Strong stakeholder management skills with the ability to influence decisions and …
West Midlands, England, United Kingdom Hybrid / WFH Options
MYO Talent
Site Reliability Engineer/SRE/Dynatrace/Observational Monitoring Tools/Automation/Grafana Labs/InfluxDB tools/Software/Network/Remote based/6 month contract/£500 - 650 per day Inside IR35. One of our leading clients is looking to recruit a Site Reliability Engineer (SRE) with strong Dynatrace experience. Location: … remote. Duration: 6 months. Day rate: £500 - 650 per day. Experience: Must have experience working as an SRE/Site Reliability Engineer. Must have strong Dynatrace experience. Strong reliability, performance, and availability of systems, leveraging Dynatrace for monitoring and troubleshooting. Dynatrace delivery, support and implementation. Installation and Configuration, Performance Analysis, Incident Response, Automation. Experience with modern observability …
Solihull, West Midlands, United Kingdom Hybrid / WFH Options
MYO Talent
Site Reliability Engineer/SRE/Dynatrace/Observational Monitoring Tools/Automation/Grafana Labs/InfluxDB tools/Software/Network/Remote based/6 month contract/£500 - 650 per day Inside IR35. One of our leading clients is looking to recruit a Site Reliability Engineer (SRE) with strong Dynatrace experience. Location: … remote. Duration: 6 months. Day rate: £500 - 650 per day. Experience: Must have experience working as an SRE/Site Reliability Engineer. Must have strong Dynatrace experience. Strong reliability, performance, and availability of systems, leveraging Dynatrace for monitoring and troubleshooting. Dynatrace delivery, support and implementation. Installation and Configuration, Performance Analysis, Incident Response, Automation. Experience with modern observability …
application performance - identifying and implementing improvements to application performance and stability. Collaborate on the design and implementation of the desired pipelines and processes for deployment to the production environment. The SRE will work closely with Platform and Software domains to ensure continuous improvement of performance and stability whilst adhering to standards. Undertake ad-hoc projects and other activities as required. Key … Accountabilities and Activities: Contribute to the SRE function including: Drive evolution of the DevOps/GitOps toolchain, promoting improvements to streamline the software delivery process and showing improvements through metrics. Accountable for halting or stopping a project/product if the solution is not technically acceptable. Responsible for producing and maintaining documentation relating to application design, integration processes, testing procedures … to create operational runbooks and playbooks. Integration with Domains including: Collaborating with Domains to plan, design, test and maintain the application. Design patterns for any component or structure under SRE responsibility. Implementation of components such as Monitoring and Logging. Manage the runbook preparations of Domains. Liaise with and support other teams on work items including: Developing, refining, and tuning integrations between …
Has anyone actually ever given you a good description of what SRE is? Recently I've met dozens of companies implementing an SRE function. Half are just rebranding an ops team (because Ops ain't cool), some don't want to call the additional silo they have created 'DevOps' (because apparently that's the wrong thing to do) so they … re calling it SRE, and the rest actually don't really know how to describe what they're doing. And if you can't describe it simply, you don't know what it is, chief ("because Google do it" isn't the right answer). That was until today, when I met a company who actually whiteboarded their vision … process rather than the build. We discussed Kubernetes, Prometheus and API Gateways. Most importantly, they spoke like they knew what the hell they were on about. Not just about SRE, but about the whole Engineering process. This is a company at the top of their game, who are about to introduce a brand new monetisation model to a …