they are already renowned as having game-changing technology within their industry, with exciting scope for expansion into further industries. This role is looking for a Graduate or experienced SRE professional to work within the SRE team responsible for incident response and issue resolution. Location: Cambridge Salary: £32,000 - £70,000 per annum + excellent benefits including private healthcare (could … be more available for an experienced SRE) Requirements for Site Reliability Engineer - Graduate Considered: Excellent academics including 2.1 or 1st class honours degree from a leading international University in a STEM subject A minimum of AAB at A-Level or international equivalent if applying at Graduate level Any experience working in an incident response or technical support environment would … the knowledge this role will not lead to a role in the R&D/Software teams Responsibilities for Site Reliability Engineer - Graduate Considered: Working within the SRE team you will be responsible for the architecture of a mission-critical cloud platform for an industry-leading software company. You will diagnose issues within complex systems, identify root causes More ❯
we're looking for a Site Reliability & Platform Engineer to help lead the way. You'll sit at the heart of our engineering operations, bringing together SRE principles and modern platform engineering practices. This includes combining principles of SRE - such as service-level reliability, observability, incident response - with platform engineering practices like GitOps, Infrastructure … enablement, to help development teams ship faster, safer, and more cost-efficiently. What you'll be doing: Designing and operating highly reliable, scalable, and secure Azure-based platforms Applying SRE principles like SLOs, observability, and incident management to drive service reliability Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows Enabling teams through platform tools, reusable Terraform modules … This is a great opportunity for someone passionate about building robust infrastructure and enabling others to move faster and more securely. You might come from a cloud engineering, SRE, or DevOps background - what matters most is your curiosity, systems thinking, and drive to improve operational efficiency. At Sorted, we are committed to fostering an inclusive environment where people from More ❯
Crawley, England, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment
Site Reliability Engineering (SRE) Lead - Up to £110,000 + Bonus & Benefits Location: Crawley, Hybrid (1 day a week onsite) Are you ready to lead a high-performing team and play a pivotal role … in shaping the future of infrastructure and reliability engineering? We're looking for an experienced Site Reliability Engineering (SRE) Team Lead/Technical Lead who is passionate about driving operational excellence and has a clear vision for building scalable, resilient systems. In this role, you'll lead from the front, mentoring a team of DevOps … securely. What You'll Be Doing: Lead, inspire, and develop a growing team of DevOps Engineers, setting clear goals and driving team performance. Define and deliver the company's SRE strategy, aligning it with wider technical and business goals. Collaborate across engineering, product, and security teams to ensure operational excellence and seamless delivery. Architect, implement, and manage secure, scalable More ❯
looking for a Site Reliability Engineer to join their highly skilled, innovative team. Essential skills: Strong proficiency in Python for infrastructure and automation Hands-on experience in SRE, DevOps or production engineering roles Deep understanding of monitoring, incident response workflows, and system architecture Productive approach to improving systems and reducing technical debt Strong collaboration and communication skills … working closely with developers, quants, and platform engineers Experience designing and delivering scalable, reliable production systems Proficiency with Linux/Unix systems Bachelor’s degree in CS, Engineering or a related field Familiarity with Kubernetes, Docker, or container orchestration technologies Experience with automation tools such as Terraform or Ansible Background in Go, Bash or other system-level languages Exposure … design and implement automation for operations, deployments, monitoring and incident management, as well as owning the observability stack (metrics, logs, traces and alerting). You will also: apply core SRE principles (SLIs, SLOs, error budgets) to enhance system reliability; build, document, and improve high-performance system designs; lead incident response and implement improvements; collaborate closely with quant developers/ More ❯
Knutsford, England, United Kingdom Hybrid / WFH Options
Jobs via eFinancialCareers
Site Reliability Engineer – Oracle Specialist Site … UK (Hybrid – 2 days in office, Mon/Tue preferred) Note: Sponsorship is not available for this role We are seeking a highly skilled Site Reliability Engineer (SRE) with deep Oracle expertise to join a forward-thinking team driving digital transformation and operational excellence. About The Role As an SRE, you’ll apply software engineering principles, automation … scalability of critical systems. This role requires Oracle expertise from day one, with a strong focus on OEM, OID, and performance tuning. You’ll be part of a growing SRE team, working closely with colleagues across multiple global locations, and contributing to the evolution of a modern, automated infrastructure. Key Responsibilities Ensure system availability, performance, and scalability through proactive monitoring More ❯
are already renowned as having game-changing technology within their industry, with exciting scope for expansion into further industries. This role is looking for someone to work within the SRE team responsible for incident response and issue resolution. Location: Cambridge Salary: £32,000 - £60,000 + excellent benefits (£32,000 for a new Graduate) Requirements For Site Reliability … of a role involving lots of problem solving, identifying the root causes of issues. Good logical reasoning Responsibilities For Site Reliability Engineer - Graduate Considered: Working within the SRE team you will be responsible for the architecture of a mission-critical cloud platform for an industry-leading software company. You will be diagnosing issues within complex systems and identifying … emailing (if this email address has been removed by the job-board, full details for contact are available on our website). Keywords: Site Reliability Engineer/SRE/DevOps/Software Engineering/Software Development/Engineering/Physics/Astrophysics/Python/Computer science/Cloud/Mathematics/AWS/Azure/ More ❯
London, England, United Kingdom Hybrid / WFH Options
Wayve Technologies Ltd
The role We're on the lookout for a Site Reliability Engineer (SRE) with a thirst for innovation and a desire to establish Operational Excellence and best practices. You'll be instrumental in fortifying the backbone of our AI-driven autonomous vehicles, ensuring they're robust, resilient, and ready to revolutionise urban mobility. This role isn't just … Champion automation to continuously elevate our efficiency, aiming to make manual interventions a thing of the past. About you In order to set you up for success as an SRE at Wayve, we’re looking for the following skills and experience. Essential Over 8 years' experience in Site Reliability Engineering or a similar role, especially in a More ❯
Liverpool, England, United Kingdom Hybrid / WFH Options
Concerto
Site Reliability Engineer - Liverpool (Hybrid Working) As a Site Reliability Engineer at Concerto (part of Bellrock Group), you will play a pivotal role in ensuring the reliability, performance, and scalability of our Intelligent Assets Management SaaS platform. You will lead the improvement of … infrastructure, DevOps, and monitoring across our systems—empowering the engineering team to release features faster and more safely. Your hands-on experience and strategic thinking will help embed SRE principles throughout the team, improving customer experience, system health and developer productivity. You’ll work across internal environments and customer-facing systems, shaping operational excellence and reliability at every More ❯
London, England, United Kingdom Hybrid / WFH Options
LoyaltyLion
the beginning of our growth trajectory. The Role We are looking for a Site Reliability Engineer to join our team and support our growth. Working with our SRE Lead, you will ensure the reliability, availability, and performance of our platform's infrastructure and systems. You will also support our Data team with provisioning and tuning our Data … the things you'll be doing Delivering clean, architecturally sound, maintainable, and secure infrastructure. Working closely with AWS infrastructure, particularly data services, to support database scalability and availability. Supporting engineering teams with infrastructure needs and platform support. Conducting performance tuning to optimize database performance and data processing efficiency. Implementing observability systems for infrastructure and data to ensure reliability, find areas for improvement, and proactively assess risks. Maintaining infrastructure with code, using Terraform to optimize AWS infrastructure. Documenting and promoting DevOps best practices across the engineering team. Conducting proofs-of-concept on new technologies and evaluating their fit for LoyaltyLion. Participating in blameless post-mortems to learn from incidents and prevent recurrence. Automating manual tasks to enhance More ❯
Airways the sky is never the limit. The Role: Site Reliability Engineer Join our innovative Digital team at British Airways as a Site Reliability Engineer (SRE), where you'll play a key role in ensuring the reliability, availability, and performance of the applications that keep our business flying. This is a pivotal role that supports … the strategic direction of SRE at BA, aligning technical work with business outcomes and driving continuous improvement in service delivery. What You'll Do: Maintain and enhance the reliability, availability, and performance of British Airways' applications through robust monitoring and troubleshooting. Contribute to the implementation of the SRE strategy, ensuring alignment with business objectives and operational priorities. Collaborate with … approach, with a positive attitude toward change and challenges. Confidence in decision-making and a commitment to achieving strategic goals. Your Experience: Demonstrable experience in supporting or leading an SRE function, with an understanding of aligning engineering work with business needs. Proficiency in cloud platforms such as AWS, Azure, or Google Cloud, and container orchestration with Docker and Kubernetes. More ❯
Site Reliability Engineer IOE: Cardano, gb Client: IO Global Location: gb, United Kingdom Job Category: Other - EU work permit required: Yes Job Reference: a9af62c65026 Job Views: 29 Posted: 17.06.2025 Expiry Date: 01.08.2025 Job Description: Who are we? IOHK is a technology … progress within our teams, our products and services are designed for people to be fearless, to be changemakers. What the role involves: As a Site Reliability Engineer (SRE) you are an integral part of our open-source project, ensuring the reliability, availability, and performance of our production systems. This role combines service operation, systems engineering and … software engineering principles to operate and monitor services as well as create or maintain tools, automations, and infrastructure code that bolster the efficiency and resilience of our platform. Design, write, and deliver tools and software primarily using Python, Bash, Terraform or Nix to improve the availability, scalability, and efficiency of our services. Engage in and refine the whole lifecycle More ❯
generating results that allow our clients to thrive. What You'll Do The Senior Director - Operations and Reliability Engineering is responsible for blending Site Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core … agility and operational resilience. Establish workforce development programs for AI-driven operations, automation, and modern reliability practices. What You'll Bring Required Qualifications: 15+ years of experience in IT operations, SRE, DevOps, or platform engineering. 5+ years in a senior leadership role, managing large-scale IT environments. Deep technical expertise in cloud computing (AWS, Azure, GCP), on-prem infrastructure, and hybrid environments. Proven … remediation. Strong understanding of zero-trust security, regulatory compliance, and risk management. Excellent leadership, communication, and stakeholder management skills. Preferred Qualifications: Certifications: ITIL, AWS/Azure/GCP Solutions Architect, SRE Foundation, CISSP, or equivalent. Experience with Kubernetes, Terraform, Ansible, and AI-powered operations tools. Strong problem-solving abilities, with a data-driven approach to operational excellence. The Senior Director - Operations Platform Lead is More ❯
generating results that allow our clients to thrive. What You'll Do The Senior Director – Operations and Reliability Engineering is responsible for blending Site Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core … agility and operational resilience. Establish workforce development programs for AI-driven operations, automation, and modern reliability practices. What You'll Bring Required Qualifications: 15+ years of experience in IT operations, SRE, DevOps, or platform engineering. 5+ years in a senior leadership role, managing large-scale IT environments. Deep technical expertise in cloud computing (AWS, Azure, GCP), on-prem infrastructure, and hybrid environments. Proven … remediation. Strong understanding of zero-trust security, regulatory compliance, and risk management. Excellent leadership, communication, and stakeholder management skills. Preferred Qualifications: Certifications: ITIL, AWS/Azure/GCP Solutions Architect, SRE Foundation, CISSP, or equivalent. Experience with Kubernetes, Terraform, Ansible, and AI-powered operations tools. Strong problem-solving abilities, with a data-driven approach to operational excellence. The Senior Director – Operations Platform Lead is More ❯
generating results that allow our clients to thrive. What You'll Do The Senior Director – Operations and Reliability Engineering is responsible for blending Site Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core … agility and operational resilience. Establish workforce development programs for AI-driven operations, automation, and modern reliability practices. What You'll Bring Required Qualifications 15+ years of experience in IT operations, SRE, DevOps, or platform engineering. 5+ years in a senior leadership role, managing large-scale IT environments. Deep technical expertise in cloud computing (AWS, Azure, GCP), on-prem infrastructure, and hybrid environments. Proven … remediation. Strong understanding of zero-trust security, regulatory compliance, and risk management. Excellent leadership, communication, and stakeholder management skills. Preferred Qualifications Certifications: ITIL, AWS/Azure/GCP Solutions Architect, SRE Foundation, CISSP, or equivalent. Experience with Kubernetes, Terraform, Ansible, and AI-powered operations tools. Strong problem-solving abilities, with a data-driven approach to operational excellence. The Senior Director – Operations Platform Lead is More ❯
Location: London, England, United Kingdom Join Axon and be a Force for Good. As an SRE contributor in Axon's Real Time Operations organization, you are passionate about delivering solutions to the real-time problems our mission-critical cloud native services encounter. You are also obsessed with achieving the high quality and reliability our customers demand. You will work … closely not only with your peers, but also the RTO engineering teams, allowing your technical deliverables to reach the entire engineering organization, enabling product teams to continuously deliver features on the vanguard of innovation and helping scale our products to thousands of agencies around the world. What You'll Do Location: London UK Build robust, easy-to-use … foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. Influence and educate the engineering organization to adopt new More ❯
the box and eager to nurture others and learn from them, then this is your challenge! The Team The Infrastructure as a Service (IaaS) team aims to uphold the reliability and scalability we expect from Algolia's infrastructure for its critical systems and products. Our focus is on enabling teams across Algolia to leverage this infrastructure while keeping it … under control through an ever-increasing level of automation. The Opportunity The Senior Site Reliability Engineer position within the IaaS team provides a dynamic opportunity for a professional with foundational experience in maintaining and optimising … scalable infrastructures. This role specifically concentrates on three key areas: server and container hosting, cloud and network expertise, and flawless observability. As a Senior Site Reliability Engineer (SRE), you will play a pivotal role in designing, implementing, and maintaining highly available, scalable, and fault-tolerant systems. Your work will directly impact the effectiveness and productivity of teams and More ❯
London, England, United Kingdom Hybrid / WFH Options
AudioStack
We're on a mission to democratize audio creation by building world-class audio infrastructure for our customers. As a Site Reliability Engineer, you'll play a key role in improving our platform's developer operations including observability … monitoring, and overall reliability. You would be part of a cross-functional team dedicated to implementing robust DevOps practices and enhancing infrastructure and site reliability engineering (SRE). A customer-focused mindset is essential, as the team collaborates closely with stakeholders to ensure solutions meet business and user needs. In addition to a focus on observability, you … type Full-time Job function Information Technology Industries IT Services and IT Consulting More ❯
are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability … and SOPs Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure Other duties as needed About You 7+ years' experience in Site Reliability Engineer roles 3+ years' experience with an object-oriented language (preferably Java, .NET or C++) Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools More ❯
are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability … and SOPs Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure Other duties as needed About You 7+ years' experience in Site Reliability Engineer roles 3+ years' experience with an object-oriented language (preferably Java, .NET or C++) Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools More ❯
Join to apply for the SR Site Reliability Engineer role at Wakapi. We are seeking a highly skilled Senior Site Reliability Engineer to join our Platform Engineering team. The ideal candidate will have a strong understanding of DevOps and Service Level Management (SLM) metrics, with experience in event-driven infrastructure projects using tools like … Terraform, New Relic, Kubernetes, AWS, and Kafka. As a Platform Engineering representative, you will collaborate with engineering teams to ensure our platform infrastructure tooling meets their needs and positively impacts Developer Experience. You will also assist in setting appropriate thresholds for alerts and automations related to their applications. Responsibilities Design, implement, and maintain scalable and highly available systems … ensuring observability through metrics, tracing, log aggregation, and alerting. Help teams determine settings and thresholds for alerts and automations based on application performance requirements. Monitor, optimize, and ensure system reliability and performance using tools like New Relic and applying DORA metrics. Track uptime, response times, and resolution times to ensure compliance with SLAs, SLOs, and SLIs. Implement and promote More ❯
with the subject line: “Application Support Request”. Role: Senior Site Reliability Engineer Location: London Job Type: Permanent Are you looking to take your SRE skills to the next level? We’ve got a great opportunity for you – Senior Site Reliability Engineer Careers at TCS: It means more TCS is a purpose-led transformation company, built … are built, not just maintained. Use data to prevent problems, not just react to them. Partner across teams to make performance, scalability, and user experience part of the whole engineering mindset. The Role As a Senior Site Reliability Engineer, you will be playing a key role in operational support, integration of applications and building and maintaining infrastructure. More ❯
Senior Site Reliability Engineer, London Location: London, United Kingdom Job Category: Other - EU work permit required: Yes Job Reference: 5c1405419a00 Job Views: 7 Posted: 02.06.2025 Expiry Date: 17.07.2025 Job Description: As a Senior Site Reliability Engineer, you will … be working alongside our autonomous cross-functional squads. You will advocate high-quality engineering and best-practice in production software as well as providing the infrastructure to both build rapid prototypes and launch production-quality services. You must be a strong communicator who can explain what is required to build and deliver top quality software products. You will be … authentication, network topology, sharded databases, scalable web services and interfaces to external data sources and APIs. Responsibilities: Implementing software solutions for cloud infrastructure in accordance with specification and best engineering practices. Working towards improving long-term infrastructure availability and reliability. Monitoring and handling incident response of the infrastructure, platforms and core engineering services. Constructing pipelines to automate infrastructure More ❯
automotive software development. The right candidate will have excellent communication skills, solid coding skills, expertise in building scalable, reliable, highly available and fault-tolerant systems, broad knowledge of software engineering and site reliability engineering in areas such as Large-Scale Data and Compute Infrastructure, Stream Processing, Kubernetes, High-Performance Networking, Observability and Infrastructure Automation. RESPONSIBILITIES Set … maintain, optimize and support large scale, multi-region, multi-cloud compute and storage infrastructure powering our data platform and mission critical services. Work with fellow Data Infrastructure engineers and Site Reliability engineers to ensure our systems are scalable, reliable, fault-tolerant, highly available, highly performant, and observable. Manage incidents, triage product or system issues and debug/track …/resolve by analyzing the root cause of these issues and the impact on users & operations. Work closely with other Data Infrastructure engineers, Site Reliability engineers, ML Platform engineers, Computer Vision and ML engineers on high-impact projects to create innovative solutions to problems in the self-drive space. Mentor junior engineers in their day to day work More ❯
Site Reliability Engineer (DV Security Clearance) Position Description CGI was recognised in the Sunday Times Best Places to Work List 2025 and has been named one of the 'World's Best Employers' by Forbes magazine. We offer a competitive salary, excellent pension, private healthcare, plus a share scheme (3.5% + 3.5% matching) which makes you a member not … agencies' most challenging problems. Our teams work alongside our clients to help them understand how to exploit technologies to maintain competitive advantage. Our systems are engineered for performance, security, reliability and scalability; built with modern CI and CD tooling and techniques. We are currently looking for an experienced cloud infrastructure engineer to join our team - being able to think … all of the skills we need, we would consider high quality individuals who meet most of the criteria. Required qualifications to be successful in this role • Background in Software Engineering, including the development of automation scripts, infrastructure as code, creating tooling or frameworks and feature development, ideally using Java and/or Python. • Experience of engineering enablement products More ❯