excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing SCRUM processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing SCRUM processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
Cambridge, Cambridgeshire, East Anglia, United Kingdom
RedTech Recruitment
game-changing technology within their industry, with exciting scope for expansion into further industries. This role is looking for someone to work within the SRE team responsible for incident response and issue resolution. Location: Cambridge. Salary: £32,000 - £60,000 + excellent benefits (£32,000 for a new graduate). Requirements … problem solving, identifying the root causes of issues. Good logical reasoning. Responsibilities for Site Reliability Engineer (Graduate Considered): Working within the SRE team, you will be responsible for the architecture of a mission-critical cloud platform for an industry-leading software company. You will be diagnosing issues … been removed by the job-board, full details for contact are available on our website). Keywords: Site Reliability Engineer/SRE/DevOps/Software Engineering/Software Development/Engineering/Physics/Astrophysics/Python/Computer science/Cloud/Mathematics/AWS …
About this Opportunity: Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you'll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … engineers. What You'll Need: Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code and … CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP Cloud SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical problems …
Services Provider who are looking for a Site Reliability Engineer to join their team where you will manage and guide the SRE team. As a Site Reliability Engineer, you will be responsible for: Manage and guide the Site Reliability Engineering (SRE) team … while promoting SRE principles throughout FCA product groups. Serve as the SRE subject matter expert and strategic lead within the delivery organization. Oversee and advise on daily operations related to observability tools, including their upkeep and optimization. Provide hands-on support to engineering teams for delivering observability initiatives, as needed. … the reliability of quality assurance results. Proven skills and experience to help you succeed in this role: Strong experience with the primary role of SRE Engineer. Strong experience in DevOps tools (GitHub, GitHub Actions, Workflow, CodeQL, Jenkins, Nexus, CloudFormation/Terraform etc.). Strong experience in monitoring tool …
Join us! Job Description: This job is responsible for partnering with engineering and technology teams to implement measures as prescribed by lead/senior SRE engineers. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting and on-call routines are in place for key services, identifying root causes of issues … scripts, tools and libraries and leverages them for common instrumentation, automation, and operational needs, and when mentoring Site Reliability Engineer (SRE) resources on reliability practices and established tools/capabilities. Collaborates with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined … in the application and system monitoring designs put forward by the SRE Lead. Partners to implement code changes to make use of common reliability libraries and tools and helps Application Production Services and Application Development teammates understand how to use them. Identifies vulnerabilities and opportunities for reliability improvement …
Graduate DevOps Engineer/SRE All top graduates with tech-related degrees should read this! If you have a passion for building things, love constantly solving interesting challenges and also enjoy some coding, then we would encourage you to explore a career in DevOps & Site Reliability … demand for this skill set is high, the role is interesting and varied and it is quite rare to see entry-level DevOps or SRE positions advertised. If you're already an experienced DevOps Engineer or Site Reliability Engineer we also really want to hear from … Salary: £35,000 - £70,000 per annum + excellent benefits (£35,000 for a new graduate, more depending on experience). Requirements for Graduate DevOps Engineer/SRE: This company hires some of the very brightest engineers and is looking for a 2.1 or 1st class honours degree from a leading international university …
Site Reliability Engineer. London - 3 days in office mandatory. Full-time permanent. Up to £90,000 + 5 annual performance-related bonus. We have an exciting new opportunity for a Site Reliability Engineer to join Robert Walters as a Consultant. As an Employed Consultant … rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems! As an SRE, you will join a Global Investment Bank within the SRE Payments Technology Team in the Corporate & Investment Bank line of business, where you will solve complex … and iteratively improve on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform. Job responsibilities: * Guides and assists others in the areas of building appropriate level designs and gaining consensus …
SR2 | Socially Responsible Recruitment | Certified B Corporation™
Cloud SRE (Site Reliability Engineer). 2 days a week onsite in Bristol. Up to £105,000 depending on experience. Brilliant Benefits Package. One of SR2’s standout clients is on the lookout for a skilled Cloud SRE (Site Reliability Engineer) to join their growing … member of a cloud-focussed team of super talented engineers working with the teams within the business to influence and drive the adoption of SRE best practices and ways of working specifically within microservices. You will ideally be from an engineering background – this is a through-and-through engineering-focussed …
Job Title: Site Reliability Engineer. Location: London, UK. Rate/Salary: .00 GBP Daily. Job Type: Contract. We at TEKsystems are on the lookout for a Site Reliability Engineer to support one of our market-leading clients. Key purpose of the role - Release and … the following experience: Managed/implemented large scale distributed server systems within Azure. Knowledge of languages such as PowerShell, C#. Problem solving as a Site Reliability Engineer. Worked on modern release pipelines - CI/CD (Octopus Deploy/Azure DevOps/TeamCity). Interested? Please apply within or …
enabler of Capital One's ambitions. We are keen to add a Senior Site Reliability Engineering Manager (SSREM) to our Nottingham-based SRE organisation whose primary focus is to provide effective leadership as we evolve and mature site reliability practices for the benefit of our cloud … applications and their customers. The successful candidate will be a leader of leaders with custodianship of application services across 5+ SRE teams. We're looking for an experienced professional whose technical background allows effective challenge and support of teams managing primarily Java-based applications running in a dynamic IaaC AWS … outcomes in the pursuit of business, functional and personal goals. The successful applicant will lead by example, build strong and valuable relationships within the SRE org, wider tech and business stakeholders. They have the ability to face ambiguity and understand how to make sense of complexity, importantly being able to …
The Site Reliability Engineer (SRE) will play a key role in maintaining and scaling infrastructure, ensuring reliability, performance, and scalability. You will collaborate closely with development, operations, and security teams to improve the reliability and efficiency of applications, addressing incidents, automating processes, and managing infrastructure … GCP, or Azure), automate provisioning using Terraform or CloudFormation, and manage resources for optimal performance. Monitor, troubleshoot, and resolve incidents, optimizing systems to ensure reliability and minimize downtime. Implement monitoring (Prometheus, Grafana, Datadog) and set up alerting systems to proactively address issues and ensure scalability. Work with DevOps, engineering … AWS Certified DevOps Engineer, Google Professional Cloud Architect, or similar. Containerization & Orchestration: Experience with Docker, Kubernetes, or ECS/EKS for containerized applications. SRE Experience: Familiarity with SRE principles like SLAs, SLOs, and error budgets, and practical application of those in large-scale systems. Distributed Systems: Understanding of microservices …
up, we want like-minded humans to join us on this exciting journey. Are you ready? As a Site Reliability Engineer (SRE), you will play an important role in designing, building, and maintaining the infrastructure and tools necessary to support our software applications and services. You will … collaborate closely with the product engineering squads, technical operations, and security teams to ensure the reliability, scalability, and security of our platform. Your responsibilities will include automating infrastructure provisioning, configuration management, and deployment pipelines, utilizing best practices and modern technologies to streamline processes and improve efficiency. You will also … be responsible for monitoring system performance, identifying bottlenecks, and implementing solutions to enhance system reliability and performance. Key Responsibilities Cloud Platform Management: Using Azure/AWS to manage and optimize infrastructure components, ensuring scalability, reliability, and cost management. Infrastructure Design and Implementation: Designing, building and maintaining the cloud …
City of London, London, United Kingdom Hybrid / WFH Options
Sanderson Recruitment
Role: Site Reliability Engineer. Location: London (Hybrid). Salary: £80,000 - £105,000. As our Site Reliability Engineer, you'll work closely with our feature team and other colleagues to meet defined service level objectives and continually improve systems and environments. You'll define error … Candidate: Very strong engineering skills in Java, JavaScript or Python. OpenTelemetry experience. Must have Core Java/Python. Must have experience as an SRE. Knowledge of Python data structures. Strong knowledge of deploy and release services, automation and troubleshooting. Experience of utilising tools and technology across the software development …
Reading, England, United Kingdom Hybrid / WFH Options
People Source Consulting trading as Experis
Site Reliability Engineer - DevOps Engineer. 18 Month Contract PAYE - Fully Remote or Hybrid based in Midlands if preferred. The role: We are working with one of the finest gaming studios in the industry and are on the lookout for an … exceptional Site Reliability Engineer who can bring their expertise and unique thinking to help make their team even stronger! As an SRE, the main purpose is solving for scale through collaboration and automation, bringing engineering principles to infrastructure and operational problems. Work closely with the different teams …
purpose. About this opportunity: Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you'll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … engineers. What you'll need: Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code and … CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP Cloud SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical problems …
availability, reliability, security, and velocity. Maintain effective feedback loops so that findings can be prioritised and acted upon in a timely fashion. Follow SRE and DevOps core principles to drive adoption and utilisation. Where applicable, work with IT Services and Delivery functions to implement technical releases and maintenance plans … our Promises: Taking ownership to deliver. Setting clear expectations. Respecting others. Skill/Experience: Strong background in one or more of the following areas: SRE/application support/IT operations/infrastructure/software development/DevOps. Experience working within both Agile and ITIL frameworks. Experience working with DevOps … principles and concepts such as CI/CD and IaC. Experience of SRE environments and processes specifically in the areas of availability, incident management and monitoring. Excellent analytical and problem-solving skills. Effective communication skills, both written and verbal. Ability to work well in high-pressure situations. Experience using Azure …
availability, reliability, security, and velocity. · Maintain effective feedback loops so that findings can be prioritised and acted upon in a timely fashion. · Follow SRE and DevOps core principles to drive adoption and utilisation. · Where applicable, work with IT Services and Delivery functions to implement technical releases and maintenance plans … our Promises: · Taking ownership to deliver · Setting clear expectations · Respecting others Skill/Experience · Strong background in one or more of the following areas: SRE/application support/IT operations/infrastructure/software development/DevOps. · Experience working within both Agile and ITIL frameworks. · Experience working with DevOps … principles and concepts such as CI/CD and IaC. · Experience of SRE environments and processes specifically in the areas of availability, incident management and monitoring. · Excellent analytical and problem-solving skills. · Effective communication skills, both written and verbal. · Ability to work well in high-pressure situations. · Experience using Azure …
Job Description: Site Reliability Engineer with Python. Our client is looking to bring on a site reliability engineer to help deploy, manage, troubleshoot, and enhance our complex cloud-based set of internal tools and externally managed services for a variety of users across our wide … ranging organization. You will have at least 7 to 10 years of hands-on expertise working as a Site Reliability Engineer. You will work closely with IT, product, and engineering to extend and maintain this set of tools and services and to help debug and resolve problems. In addition …
Job Title: Infrastructure Site Reliability Engineer. Location: Remote, occasional travel to the London office is required. Duration: 3 months (with potential to extend). Day rate: Competitive (Inside IR35) Deloitte Working with the Deloitte Associate (Contractor) Programme means we can offer you the opportunity to work on a … suit your experience should you wish to continue with Deloitte. The Role: An exciting opportunity for a skilled Site Reliability Engineer (SRE) to work with the security team of a large Social Media Platform organisation. You will design, implement and maintain a robust security log migration pipeline … health, analyse data to identify and prevent potential problems while implementing best practices to keep services running efficiently. Essential Skills: We are looking for SREs who can: code without utilising scripts; have software development experience; write and execute scripts using the Bash shell; experience in live support monitoring as …
The Site Reliability Engineering (SRE) team at Pendo is responsible for provisioning and maintaining cloud infrastructure from development through production for all product initiatives, and working with developers and product managers to ensure that our products are not only reliable and performant, but also cost-efficient. Our platform … on-call and incident management functions, supporting a high-throughput platform which processes more than 15 billion events per day. To ensure the reliability of this environment for our customers, SREs work closely with developers and product managers to understand service level objectives, think through failure scenarios, and design … systems which balance cost with reliability objectives. Additionally, SREs collaborate with the Information Security team to ensure that cloud infrastructure is properly secured, and that sufficient controls are in place to meet our compliance goals with respect to industry standards such as SOC 2. Role Responsibilities: Write high-quality …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
bet365 Group
A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability … of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration … is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user demands …
nurture others and learn from them, then this is your challenge! The Team: The Infrastructure as a Service (IaaS) team aims at upholding the reliability and scalability we expect from Algolia's infrastructure for its critical systems and products. Our focus is on enabling teams across Algolia to leverage … this infrastructure while keeping it under control through an always increasing level of automation. The Opportunity: The Senior Site Reliability Engineer position within the IaaS team provides a dynamic opportunity for a professional with foundational experience in maintaining and optimising scalable infrastructures. This role specifically concentrates … on three key areas: Server and container hosting, cloud and network expertise and flawless observability. As a Senior Site Reliability Engineer (SRE), you will play a pivotal role in designing, implementing, and maintaining highly available, scalable, and fault-tolerant systems. Your work will directly impact the effectiveness …