Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote Location: Remote (occasional travel to Nottinghamshire HQ) Salary: Up to £85,000 per annum + benefits Start Date: ASAP Charles Simon Associates are working with a global organisation who are looking to recruit a Site Reliability Engineer (SRE) on a … permanent basis. This is an exciting opportunity to join a forward-thinking business where reliability, scalability, and automation are at the heart of technology delivery. Responsibilities include: Designing and enforcing SLOs, SLIs, and SLAs to ensure high reliability and performance. Building and maintaining monitoring/observability solutions (Datadog, Grafana, Azure Application Insights, Log Analytics). Managing Infrastructure as … Reliability Engineering and want to work in an environment where “that will do” is never good enough, this role is for you. Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote
Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote THIS IS AN AZURE FOCUSED ROLE, IF YOU APPLY AND DO NOT WORK EITHER SOLELY OR MAINLY ON AZURE YOU WILL NOT BE CONSIDERED. Location: Remote (occasional travel to Nottinghamshire HQ) Salary: Up to £95,000 per annum + benefits Start Date … ASAP Charles Simon Associates are working with a global organisation who are looking to recruit a Site Reliability Engineer (SRE) on a permanent basis. This is an exciting opportunity to join a forward-thinking business where reliability, scalability, and automation are at the heart of technology delivery. Responsibilities include: Designing and enforcing SLOs, SLIs, and SLAs to … Reliability Engineering and want to work in an environment where “that will do” is never good enough, this role is for you. Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote
Wigan, Lancashire, England, United Kingdom Hybrid/Remote Options
Searchability
SITE RELIABILITY ENGINEER £40k salary Join a growing, technology-driven business operating at scale within the online gaming and sports sector. Opportunity to shape the SRE strategy. ABOUT THE CLIENT Our client is a fast-growing digital technology company at the forefront of delivering high-availability platforms for the sports and gaming industry. They pride themselves on innovation … Engineer to strengthen their engineering function and help evolve their observability and automation capabilities. THE BENEFITS Hybrid working model (office and remote) Opportunity to define and lead SRE strategy within a collaborative culture Exposure to modern cloud-native and containerised environments THE SITE RELIABILITY ENGINEER ROLE: As a Site Reliability Engineer, you'll ensure … performance testing, system tuning and incident management to ensure smooth operation during critical events. SITE RELIABILITY ENGINEER ESSENTIAL SKILLS At least 2 years' experience working as an SRE Deep understanding of system reliability, scalability and performance tuning Experience with observability tools (Grafana, Prometheus, OpenTelemetry) Proficiency in a programming language such as Go or .NET for automation and …
Wigan, Greater Manchester, United Kingdom Hybrid/Remote Options
Searchability (UK) Ltd
SITE RELIABILITY ENGINEER £40k salary Join a growing, technology-driven business operating at scale within the online gaming and sports sector. Opportunity to shape the SRE strategy. ABOUT THE CLIENT Our client is a fast-growing digital technology company at the forefront of delivering high-availability platforms for the sports and gaming industry. They pride themselves on innovation … Engineer to strengthen their engineering function and help evolve their observability and automation capabilities. THE BENEFITS Hybrid working model (office and remote) Opportunity to define and lead SRE strategy within a collaborative culture Exposure to modern cloud-native and containerised environments THE SITE RELIABILITY ENGINEER ROLE: As a Site Reliability Engineer, you'll ensure … performance testing, system tuning and incident management to ensure smooth operation during critical events. SITE RELIABILITY ENGINEER ESSENTIAL SKILLS At least 2 years' experience working as an SRE Deep understanding of system reliability, scalability and performance tuning Experience with observability tools (Grafana, Prometheus, OpenTelemetry) Proficiency in a programming language such as Go or .NET for automation and …
Prestigious opportunity with a Global Investment Giant for a Site Reliability Engineering (SRE) Manager to be based in our Manchester HQ, leading a talented team of engineers dedicated to maintaining and enhancing the reliability of our systems. Working closely with cross-functional teams across the globe, including business stakeholders, product managers, and software engineers, you will … role has an opportunity to provide strategic guidance on improvements. At the forefront of providing production support services including incident logging, incident resolution, problem management, change management practices, and SRE support, we are inviting you to join our success story. As our Site Reliability Engineering Manager you will: Lead, coach, and develop a high-performing SRE team. … for incident response, root cause analysis, and post-mortem reviews to prevent future incidents. Work closely with business and technology teams to understand their needs and ensure alignment with reliability and uptime goals. Facilitate communication and collaboration across global teams. Drive the development and adoption of automation tools to improve efficiency and reduce manual intervention. Establish and maintain comprehensive …
Employment Type: Permanent
Salary: £90,000 - £100,000/annum, to £140,000 package
Hereford, Herefordshire, West Midlands, United Kingdom Hybrid/Remote Options
Twinstream Limited
Site Reliability Engineer (Contract – Outside IR35) | Hybrid – Hereford (with occasional site visits) | Day Rate: £500–£600 Build, Optimise, and Secure the Systems That Power the UK's Most Critical Infrastructure In 2019, a team of engineers solving complex cross-domain challenges inside government organisations came together to form TwinStream, a business built on technical excellence, innovation, and … cross-domain and cloud infrastructure solutions to some of the most high-profile government programmes in the country. Now we're looking for a Site Reliability Engineer (SRE) to join us on a contract basis: someone ready to own reliability, drive automation, and keep our systems performing flawlessly. The Site Reliability Engineer Role: As a … Security Clearance: Due to the nature of our projects, candidates must be eligible for DV Clearance. Ready to Deliver Reliability That Matters? If you're an experienced SRE who thrives on solving complex infrastructure challenges, driving automation, and delivering secure, high-performance systems, we want to hear from you. Apply now and join TwinStream, where your engineering …
Site Reliability Engineer (SRE) Central London (Hybrid 3 days per week in the office) £65,000 - £75,000 per annum + Excellent Benefits We're working with an innovative software company that's scaling its platform to support rapid customer growth and product expansion. They're looking for a Site Reliability Engineer (SRE) to join their platform team and … performance into the software lifecycle. Managing and evolving CI/CD pipelines to ensure smooth deployments and rollbacks. Contributing to incident response, post-mortems, and reliability improvements. Championing SRE principles such as error budgets, SLIs/SLOs, and automation-first thinking. What We're Looking For Strong experience running cloud infrastructure (AWS preferred) in production. Proven background in Kubernetes operations … engineering culture. Influence how reliability and performance are engineered at scale. Work with talented developers and DevOps engineers in a collaborative environment. AWS | Site Reliability | SRE | Cloud | Kubernetes | Terraform | CI/CD | Observability | Python | Go | Automation Click APPLY NOW to be considered for this position! Follow ReVybe IT Recruitment to stay up to date with the …
The Team Lead - Site Reliability Engineering is responsible for ensuring the effective and efficient running of the current NOC team with a view to transition to an SRE function over time. The team is responsible for enabling innovation and velocity of change while ensuring system reliability focusing on the critical features and functionality within products and platforms. … Customer Excellence and Continual Service Improvement within the team. Identify, develop, communicate, and implement process changes within the team. Act as a point of escalation for the team. SRE responsibilities: Help define the SRE practice for the organisation, collaborate with other stakeholders to select the relevant SRE principles, define the objectives and measurements of the outcomes. Collaborate with stakeholders … to product owners and key stakeholders. Design, code, test and deliver solutions to automate manual operation (i.e., "TOIL"). Participate in operations support and on-call rotation shifts, for SRE supported systems and products. Participate in or lead problem management activities, including post-mortem incident analysis, and provision of technical insight, documented findings, outcomes and recommendations as part of a …
Mid-Level Site Reliability Engineer (SRE) Are you an experienced Site Reliability Engineer with a passion for building reliable, scalable systems that empower innovation? Our client is looking for a skilled Mid-Level SRE to join our growing technology team. In this role, you’ll help ensure our infrastructure is stable, secure, and efficient - supporting the … applications that drive support our clients. The Role We are seeking a mid-level Site Reliability Engineer (SRE) to join our technology team, helping to ensure the smooth operation and reliability of our infrastructure. You’ll play a vital role in maintaining uptime, managing deployments, and supporting other team members. This is a hands-on position suited … performance, and availability of production systems. Perform regular updates, patching, and maintenance across environments. Manage infrastructure provisioning using Terraform, Ansible, and AWS. Collaborate & Support Work closely with the junior SRE to develop their practical experience and technical confidence. Partner with developers, data scientists, and business users to resolve technical issues. Automate & Optimise Contribute to configuration management and automation improvements. Identify …
Birmingham, West Midlands, United Kingdom Hybrid/Remote Options
DWP Digital
Senior Site Reliability Engineer Pay up to £78,517 plus 28.97% employer pension contributions, hybrid working, flexible hours, and a truly great work life balance. DWP. … Digital with Purpose. We have a fantastic opportunity to join our community of experts at DWP Digital as a Senior Site Reliability Engineer, within one of our SRE teams at the heart of Digital Transformation. We're using fresh ideas and leading-edge tech to build and maintain digital solutions that will be used by nearly every person … Demonstrable experience of developing and supporting cloud-based applications in AWS & Azure. Incident Resolution: Strong experience in resolving complex technical incidents, ensuring minimal downtime and swift recovery. Reliability Engineering: Expertise in reliability engineering, including capacity and performance management through effective monitoring, logging, and alerting. Leadership: Demonstrated ability to engage with stakeholders at all levels …
Site Reliability Engineer | Bristol (3 days onsite, 2 days remote) | £65,000–£95,000 DOE Join a Team That Builds the Backbone of Secure, High-Performance Systems In 2019, a group of engineers solving complex cross-domain challenges inside government organisations decided to take things further, and TwinStream was born. Our mission? To deliver technical excellence, operational reliability … exceptional client service for some of the UK’s most high-profile government projects. Now, we’re growing, and we’re looking for a Site Reliability Engineer (SRE) who’s ready to shape the next generation of resilient, high-impact infrastructure. Key Responsibilities of the Site Reliability Engineer: As a TwinStream SRE, you’ll sit at … ensuring the availability, performance, and cost-effectiveness of mission-critical cross-domain services. You’ll work closely with feature development teams and our BAU/Support engineers to: Drive reliability, performance, and scalability across cloud and on-prem systems Automate toil, reduce alert fatigue, and evolve monitoring capabilities Build and refine CI/CD pipelines to enable seamless delivery …
the future of AI. Together, we can make a meaningful impact. See more about our culture on Role Summary We are seeking highly experienced Site Reliability Engineers (SRE) to shape the reliability, scalability and performance of our platform and customer-facing applications. You will work closely with our software engineers and research teams to ensure our systems … meet and exceed our internal and external customers' expectations. Location: Paris or London Reporting line: Head of Engineering What you will do As a Site Reliability Engineer, you balance the day to day operations on production systems with long term software engineering improvements to reduce operational toil and foster the reliability, availability, and performance of … projects, research publications, blog articles and conferences About you Master's degree in Computer Science, Engineering or a related field 7+ years of experience in a DevOps/SRE role Strong experience with cloud computing and highly available distributed systems Exposure to site reliability issues in critical environments (issue root cause analysis, in production troubleshooting, on call …
Manchester, Lancashire, United Kingdom Hybrid/Remote Options
MI5
healthy work-life balance and offer a range of working patterns, including full time, part time, and compressed hours. Hybrid working, which refers to a combination of working on site and from home, may be more limited due to the nature of the work. However, some homeworking may be available depending on business needs. We also support flexible start … to the architecture and design of both new and existing systems, establish and promote best practices, and deliver high quality software solutions. Drawing on your expertise in various software engineering methodologies, you'll introduce fresh ideas and innovative approaches that make a real impact at the core of our mission: keeping the UK safe, both in the real world … and online. You'll bring a genuine enthusiasm for discovering and applying new software engineering techniques. As part of a wider network of peers, you'll collaborate and learn from others. With your experience, you'll set the standard, introduce innovative ways of working, and identify new priorities. Whether leading and mentoring a team or acting as the technical …
Role: Site Reliability Engineer Location: London or Manchester (2 days per week on-site + monthly team days rotating between locations) Duration: 6 months Rate: £675 per day (Inside IR35) Security Clearance: Must hold active SC clearance due to project timings Team Size: Core team of 10 within a wider programme of 60+ and expanding Experience Level … 4+ years About the Role We're looking for an experienced Site Reliability Engineer to support a major central government programme, ensuring the reliability, scalability, and performance of critical digital services. You'll help shape modern cloud infrastructure and enable engineering teams to deliver secure and resilient platforms. Essential Requirements Technical Skills AWS Services: Strong proficiency … administration skills Leadership & Approach Technical Direction: Capable of providing technical leadership and mentoring without direct line management responsibilities System Design: Skilled in designing and implementing efficient, high-performing systems Reliability & Scalability: Focused on delivering robust, maintainable, and secure infrastructure solutions Collaboration: Comfortable working within cross-functional teams and engaging stakeholders across central government What You'll Be Doing Building …
Milton Keynes, Buckinghamshire, England, United Kingdom Hybrid/Remote Options
REDTECH RECRUIT
engineering practices Experience of large scale platform migrations, customer transitions and maintaining service continuity Background working with blended QA teams and embedding quality engineering practices Understanding of SRE principles, Infrastructure as Code and cloud operations in product driven environments Excellent ability to drive collaboration across engineering, product, security, professional services and customer facing teams Highly advantageous experience … partnership, define quality metrics, integrate QA within agile workflows and drive continuous improvement in automation and quality engineering Lead the evolution of cloud operations into a product aligned SRE function, embedding Infrastructure as Code, reliability principles and operational excellence Partner closely with product, CTO and wider business teams to ensure delivery aligns with strategic priorities Collaborate with sales … Manager/VP Engineering/Software Engineering Lead/SaaS Engineering Lead/Cloud Engineering Manager/Technical Director/Platform Engineering/SRE Leadership/Azure/AWS/GCP/SaaS Migration/Microservices/Infrastructure as Code/Kubernetes/CI/CD/AI Engineering/Machine Learning/ …
Head of Performance & Reliability Engineering Full-Time - Hybrid (3 days in Cambridgeshire) Up to £95,000 + Bonus This is an exceptional opportunity to join a major organisation at a pivotal stage in their digital transformation. As Head of Performance & Reliability Engineering you'll shape strategy, lead performance testing and chaos engineering initiatives, and embed … reliability best practices across engineering, DevOps … and infrastructure teams. This is a senior, strategic leadership role focused on system excellence, observability, and continuous improvement. Ideal Candidate: Proven experience leading Performance Engineering, Reliability, or SRE functions Deep expertise in performance testing methodologies (load, stress, spike, soak) Strong hands-on background with LoadRunner and Dynatrace (plus tools such as NeoLoad, k6, or JMeter) Skilled in chaos …
customer's systems are built and maintained. This role blends operational product support with software engineering to create applications to understand the overall health of our systems. The SRE team sits within a wider programme at the core of the customer mission. The role holder As an SRE, fundamentally you will be doing work that has historically been done … expertise to substitute automation for human labour, with the objective of limiting traditional manual operations work (incident tickets, on-call etc.) to no more than half of the SRE team's time (and aiming for considerably less). You will have an enthusiasm to learn and experiment, to develop tools to understand application health and improve their reliability … enable them to be scalable and resilient to failure, and how to get the best out of the infrastructure they are deployed to. Participating in the wider DevOps/SRE community within the organisation. Competencies It is desirable for you to have experience in the areas below. However more valued for this role is that you have excitement and enthusiasm …
Site Reliability Engineer/SRE/DevOps/AWS/IaC/Manchester/Permanent/Remote/£50,000 - £60,000pa Vivo Talent is proud to be partnering with a market-leading software organisation to recruit a talented Site Reliability Engineer (SRE) to join their growing team. This is a fantastic opportunity to play a … pivotal role in designing and maintaining reliable, scalable infrastructure that keeps the business running smoothly and enables innovation at scale. As the SRE, you'll take ownership of ensuring systems remain stable, efficient, and secure - while also having the chance to mentor junior team members and help shape the foundations of a growing engineering function. If you're passionate … Optimise: Use tools like Terraform, Ansible and AWS to manage infrastructure and enhance automation. Collaborate & Support: Work hand-in-hand with cross-functional teams and help develop the junior SRE through mentoring and knowledge sharing. Monitor & Troubleshoot: Strengthen monitoring systems (moving from Nagios to Datadog) and take ownership of incident management. What You'll Bring Solid experience in SRE or …
our team. Thus, we are building the firm around exceptional talent. Position Overview The Junior Site Reliability & Network Engineer will be working alongside the head of the SRE team to ensure the reliability, scalability, and performance of our trading platform. This role will blend traditional system administration with software development and require expertise in cloud network engineering, particularly with AWS. The ideal candidate will understand network routing and networking stacks. Key Responsibilities System Reliability and Performance Monitor, maintain, and optimize the performance, availability, and scalability of our trading platform. Respond to and resolve system incidents, ensuring minimal downtime and swift recovery. Collaborate with the development team to design and implement system enhancements. Infrastructure Management Under … implement improvements to our infrastructure and development processes. Participate in post-mortem reviews to identify areas of improvement after incidents. Qualifications Minimum of 2 years of experience in a Site Reliability Engineering, DevOps, or similar role. Strong experience with hybrid cloud and on-prem system management. Expertise in cloud network engineering, particularly with AWS. Proficiency in …
do at CMC Markets, and staying true to that has been pivotal to our success. CMC Markets is seeking an experienced and proactive Site Reliability Engineering (SRE) Manager to establish and lead a new SRE function within the IT Production department. This is a key leadership role responsible for defining the SRE strategy, implementing best practices, and … resilience across the trading platforms Ensure new systems are aligned with best practices Drive improvements and alignment in observability and monitoring tools, improving MTTD and MTTR Produce analysis on SRE function performance Provide guidance, recommendations and hands-on support to teams, promoting SRE best practices Develop and maintain a roadmap for continuous improvement of support and observability Maintain personal/… role Read and comply with CMC policies and procedures as they relate to your employment Complete all mandatory compliance training KEY SKILLS AND EXPERIENCE 2 years' experience in an SRE function or similar in hybrid cloud/on prem environment 7 years' experience in IT operational roles working with highly reliable systems Experience in modern development methodologies and languages Proficiency …
Azure Site Reliability Engineer | 6 month contract | Onsite 2/3 days per week | £650 per day Inside IR35 Opus RS are looking for a Senior Site Reliability Engineer with deep expertise in Azure cloud migration and a strong DevOps background to join our client's team. What We're Looking For Previous experience as a Site Reliability Engineer Strong skills in Terraform, GitHub, AKS, and networking (load balancing, Firewalls, routing). Proven track record in Agile delivery and DevOps practices. Extensive experience with Azure and cloud migration using frameworks … like CAF and WAF. Ability to communicate effectively with technical and non-technical stakeholders. Familiarity with change control processes and performance monitoring. If you're a results-driven Senior SRE ready to tackle new cloud challenges and deliver innovative solutions, we'd love to hear from you, please contact me at …
Strong expertise in implementing Site Reliability Engineering (SRE) principles. Advanced knowledge of establishing observability using tools Dynatrace & Datadog (primary skills). Proficiency in automation & scripting using Python & Ansible (primary skills). Strong experience with cloud platforms AWS & Azure (primary skills). Solid understanding of containerization and orchestration tools like Docker and Kubernetes. Proficiency in cloud native … distributed systems & microservices architecture. Exposure to AI/ML techniques for predictive analytics and automated problem resolution. Familiarity with CI/CD pipelines & enabling automated release & deployment engineering solutions. Good to have experience with chaos engineering tools like Gremlin or Chaos Monkey and implementing automation frameworks for resilience tracking. Ability to manage and prioritize multiple projects in a … fast-paced environment. Strong interpersonal and communication skills to work effectively across teams. Excellent problem solving, analytical thinking, and adaptability. Strategic mindset balancing engineering excellence with business priorities.