Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote Location: Remote (occasional travel to Nottinghamshire HQ) Salary: Up to £85,000 per annum + benefits Start Date: ASAP Charles Simon Associates are working with a global organisation who are looking to recruit a Site Reliability Engineer (SRE) on a … permanent basis. This is an exciting opportunity to join a forward-thinking business where reliability, scalability, and automation are at the heart of technology delivery. Responsibilities include: Designing and enforcing SLOs, SLIs, and SLAs to ensure high reliability and performance. Building and maintaining monitoring/observability solutions (Datadog, Grafana, Azure Application Insights, Log Analytics). Managing Infrastructure as … Reliability Engineering and want to work in an environment where “that will do” is never good enough, this role is for you.
Site Reliability Engineer – (SRE, Terraform, AKS, Azure, Kubernetes, PowerShell, Python, Bash, Datadog, Monitoring Tools) – Permanent – Remote Location: Remote (occasional travel to Nottinghamshire HQ) Salary: Up to £95,000 per annum + benefits Start Date: ASAP Charles Simon Associates are working with a global organisation who are looking to recruit a Site Reliability Engineer (SRE) on a … permanent basis. This is an exciting opportunity to join a forward-thinking business where reliability, scalability, and automation are at the heart of technology delivery. Responsibilities include: Designing and enforcing SLOs, SLIs, and SLAs to ensure high reliability and performance. Building and maintaining monitoring/observability solutions (Datadog, Grafana, Azure Application Insights, Log Analytics). Managing Infrastructure as … Reliability Engineering and want to work in an environment where “that will do” is never good enough, this role is for you.
Site Reliability Engineer (SRE) Central London (Hybrid 3 days per week in the office) £65,000 - £75,000 per annum + Excellent Benefits We're working with an innovative software company that's scaling its platform to support rapid customer growth and product expansion. They're looking for a Site Reliability Engineer (SRE) to join their platform team and … performance into the software lifecycle. Managing and evolving CI/CD pipelines to ensure smooth deployments and rollbacks. Contributing to incident response, post-mortems, and reliability improvements. Championing SRE principles such as error budgets, SLIs/SLOs, and automation-first thinking. What We're Looking For Strong experience running cloud infrastructure (AWS preferred) in production. Proven background in Kubernetes operations … engineering culture. Influence how reliability and performance are engineered at scale. Work with talented developers and DevOps engineers in a collaborative environment. AWS | Site Reliability | SRE | Cloud | Kubernetes | Terraform | CI/CD | Observability | Python | Go | Automation Click APPLY NOW to be considered for this position! Follow ReVybe IT Recruitment to stay up to date with the …
Site Reliability Engineer Central London (3 days a week in the office) £65,000 - £75,000 per annum + Bonus + Generous Benefits Package We are working with an exciting technology company that is looking to bring in a Site Reliability Engineer to help scale their cloud infrastructure and DevOps capability. They've built a high-performing … engineering team and are now investing further into the platform side of things as demand grows. Think modern, cloud-native architecture, and a real emphasis on automation, scalability, and developer enablement. You'll have the autonomy to make technical decisions and help shape how platform engineering is done as the team continues to scale. Tech stack AWS (Core services … days a week in the office) £65,000 - £75,000 per annum + Bonus + Generous Benefits Package Click APPLY NOW to be considered for this position! AWS, SRE, Cloud, Kubernetes, EKS, Terraform, CI/CD, Automation etc.
Join our team as a MongoDB Site Reliability Engineer, where you'll be at the forefront of designing and maintaining robust, high-performance systems that power critical financial services. In this dynamic and fast-paced environment, your role will be essential to ensuring our infrastructure remains resilient, secure, and scalable. You'll work on automating operations, enhancing system … If you're motivated by solving multi-layered problems and building systems that perform reliably amid shifting priorities, we encourage you to apply. To be successful as a MongoDB Site Reliability Engineer, you should have experience with: Working in Site Reliability Engineering, DevOps, and MongoDB administration in financial services. Using MongoDB features like replica sets, sharding …
Site Reliability Engineer (SRE) - eDV Cleared Location: London (On-site) Salary: Up to £75,000 + Clearance Bonus + Company Bonus Clearance: eDV (Enhanced Developed Vetting) required Are you an experienced Site Reliability Engineer (SRE) with active eDV Clearance? Do you want to work on mission-critical systems that directly support UK National Security? Join … brightest minds in the industry, ensuring the reliability, scalability and performance of complex, high-assurance systems that protect the nation. The Role: As a key member of the SRE team, you'll design, build and maintain reliable infrastructure and automation solutions to keep vital services running smoothly. You'll drive continuous improvement across monitoring, deployment, and incident response for … performance bonus. Opportunity to work on high-impact, national security projects. Career development within one of the UK's most respected secure consultancies. If you're an SRE with eDV clearance looking to make a real impact in a secure and rewarding environment, we'd love to hear from you. Apply now or reach out directly to Dominic …
Senior Site Reliability Engineer At UnlikelyAI, we are building the future of AI: one that is reliable, accurate and transparent. Our neurosymbolic technology harnesses the power of LLMs and generative AI, and combines it with classical symbolic technology to produce hallucination-resistant artificial intelligence for high-trust applications. To support our rapidly increasing commercial momentum, we're looking … for an experienced and pragmatic site reliability engineer to join our exceptional team. This role is ideal for someone who has successfully scaled systems from prototype to production and enjoys working in cross-functional teams to champion cloud-native engineering. We are looking for someone with the experience and expertise to define, and own, our approach to building … for reliability and security as first-class citizens. This is a strategically important role for our technology team, as we rapidly approach entering full production in multiple projects. You'll work on a range of customer-facing and internal infrastructure projects, applying your engineering skills to solve complex reliability and scalability challenges. Your ability to build robust …
Role: Site Reliability Engineer Location: London or Manchester (2 days per week on-site + monthly team days rotating between locations) Duration: 6 months Rate: £675 per day (Inside IR35) Security Clearance: Must hold active SC clearance due to project timings Team Size: Core team of 10 within a wider programme of 60+ and expanding Experience Level … 4+ years About the Role We're looking for an experienced Site Reliability Engineer to support a major central government programme, ensuring the reliability, scalability, and performance of critical digital services. You'll help shape modern cloud infrastructure and enable engineering teams to deliver secure and resilient platforms. Essential Requirements Technical Skills AWS Services: Strong proficiency … administration skills Leadership & Approach Technical Direction: Capable of providing technical leadership and mentoring without direct line management responsibilities System Design: Skilled in designing and implementing efficient, high-performing systems Reliability & Scalability: Focused on delivering robust, maintainable, and secure infrastructure solutions Collaboration: Comfortable working within cross-functional teams and engaging stakeholders across central government What You'll Be Doing Building …
do at CMC Markets, and staying true to that has been pivotal to our success. CMC Markets is seeking an experienced and proactive Site Reliability Engineering (SRE) Manager to establish and lead a new SRE function within the IT Production department. This is a key leadership role responsible for defining the SRE strategy, implementing best practices, and … resilience across the trading platforms Ensure new systems are aligned with best practices Drive improvements and alignment in observability and monitoring tools, improving MTTD and MTTR Produce analysis on SRE function performance Provide guidance, recommendations and hands-on support to teams, promoting SRE best practices Develop and maintain a roadmap for continuous improvement of support and observability Maintain personal/… role Read and comply with CMC policies and procedures as they relate to your employment Complete all mandatory compliance training KEY SKILLS AND EXPERIENCE 2 years' experience in an SRE function or similar in a hybrid cloud/on-prem environment 7 years' experience in IT operational roles working with highly reliable systems Experience in modern development methodologies and languages Proficiency …
Must hold UKIC DV Clearance Are you passionate about reliability, automation, and supporting mission-critical systems? Join this global defence organisation as a Site Reliability Engineer (SRE) and help shape the future of one of the UK's most vital national security platforms. You'll be joining a growing SRE team at the heart of the customer … s mission, focused on ensuring performance, availability, and scalability, while driving continuous improvement and innovation. About the Role As an SRE, you'll combine your operational expertise with software engineering skills to minimise manual effort and drive automation across complex systems. This role is perfect for someone who thrives on solving hard problems, automating the mundane, and building intelligent … overtime. Proactively enhance system availability, performance, and resilience. Develop tools and solutions to automate repetitive tasks and reduce operational toil. Collaborate with development teams to embed best practices and SRE principles. Deploy and manage monitoring systems to provide intelligent observability. Engage with the wider DevOps/SRE community within the organisation. Ideal Skills & Experience We're more interested in your …
Job Summary This role is to design, build, and scale enterprise cloud platforms with a strong focus on automation, reliability, and developer experience. As part of the Cloud Infrastructure & DevOps team, you will build multi-cloud infrastructure that powers hundreds of production services, including critical Salesforce DevOps pipelines. You’ll partner closely with development, security, and operations teams to … Drive infrastructure compliance, DevSecOps, and policy-as-code practices. What we expect of you Minimum 5 years of experience in Platform Engineering, Site Reliability Engineering (SRE), or DevOps roles supporting cloud-native enterprise environments Proficient in Microsoft Azure and AWS platforms with hands-on experience in Kubernetes (AKS/EKS), Helm charts, and service mesh technologies … or HashiCorp Terraform Associate are advantageous Strong interpersonal skills including clear communication, collaboration across teams, adaptability in fast-paced environments, and a proactive mindset with a focus on reliability, performance, and developer enablement …
Site Reliability Engineer (Lead Level) | London | Up to £600 Inside IR35 | Hybrid (2 Days Onsite) | 6 months I’m partnered with a major media and tech company looking for a Lead Site Reliability Engineer to support and scale their Video on Demand (VOD) infrastructure. You’ll work across modern tech stacks including AWS, GCP, Cassandra, and … performance systems used by millions. What you’ll do Lead project delivery while supporting day-to-day operations and incident management Build and manage infrastructure as code to improve reliability, scalability, and performance Design and implement new architectures and best practices for infrastructure and delivery Drive automation across monitoring, CI/CD, and deployment pipelines Mentor engineers and guide … troubleshooting in live environments 💰 Up to £600 per day (Inside IR35) 📍 London | Hybrid (2 days onsite) 📅 6-month contract, with strong potential to extend If you’re an experienced SRE who enjoys taking ownership, leading technical delivery, and working on large-scale content platforms, I’d love to chat. 👉 Apply or message me if you’d like to hear more.
London (City of London), South East England, United Kingdom
Arrows
Site Reliability Engineer (Lead Level) | London | Up to £600 Inside IR35 | Hybrid (2 Days Onsite) | 6 months I’m partnered with a major media and tech company looking for a Lead Site Reliability Engineer to support and scale their Video on Demand (VOD) infrastructure. You’ll work across modern tech stacks including AWS, GCP, Cassandra, and … performance systems used by millions. What you’ll do Lead project delivery while supporting day-to-day operations and incident management Build and manage infrastructure as code to improve reliability, scalability, and performance Design and implement new architectures and best practices for infrastructure and delivery Drive automation across monitoring, CI/CD, and deployment pipelines Mentor engineers and guide … troubleshooting in live environments 💰 Up to £600 per day (Inside IR35) 📍 London | Hybrid (2 days onsite) 📅 6-month contract, with strong potential to extend If you’re an experienced SRE who enjoys taking ownership, leading technical delivery, and working on large-scale content platforms, I’d love to chat. 👉 Apply or message me if you’d like to hear more.
enabling innovation and agility across BCG Core, BCG X, and CT worldwide. This role is accountable for embedding security within DevSecOps practices, applying Site Reliability Engineering (SRE) principles across all security services, and aligning with privacy, compliance, and business leaders to maintain trust and regulatory compliance. Key Responsibilities: Strategic Leadership & Transformation: Define and execute a unified security … remote access, zero-trust networking, and protection of sensitive data in AI/ML workloads. Leverage automation frameworks and IaC to improve scalability and reduce manual intervention. Operational Security, SRE & Assurance: Ensure security platforms are resilient, continuously monitored, and designed for 24x7 support and incident response readiness. Embed security telemetry and observability to enable proactive threat detection and automated response. … Apply SRE principles to improve reliability, performance, and maintainability of security services. Define service level objectives (SLOs) and key performance indicators (KPIs) for all security services. Compliance, Governance & Risk Management: Ensure alignment with global compliance requirements such as ISO 27001, NIST, SOC 2, GDPR, and others. Partner with governance, legal, and ISRM teams to implement enforceable policies and standards …
architectures, and serverless computing. Hands-On Implementation: Lead the hands-on deployment, configuration, and management of secure, high-performance cloud environments (e.g., AWS, Azure, GCP) for critical workloads. DevOps & SRE Leadership: Instil best practices in DevOps, GitOps, and Site Reliability Engineering (SRE) to ensure system reliability, scalability, and performance. Security Integration: Work hand-in-glove with … Develop our cloud service offerings, create best practices, and eventually build and lead a team of cloud engineers. Who You Are: You have 8+ years of experience in cloud engineering and architecture, with at least 2 years in a leadership or team lead position. You are an expert in containerisation and orchestration, with profound, hands-on experience with Kubernetes …
Sutton, Surrey, United Kingdom Hybrid / WFH Options
Square One Resources
Job Title: Senior Site Reliability Engineer Location: Hybrid - Southampton or Sutton - 2 days a week in the office Salary/Rate: Up to £560 inside IR35 Start Date: November 2025 Job Type: 6-month contract Company Introduction We are looking for a Senior Site Reliability Engineer with AWS, Linux, Docker, IaC, and Monitoring (Grafana) experience to … messaging Services (SNF) & Kafka Advanced knowledge of at least one IaC tool such as Terraform, CloudFormation, Ansible Strong experience with monitoring tools such as Grafana, Elastic, StatusCake, PagerDuty Security Engineering pipeline experience Containerisation (Docker) on Linux Good understanding of at least one high-level programming language, i.e. Python In-depth knowledge of one or more database systems such as … applications Diagnose complex system performance problems Work closely within cross-functional teams to refine system monitoring and reporting Identify and implement best practices by collaborating with other SREs and SRE teams Understand Infrastructure as a Service (IaaS) and its implementation methods. Identify and implement architectural best practices by collaborating with multiple teams and software engineers Diagnose complex system problems using …
infrastructure using Infrastructure-as-Code (IaC) tools such as Terraform. Develop, deliver, and support advanced research computing services and applications. Apply Site Reliability Engineering (SRE) principles to ensure high availability, performance, and reliability across HPC environments. Troubleshoot and resolve complex technical challenges affecting both the platform and user workloads. Essential Skills and Experience 10+ … years of hands-on experience designing, operating, or engineering large-scale computing environments (HPC, HTC, or Big Compute). Proven ability to drive innovation and integrate emerging technologies into HPC solutions. Administration experience with cluster and workload management software (e.g. Slurm, LSF, Grid Engine). Strong knowledge of Linux system administration, TCP/IP networking, and storage systems.
London, South East, England, United Kingdom Hybrid / WFH Options
Michael Page Technology
issues. Ensure adherence to SLAs and help improve operational support efficiency. Participate in on-call rotations to provide 24/7 platform coverage. Continuously optimize monitoring, alerting, and platform reliability processes. Demonstrate a "can do" attitude, with flexibility to work occasional overtime when incidents extend beyond normal working hours. Profile Required … Skills & Qualifications Bachelor's degree in Computer Science, Information Technology, or related field (or equivalent work experience). Proven experience in technical support, site reliability engineering (SRE), or platform operations. Strong knowledge of Linux/Unix and Windows environments. Familiarity with cloud platforms (AWS, Azure, GCP). Hands-on experience with CI/CD tools (Jenkins, GitHub …
to join a leading technology and innovation consultancy, supporting UK public sector clients in their cloud transformation journeys. This role sits within a highly skilled team dedicated to designing, engineering, and optimising Google Cloud Platform (GCP) solutions that power large-scale, mission-critical systems. The successful candidate will play a key role in shaping cloud strategy, driving architectural excellence … technical architecture and delivery of Google Cloud solutions for public sector organisations. Design, deploy, and operate secure, scalable, and high-performing GCP environments. Provide technical leadership and mentorship to engineering teams to ensure successful project delivery. Apply deep knowledge of Google Cloud architecture and engineering to deliver enterprise-grade solutions that meet both functional and non-functional requirements. … networking (TCP/IP, subnets, load balancing, DNS). A track record of leading small technical teams, providing guidance and mentorship. Experience in site reliability engineering (SRE) or IT operations, including incident response and troubleshooting. Strong problem-solving and innovation skills, with evidence of delivering technical improvements or new ways of working.