team at the heart of the global economy! The Department for Business and Trade ('DBT') and Inspire People are partnering to bring you an exciting opportunity for Senior Site Reliability Engineers to join a team that ensures DBT's digital services work as users expect, working with development teams giving them the tools for their job, including … service-level objectives. - Participate in an on-call rota (with allowance), helping to keep DBT services resilient and reliable. - Mentor junior engineers and contribute to the growth of the SRE function. Technologies you will work with include AWS, Azure, Terraform/CloudFormation, Docker, ECS, ECR, ElasticSearch, Python/Django, PostgreSQL (RDS), Redis, and more. Essential Criteria - Cloud experience with AWS … application will be assessed against these requirements before being progressed to DBT. Shortlisted candidates will then be invited to an interview and technical exercise. If you are a DevOps Engineer, SRE, or Systems Administrator looking to make a real impact across government digital services, apply today or contact Keesha Paulsen at Inspire People in confidence for more information.
City of London, London, United Kingdom Hybrid / WFH Options
Experis
understanding of Linux, Windows, and storage systems. · Experience with monitoring and observability tools (Datadog, CloudWatch, Azure Monitor, Coralogix). · Familiarity with DevOps and Site Reliability Engineering (SRE) principles. · Knowledge of networking and cloud security best practices. · Experience with Databricks is a plus. · Excellent problem-solving and stakeholder communication skills. Benefits Include · Annual performance-based bonus · Comprehensive pension …
Slough, South East England, United Kingdom Hybrid / WFH Options
Experis
understanding of Linux, Windows, and storage systems. · Experience with monitoring and observability tools (Datadog, CloudWatch, Azure Monitor, Coralogix). · Familiarity with DevOps and Site Reliability Engineering (SRE) principles. · Knowledge of networking and cloud security best practices. · Experience with Databricks is a plus. · Excellent problem-solving and stakeholder communication skills. Benefits Include · Annual performance-based bonus · Comprehensive pension …
London, South East England, United Kingdom Hybrid / WFH Options
Experis
understanding of Linux, Windows, and storage systems. · Experience with monitoring and observability tools (Datadog, CloudWatch, Azure Monitor, Coralogix). · Familiarity with DevOps and Site Reliability Engineering (SRE) principles. · Knowledge of networking and cloud security best practices. · Experience with Databricks is a plus. · Excellent problem-solving and stakeholder communication skills. Benefits Include · Annual performance-based bonus · Comprehensive pension …
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Experis
understanding of Linux, Windows, and storage systems. · Experience with monitoring and observability tools (Datadog, CloudWatch, Azure Monitor, Coralogix). · Familiarity with DevOps and Site Reliability Engineering (SRE) principles. · Knowledge of networking and cloud security best practices. · Experience with Databricks is a plus. · Excellent problem-solving and stakeholder communication skills. Benefits Include · Annual performance-based bonus · Comprehensive pension …
looking for a DevOps Engineer to join our growing team. Day to Day You'll Be: Infrastructure & Operations: Participate in the design, implementation, and maintenance of our infrastructure, ensuring reliability, scalability, and security. Support, monitor, and enhance the live infrastructure and platform solutions, ensuring high availability and performance. Help plan and execute the integration of our current infrastructure into … analysis. Documentation & Best Practices: Ensure comprehensive documentation of infrastructure, systems, and processes to support onboarding, troubleshooting, and scalability. Promote and implement DevOps and Site Reliability Engineering (SRE) best practices across the organisation. Essential Skills & Experience: Technical Expertise: Strong Linux systems administration experience, including firewalls and hardening. Expertise in Docker and container orchestration. Proficiency with Infrastructure as Code … Familiarity with GCP services such as Compute Engine, Kubernetes Engine (GKE), Cloud Storage, BigQuery, and IAM. Familiarity with configuration management and IT automation tools. Strong understanding of DevOps and SRE principles. Soft Skills: Self-motivated, highly organised, and capable of driving initiatives from concept to delivery. Excellent communication and stakeholder management skills. Desirable Skills: Experience with serverless infrastructure (e.g., AWS …
London, South East, England, United Kingdom Hybrid / WFH Options
Michael Page Technology
issues. Ensure adherence to SLAs and help improve operational support efficiency. Participate in on-call rotations to provide 24/7 platform coverage. Continuously optimize monitoring, alerting, and platform reliability processes. Demonstrate a "can do" attitude, with flexibility to work occasional overtime when incidents extend beyond normal working hours. Profile Required … Skills & Qualifications Bachelor's degree in Computer Science, Information Technology, or related field (or equivalent work experience). Proven experience in technical support, site reliability engineering (SRE), or platform operations. Strong knowledge of Linux/Unix and Windows environments. Familiarity with cloud platforms (AWS, Azure, GCP). Hands-on experience with CI/CD tools (Jenkins, GitHub …
Potters Bar, Hertfordshire, South East, United Kingdom
Searchstone Ltd
operations and reduce technical debt. Implement governance, policies, and security controls to maintain robust and compliant environments. Act as a player-coach, delivering technical solutions while mentoring and guiding engineering teams. Collaborate with development, security, and operations teams to deliver resilient cloud services. What We're Looking For Extensive experience with Microsoft Azure (PaaS, IaaS, networking, storage, identity, security, monitoring … Bicep, or ARM templates. Proficiency in Azure DevOps, CI/CD pipelines, and automation frameworks. Solid understanding of cloud security, governance, and compliance. Ability to design for reliability, scalability, and … observability. Excellent communication and leadership skills, with a proven ability to influence technical direction. Nice to Have Familiarity with multi-region and hybrid cloud architectures. Knowledge of SRE (Site Reliability Engineering) practices. Microsoft Certified: Azure Solutions Architect/Azure DevOps Engineer or equivalent. Why This Role? This is a rare opportunity to shape the …
City of London, London, United Kingdom Hybrid / WFH Options
Anson Mccade
Professional-level Google Cloud certification required. Proven expertise in Google Cloud Platform services (Compute Engine, App Engine, GKE, Cloud Storage, IAM, VPC, etc.). Strong experience in architecting and engineering cloud-based solutions that meet both functional and non-functional requirements. Solid understanding of cloud networking and security (e.g. firewalls, encryption, identity management). Experience implementing foundational cloud platforms … communication and stakeholder management skills across technical and non-technical audiences. Desirable Experience with multi-cloud environments (AWS, Azure, hybrid). Familiarity with site reliability engineering (SRE) principles and production systems support. Experience driving innovation, technical transformation, and improvements in ways of working. What's on Offer Competitive base salary between £75,000 and £90,000.
London, South East, England, United Kingdom Hybrid / WFH Options
Holland & Barrett International Limited
want to hear from you! Key Responsibilities: Security Strategy: Help define and execute the Holland & Barrett cloud security strategy, partnering with platform and Site Reliability Engineering (SRE) teams to build robust infrastructure that supports our business. Perimeter Security: Establish platform perimeter security by implementing controls at ingress and egress points, including creating and maintaining an edge network …
Modern Service Management expertise: Possess a broad and deep understanding of IT Service Management concepts, moving beyond traditional ITIL and embracing principles of Site Reliability Engineering (SRE) and modern change management. Stakeholder Engagement and Influence: An expert-level ability to engage, influence, and build relationships with senior stakeholders, including Director Generals, senior directors, external government departments and vendors. … The capacity to drive positive change in how live services are managed within a large, transforming organisation, including a proven ability to promote positive behaviours, foster a culture of reliability, and further adopt agile principles to focus on outcomes and deliver value frequently. This role is open to public sector and private sector candidates and would suit someone with experience …
Sheffield, South Yorkshire, England, United Kingdom
McGregor Boyall
Global Head of SRE, Platform, Cloud, Recovery Lead, Strategy, Technical Leadership, Roadmap A leading provider of Financial Services is seeking a global head of SRE who can help shape, navigate and implement the strategy for this key business area. The role: We are seeking a senior technology leader to take on the dual role of Senior Recovery Lead and Global … will lead a global team of technical experts who act as technical escalation partners during major incidents, helping reduce time to recover (TTR) through deep technical engagement, coordination, and engineering-driven solutions. Skills required: Proven background in Technology, with experience in Site Reliability Engineering, Infrastructure, DevOps, or Technical Operations. Demonstrated experience leading global technical teams in … Drives cultural and engineering change to improve stability and accountability. Cross-Functional Collaboration - Adept at aligning goals and actions across engineering, operations, and risk domains Global Head of SRE, Platform, Cloud, Recovery Lead, Strategy, Technical Leadership, Roadmap McGregor Boyall is an equal opportunity employer and does not discriminate on any grounds.
Site Reliability Engineer £65,000-£95,000 DOE Hybrid (Bristol-based, occasional site visits) Clearance: Must be eligible for DV Clearance Founded in 2019 by engineers solving complex cross-domain problems for government organisations, TwinStream delivers technical excellence and exceptional service to high-profile clients.
City of London, London, United Kingdom Hybrid / WFH Options
Digital Realty (UK) Limited
Position Title: Site Reliability Engineer, Interconnection Service and Network Delivery Location: Hybrid: Austin, Dallas, Boston, Ashburn, Atlanta, London, or Amsterdam Your role In this role, you will be responsible for deploying and maintaining all Digital Realty interconnection fabric network infrastructure. The ideal candidate can demonstrate a unique blend of network engineering, network operations, and software understanding through … the application of engineering principles. You will focus on delivering operational discipline and embrace key operational principles including automation, agile development, and scripting. What you'll do You will be part of the global Fabric Engineering organization and work in tandem with other teams to build and maintain a global network infrastructure. Ideal candidates for this role will bring … an understanding of carrier-class network infrastructure as well as experience working in a fast-paced development environment. What you'll need 5+ years of operations and engineering experience Bachelor's degree in Computer Science (or equivalent) preferred Strong experience with automation tools (Ansible, Terraform, etc.) Strong experience working with Linux systems and tools Experience with Python (or equivalent high-level language …
edge software, platforms, and infrastructure. The Role Join us as a Site Reliability Engineer and help us build the future of data sovereignty! We're seeking an SRE passionate about creating high-performance, scalable, and reliable services for our production infrastructure. You'll have a direct impact, improving existing systems and developing innovative solutions to complex challenges. Our … small, collaborative engineering teams own the full lifecycle of their services, from development to production operations. We champion automation and empower you to choose the best tools for the job. If you thrive in a fast-paced environment where you can make a real difference, we want to hear from you! Required skills/expertise: Develop and implement a … and applications to support large concurrent user bases and sustained daily usage. This will involve performance tuning, capacity planning, and optimization of resource utilization. Collaborate closely with the product engineering team to influence the design and implementation of new products and features, ensuring they meet our reliability and scalability standards from the outset. Preferred Qualifications Bachelor's degree …
City, Manchester, United Kingdom Hybrid / WFH Options
Experis
role in building and running secure, scalable platforms that support mission-critical services. We welcome engineers with different tech stack experience - what matters most is your passion for automation, reliability, and problem solving in a collaborative environment. What you'll do Design and implement CI/CD pipelines and automated deployments. Build and manage cloud-native and containerised environments. … Apply Infrastructure-as-Code, monitoring and Site Reliability Engineering principles to ensure resilience and performance. Collaborate with developers, testers, and client stakeholders to deliver end-to-end solutions. Share knowledge, contribute to a learning culture, and help shape the direction of a growing practice. What we're looking for Hands-on experience in DevOps engineering, regardless …
Position Title: Site Reliability Engineer, Interconnection Service and Network Delivery Location: Hybrid: Austin, Dallas, Boston, Ashburn, Atlanta, London, or Amsterdam Your role In this role, you will be responsible for deploying and maintaining all Digital Realty interconnection fabric network infrastructure.
Cheltenham, Gloucestershire, South West, United Kingdom
itecopeople
DV-Cleared Application Support Engineer - Contract (Outside IR35) The Role We are seeking a DV-cleared Application Support Engineer to join our client's on-site team in the Cheltenham area. You will help maintain and support a managed cross-domain service, leveraging a broad range of technologies. The role focuses on site reliability engineering practices … to ensure service resilience, continuous improvement, and operational excellence. Location: Cheltenham, Gloucestershire Area (on-site minimum 4 days per week) Rate: £500 - £600 per day Clearance: Active DV clearance required Start: ASAP Duration: 6 months Key Responsibilities Build & Deploy Manage and maintain CI/CD pipelines using Java, Maven, and NPM. Configure and execute automated test suites with Maven … and conduct root cause analysis. Implement proactive changes to improve service stability. Maintenance Automate tasks to reduce manual workload. Conduct OS health checks, patching, and database housekeeping. Support multi-site data centre operations. Key Skills Experience in a managed service environment with strong service delivery focus. Hands-on with Infrastructure as Code (Terraform, Ansible). Application development experience (Java …
and modernising our digital estate to build a market-leading digital offering with customer experience at its heart. This is an exciting and key role, partnering with business aligned engineering and product teams, to ensure a collaborative team culture is at the heart of what we do. To be successful in this role you should have: Strong hands-on … running of ForgeRock COTS based IAM solutions (PingGateway, PingAM, PingIDM, PingDS), including designing and implementing cloud-based, scalable and resilient IAM solutions for large corporate organisations. IAM engineering experience across authentication, authorisation, single sign-on, multi-factor authentication, identity lifecycle management, OAuth2.0, OpenID Connect, SAML and policy management. Expertise with JavaScript, Java, Python, and must be comfortable … with API and microservices development. Strong working knowledge of Site Reliability Engineering principles. Experience with Cloud computing (AWS is essential, Azure is a plus). Some other highly desirable skills include: Experience in DevSecOps - knowledge of Product Operating Model. Knowledge of Infrastructure as Code tooling (Chef is essential, Ansible is a plus), containerization, knowledge of authentication and …
Security Clearance Our client is a global professional services and technology consultancy helping public sector organisations modernise, transform, and scale using cloud and emerging technologies. Their teams combine strategy, engineering, and innovation to solve complex challenges at national scale - from digital infrastructure to secure government systems. They're now hiring a Google Cloud Architect (Associate Manager level) to join … landing zones, and network architectures. Mentor and guide small technical teams to ensure high-quality delivery. Translate complex technical requirements into scalable, enterprise-grade solutions. Work with stakeholders across engineering, operations, and client teams to shape cloud strategy and adoption. What you'll bring: At least one Google Cloud Professional certification. Strong experience across core GCP services - Compute Engine … of Infrastructure-as-Code (Terraform), automation (CI/CD), and scripting (Python, Bash). Knowledge of cloud security controls, IAM, encryption, and hybrid networking. Bonus points for: Experience in site reliability engineering or production operations. Previous public sector project delivery. Leadership of small engineering teams in a project or consulting context. What's in it for …
developers from all backgrounds - front end, back end or full stack - who want to shape the future of Defence and Security technology. What matters most is your passion for engineering, your willingness to learn, and your drive to progress in a fast-paced, mission-focused environment. What you'll do Work across a wide range of projects, whether that … Spring Boot, NodeJS, Python, FastAPI, Oracle, PostgreSQL, MongoDB and beyond. Collaborate in a DevSecOps environment, leveraging Atlassian, Jenkins, GitLab, OWASP and AWS toolsets. Apply automation, Infrastructure-as-Code, and Site Reliability Engineering principles to ensure scalability and resilience. Join cross-functional teams including developers, UX specialists, integration experts and end users to solve problems end-to-end. … tools and techniques. What we're looking for Experience in software development, in any stack or language - whether your expertise is JavaScript, TypeScript, Java, Python, C#, or others. Solid engineering fundamentals, with an interest in developing your skills further. Experience working in collaborative, agile teams (Scrum or Kanban). Curiosity, initiative and a team-first mindset - you're as …
as AWS Lambda, Spring Boot, NodeJS, Python FastAPI, Oracle, PostgreSQL and MongoDB Contributing to DevSecOps delivery pipelines, using tooling such as Atlassian, Jenkins, GitLab, OWASP and AWS services Applying Site Reliability Engineering principles to ensure solutions are resilient, reliable and cost-effective Supporting clients and end users in making technical product decisions by clearly explaining trade-offs … and recommended approaches Participating in a community of engineers who share knowledge, run workshops and contribute to the wider engineering culture Looking beyond day-to-day responsibilities to identify small details, opportunities for improvement and added value for clients What we're looking for: UK Developed Vetting (DV) clearance is essential Hands-on experience in software development and a … strong interest in writing quality code Solid understanding of back-end development using one or more of the following: Java, Python, TypeScript or JavaScript Familiarity with good engineering patterns and practices, and the ability to articulate them clearly Experience working in agile environments (Scrum, Kanban or similar) Enthusiastic about learning, collaborating with diverse teams and solving problems creatively Confident …
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
Fruition Group
Software Engineer/SRE JavaScript/TypeScript, Node.js, AWS, Observability Leeds/Hybrid, c. 2x per week Salary up to £65,000 We're looking for a Software Engineer with strong AWS and Observability experience to join a growing engineering team in Leeds. This is a hybrid role, giving you the flexibility to split your time between home and … a modern city-centre office. You'll work across both engineering and site reliability, helping to build and scale systems that are reliable, secure, and observable. You'll be a key part of improving platform performance and automation, while collaborating with developers, product teams, and operations. What you'll be doing: Building and maintaining scalable cloud infrastructure … in AWS Implementing and improving observability tools (monitoring, logging, tracing) Automating deployments and improving CI/CD pipelines Driving reliability, availability and performance across systems Working with developers and SREs to solve complex problems What we're looking for: Strong experience with AWS (EC2, ECS, Lambda, RDS etc.) Good knowledge of observability tools (Grafana, Prometheus, OpenTelemetry, Datadog, or similar …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Alcidion Corp
in the EPR (or other complex systems) space Substantial experience in managing complex customer accounts at both a strategic and tactical level Experience supporting cloud hosted solutions alongside a Site Reliability Engineering/Managed Services team Proven experience working to processes aligned to ISO9001 (quality management), ISO27001 (information security) and ISO31000 (risk management) How to apply To …