reliability of all cloud systems while keeping levels of manual work low. SREs are expected to be experienced in software engineering principles, operational discipline, and automation. The SRE team works on a fully remote basis, in conjunction with the company's US and Australian teams. The company is a market leader in student community management software … ensure high availability and performance. Collaborate with product engineering teams to design/build fit-for-purpose and observable software. Required Skills and Experience: Proven experience in an SRE/DevOps/Platform Engineering role, having previously worked in a Software Engineering role in .NET and C#, Java, or a similar OO development language. Proficiency in … and this job is part of a large program of change and improvement in their Cloud SaaS products over the coming years. If you are looking for an interesting SRE role with a forward-thinking global organisation, then this would be a tremendous career opportunity to consider. Please apply with your CV to find out more.
occasional travel to Scotland Employment Type: 6-month Contract Rate: £550 per day, Outside of IR35 Role Overview Morgan Hunt are seeking an experienced Site Reliability Engineer (SRE)/Unix Infrastructure Engineer to support the deployment, migration, and optimisation of critical infrastructure services. The role involves ensuring high availability, disaster recovery readiness, and automation-driven improvements across RHEL
and capable of adapting to changing customer needs. This role offers full-time working from our Central Stockholm office. The Opportunity As a Senior Site Reliability Engineer (SRE), you'll be joining a team whose mission is to ensure the availability, performance, security and reliability of our platform and core services, ensuring that they meet the needs … be responsible for visibility and monitoring of those systems, for building tooling and automation to reduce toil and for responding to incidents as part of our 24/7 SRE on-call team. Reliability Engineering at Board Intelligence The SRE team: Strives to provide the highest standards of Availability, Scalability, Performance and Security for our Software as a … work Proactively monitors our platform and responds to incidents as part of a 24/7 rota Key responsibilities of the role We're looking for a great Senior SRE to be a hands-on individual contributor to key technical projects and to help us build a first-class SRE function. This role will involve: Project work Hands-on work
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
for that journey! We're building a robust internal data platform to empower engineers and enable innovation across the business. To support that mission, we're growing our Data Engineering Platform team and investing deeply in modern, reliable infrastructure. We're seeking a DevOps engineer with hands-on expertise in containerisation, orchestration, cloud platforms, continuous-delivery pipelines, and cloud … cloud deployments (AWS-first) using Terraform and platform tooling Improve security posture across IAM, secrets, and networking Help the team ship faster and safer by mentoring on DevOps and SRE practices We're solving for reliability, compliance, performance, and speed - at once. You'll be key to making it work. Required Skills: Knowledge of one or more programming languages … highly leveraged platform, enabling hundreds of engineers to use critical data systems with confidence. You'll have ownership, impact, and a seat at the table as we define how SRE and platform thinking shape our next-generation data infrastructure. If you're looking to scale not just systems but the capabilities of the engineers around you, this is your team.
the next level? We have a brand-new opportunity for a bright, driven, customer-focussed professional to join our Hybrid Cloud 'Delivery' team, and work alongside our Enterprise Data Engineering consultants to accelerate and drive data engineering opportunities. The Advisory and Professional Services (A&PS) delivery team within HPE Pointnext Services is responsible for bringing thought-leadership, industry … implementation of scalable clustered Big Data solutions, with a specific focus on automated dynamic scaling, self-healing systems. Participating in the full lifecycle of data solution development, from requirements engineering through to continuous optimisation engineering and all the typical activities in between Providing technical thought-leadership and advisory on technologies and processes at the core of the data … Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Monitoring utilising products such as: Prometheus, Grafana, ELK, Filebeat etc. Observability - SRE Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem Edge technologies e.g. NGINX, HAProxy etc. Excellent knowledge of YAML or similar languages The following Technical
Bristol, Avon, England, United Kingdom Hybrid / WFH Options
Robert Walters
and operation of cloud infrastructure and applications on Google Cloud Platform. You will work collaboratively with engineering and infrastructure teams to implement site reliability engineering (SRE) principles, focusing on system reliability, observability, automation, and operational excellence. This role follows a hybrid working model, requiring attendance at the Bristol office for at least two days per … week or 40% of the working time. Key Responsibilities Promote and embed SRE best practices within engineering teams and microservices environments Partner with infrastructure and DevOps engineers to improve system resilience and performance Troubleshoot complex incidents and implement long-term solutions through code and automation Develop and improve automation pipelines to reduce manual operations and enhance system efficiency Contribute … to multiple strategic digital initiatives and collaborate across engineering domains Essential Skills and Experience Background in software engineering or telemetry, with current focus on SRE Extensive experience with public cloud platforms, particularly Google Cloud (or AWS/Azure) Proven ability to manage Kubernetes clusters in production environments Competence in scripting and development using languages such as Python, Java
operating our infrastructure, middleware, and CI/CD systems to ensure our teams have access to the best tools available. We combine problem-solving skills with software and systems engineering to take a proactive approach in building fault-tolerant and secure systems, improving observability and zealously automating away toil. In this role you will: Use your site reliability … internal services. Improving their performance, availability, scalability, latency and efficiency. Drive technical excellence in everything we do, fostering a culture of data-driven reliability, monitoring and automation, following SRE best practices. Work alongside development teams to design and build scalable and highly available services, while establishing effective build frameworks for continuous deployment and self-service automation. Work on incident
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
BAE Systems (New)
a mix of disciplines, which allows us to come up with cutting edge, high quality solutions. What background we are looking for: Experience working in a similar DevOps/SRE/Infrastructure role An appreciation of Infrastructure as Code, and CI/CD tooling Scripting abilities with languages such as Shell, Bash, or Python etc A working knowledge of Linux … Digital Intelligence We are embracing Hybrid Working. This means you and your colleagues may be working in different locations, such as from home, another BAE Systems office or client site, some or all of the time, and work might be going on at different times of the day. By embracing technology, we can interact, collaborate and create together, even
live and transferrable DV Clearance Are you passionate about reliability, automation, and supporting mission-critical systems? Join this global defence organisation as a Site Reliability Engineer (SRE) and help shape the future of one of the UK's most vital national security platforms. You'll be joining a growing SRE team at the heart of the customer … s mission, focused on ensuring performance, availability, and scalability, while driving continuous improvement and innovation. About the Role As an SRE, you'll combine your operational expertise with software engineering skills to minimise manual effort and drive automation across complex systems. This role is perfect for someone who thrives on solving hard problems, automating the mundane, and building intelligent … overtime. Proactively enhance system availability, performance, and resilience. Develop tools and solutions to automate repetitive tasks and reduce operational toil. Collaborate with development teams to embed best practices and SRE principles. Deploy and manage monitoring systems to provide intelligent observability. Engage with the wider DevOps/SRE community within the organisation. Ideal Skills & Experience We're more interested in your
Gloucester, Gloucestershire, South West Hybrid / WFH Options
CGI
Site Reliability Engineer (DV Security Clearance) Position Description CGI was recognised in the Sunday Times Best Places to Work List 2025 and has been named one of the 'World's Best Employers' by Forbes magazine. We offer a competitive salary, excellent pension, private healthcare, plus a share scheme (3.5% + 3.5% matching) which makes you a CGI partner … ELK stack, Terraform, Grafana, SonarQube, OpenShift, Linux Required qualifications to be successful in this role Proven experience in Site Reliability Engineering or a similar DevOps/SRE role supporting cloud-based applications. Strong scripting and automation skills using Bash, Python, or Go. Experience with CI/CD pipelines and tools such as Jenkins, GitLab CI, and Ansible. … on big data projects is highly advantageous. Qualifications: Degree in Computer Science, Engineering, or related technical field (or equivalent practical experience). Relevant certifications in AWS, DevOps, or SRE practices are a plus. Together, as owners, let's turn meaningful insights into action. Life at CGI is rooted in ownership, teamwork, respect and belonging. Here, you'll
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
IO Associates
Site Reliability Engineer Location: Bristol (Hybrid) Salary: Up to £100,000 per annum (Depending on Experience) Clearance: Applicants must be eligible for SC and/or DV clearance You will play a key role in maintaining and enhancing the performance, availability, and reliability of both cloud-based and on-premise systems. … This is a hands-on technical role, working across development, support, and infrastructure teams to ensure the services remain scalable and cost-efficient. Skills & Experience You'll Need Strong SRE/DevOps background Hands-on experience with configuration management tools (e.g., Ansible, Chef) Proficient with infrastructure-as-code tooling like Terraform Containerisation knowledge (Docker) and orchestration platforms (Kubernetes, OpenShift, Docker
European cloud revolution. We supercharge our customers to innovate in hyperscaler cloud, enabling seamless migration, advanced security, and data-driven success. Currently, we are looking for a Senior Azure Site Reliability Engineer to join our team in the UK. Your daily responsibilities: Architect, implement, and improve existing monitoring and alerting systems Proactively investigate and identify performance anomalies and
Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
The Investigo Group
re a hands-on cloud engineer with a passion for building scalable infrastructure and empowering those around you. You thrive in collaborative environments and enjoy mentoring others while ensuring reliability, scalability, and security across systems. You bring clarity, energy, and technical credibility to every conversation. About The Team: You'll join a collaborative and growing Cloud function embedded within … to a more modern, proactive, automation-first culture with strong leadership backing. About The Role: We're looking for a Senior Cloud Engineer to help us transform our cloud engineering capability from reactive, manual operations to a proactive, automation-first approach embedded across the entire SDLC. You'll collaborate on our infrastructure strategy, tooling decisions, and reliability posture … in delivering high-performing, resilient products What We're Looking For: Proven experience as a Senior Cloud Engineer in a modern Agile environment Deep understanding of infrastructure automation frameworks, SRE principles, and continuous delivery Hands-on skills with tools like Terraform, Ansible, Docker, Kubernetes Familiarity with cloud platforms (AWS, Azure, GCP) Programming/scripting in Python, Bash, or similar Strong
to undertake other relevant and appropriate duties as reasonably required. Travel should be expected as part of this role. Experience required: 2-3 years of experience in a DevOps, SRE, or related engineering role. Hands-on experience with at least one major cloud provider (OCI preferred). Proficiency in scripting languages (e.g., Go, Python) and Linux server scripting. Experience
are seeking a foundational member for the Cloud Infrastructure team at Writer. This role involves contributing to the development and implementation of our Site Reliability Engineering (SRE) program. The ideal candidate will ensure the reliability, scalability, performance, and security of Writer's critical systems, proactively guaranteeing that our high-ROI products reach customers seamlessly. Your responsibilities … ensure cost efficiency. Ensure the security and compliance of our systems, adhering to industry standards and regulations. Provide mentorship and technical guidance to junior engineers, fostering a culture of reliability and continuous improvement. Stay current with emerging technologies and industry trends to improve our site reliability practices. Is this you? Proven expertise in Site Reliability Engineering with at least 7 years of hands-on experience. Deep understanding of system architecture and infrastructure design for high availability and performance. Bachelor's degree in Computer Science, Engineering, or a related field. Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring. Experience with cloud platforms like AWS, Azure, or
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Twinstream Limited
Site Reliability Engineer Hybrid – Bristol (with occasional travel to other sites & possible 24/7 callout when on rota) | £80,000 – £110,000 DOE Join a Team Built on Technical Excellence At TwinStream, we're not just technologists; we're mission-driven engineers solving some of the UK government's most complex cross-domain challenges. Founded by engineers … site and remotely, we continue to grow, driven by demand for our high-trust, high-performance services. Now, we're looking for a Site Reliability Engineer (SRE) to join our fast-growing team. Why this Site Reliability Engineer role? Our SREs are the quiet heroes behind the scenes, ensuring that our mission-critical systems stay … across cloud and on-prem environments, shaping infrastructure, delivery pipelines, and monitoring systems while partnering closely with dev and support teams. This role is ideal for someone passionate about engineering excellence, automation, and reliability at scale. Why TwinStream? We believe the best engineering happens in environments that respect life outside of work. Here's what we offer
Join us as a Senior Site Reliability Engineer - Oracle where you'll spearhead the evolution of our digital landscape, driving innovation and excellence. This role will include: applying software engineering techniques, automation, and best practices in incident response, ensuring the reliability, availability, and scalability of the systems, platforms, and technology through them. To be successful as … a Senior Site Reliability Engineer - Oracle you should have experience with: Oracle Enterprise Manager (OEM), Oracle Internet Directory (OID), Oracle database Performance Tuning - SME Deep understanding of LDAP protocols and directory services. SQL Optimization Strong skills in scripting languages (e.g., Python, Bash) to automate repetitive tasks and knowledge of configuration management tools (e.g., Ansible, Puppet, Chef). Expertise … strategic thinking and digital and technology, as well as job-specific technical skills This role will be based in our Knutsford campus. Purpose of the role To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
organisations TwinStream was formed to consolidate their collective expertise and experience into one business, providing technical excellence and exceptional service to their clients. We have teams working both on-site with clients and remotely from home. Location: Hybrid working in Bristol (occasional visits to other sites) with possible 24/7 call out when on rota Security Clearance: Eligible … demand for these services continues to grow in both scope and scale. We are seeking an experienced Site Reliability Engineer to help satisfy that demand. As an SRE you will be responsible for ensuring the availability, performance and cost effectiveness of these services. You will be working with multiple feature development teams and the BAU/Support team … to define and evolve our cloud & on-prem infrastructure & delivery pipelines, improving system observability, demonstrating performance and capacity improvements and proactively identifying and mitigating reliability risks. Key Responsibilities of the Site Reliability Engineer: Collaborate with Software Engineers to improve reliability and performance in their subsystems Partner with System Administrators in automating toil and eliminating alerts Evolve
Senior Site Reliability Engineer Start: ASAP Duration: 6-12 months Location: hybrid, London (Tuesdays, Thursdays WFH) Pay: negotiable, inside IR35 We're looking for an experienced DevOps Engineer to join our team on a contract basis, with a focus on AWS infrastructure, observability tooling, and CI/CD automation. This is a hands-on role supporting high-availability … Python, Bash, Go or SQL - Work with Git-based workflows for infrastructure as code - Troubleshoot Kubernetes workloads and containerised services - Participate in an on-call rotation to ensure system reliability Your Profile Essential: - Solid hands-on AWS experience in a DevOps setting - Background in incident, change, and problem management - Strong with Prometheus, Grafana, Splunk, and PromQL - Proficient in scripting
Farnborough, Hampshire, United Kingdom Hybrid / WFH Options
Searchability
DOE + Benefits Farnborough-based - Hybrid working model Must hold active SC or DV Clearance (Eligibility) To apply, email: ABOUT THE ROLE This is a hands-on engineering position in a small, agile consultancy delivering rapid proofs-of-concept and high-quality platform solutions into the Defence and Security sector. I am looking for a Senior Platform Engineer to … CI/CD, Kubernetes configuration, GitOps workflows, and infrastructure monitoring. You'll work closely with a highly capable Platform team to support security, maintainability, and performance across a modern engineering environment. WHAT YOU'LL BE DOING Lead design and build of automation toolchains and CI/CD workflows Implement Kubernetes configuration and orchestration best practices Develop scalable, secure infrastructure … oversight and guidance to cross-functional teams Stay ahead of emerging tech trends to enhance platform capabilities WHAT I'M LOOKING FOR 5+ years' experience in Platform, DevOps or SRE roles Expertise in Kubernetes and container orchestration Strong experience with Terraform, Ansible, and CI/CD tooling (e.g., Jenkins, GitLab CI/CD) Solid understanding of Git and version control
/ad hoc duties as required to meet the needs of the business. Experience/Competences Essential Deep and broad experience of AWS Cloud platform and services DevOps and SRE principles Very good working knowledge of incorporating testing into CI/CD pipelines Understanding of various deployment patterns such as blue-green and canary Platforms: Windows Server, Amazon Linux, RHEL
distributor services across asset managers, insurance companies, retirement providers, and wealth management platforms. Job Overview As the Head of Production Engineering and Site Reliability Engineering (SRE) for the GIDS organisation, you will lead a team responsible for the scalability, resilience, performance, and reliability of cloud and hybrid infrastructure powering some of the most critical client … with metrics, and build systems and teams that proactively address issues before they impact clients. Key Responsibilities: Define and execute the vision and roadmap for Production Engineering and SRE within GIDS. Build and lead globally distributed, high-performance teams with a focus on talent development, SRE culture, and operational excellence. Collaborate cross-functionally with Engineering, Product, Compliance, and … in around-the-clock operations, including tooling, automation, and shift rotation planning. Qualifications Required: 10+ years of experience in engineering, with 5+ years in a leadership role in SRE, DevOps, or Production Engineering. Proven track record managing reliable, scalable systems in a high-compliance environment (e.g., FinTech, HealthTech). Strong understanding of modern software development lifecycle, CI/CD
Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
across customer platforms, internal systems, and exploring AI-driven solutions - making it an exciting space for anyone looking to shape the future of property technology in the UK. The Lead Site Reliability Engineer owns the operational reliability of the Connells Microsoft Azure public cloud platform. You are data-informed, customer-first and know from your own engineering experience that engineering quality drives service reliability. Your engineering background helps you understand your customers, your platform and how automation is the only way to scale quality. You will be a champion of site reliability practices within and without your team. You form part of a ‘you build it … you run it’ platform with an agile mindset. We want to hear from you if: You are a Site Reliability Engineer in a past life, or an SRE ready to step up and lead. You have recent Microsoft Azure experience, but we also recognise the transferability of cloud engineering and operations fundamentals from Amazon Web Services and
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
to gemstone supplies They have a presence in London, Hong Kong, Amsterdam, and as well in Mumbai and now in New York in 2001. About the role: As the SRE Manager, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and services through both direct technical contribution and team building and … tooling. Drive automation initiatives to streamline operational workflows and improve efficiency. Develop and maintain tools, scripts, and dashboards to monitor system health, performance, and reliability. Build a first-class SRE team. Through a combination of leading by example, coaching and mentoring, mould the team you would want to have around you. Provide leadership and guidance to the SRE team, fostering a … culture of collaboration, innovation, and continuous improvement. RESPONSIBILITIES: Proven experience in a senior or lead SRE role, with a strong track record of building and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with