modern technologies, embracing Infrastructure as Code at all levels with automation as a core requirement for all projects. We are looking for an Observability Engineer to work within our SRE teams to design, build and iterate on our O11Y platform. This engineer will have to work both hands-on and strategically with our architects, global service delivery and product teams … organized in a multi-tasking environment. Self-starter. Outstanding collaboration, communication, and documentation skills with a proven ability to work cross-functionally. • BS/MS in computer science, engineering, or a related technical discipline or equivalent experience. Applicants must have a valid work permit in the UK. In line with Thales' Baseline Security requirements, candidates will be asked …
Sheffield, Yorkshire, United Kingdom Hybrid / WFH Options
Reach Studios Limited
culture and best practices across the development team Advising on and managing cloud infrastructure (AWS, Azure etc.) What You'll Need Must-haves: Comprehensive experience in a DevOps or SRE role, ideally in a multi-project environment Deep experience with web stacks: Nginx/Apache, PHP-FPM, MySQL, Redis, Varnish, Elasticsearch Proven expertise in managing and optimising Cloudflare across DNS …
Site Reliability Engineer (SRE) – Market-leading company - Milton Keynes (Tech stack: .Net, C#, ASP.Net Core, SQL Server, PowerShell, Azure CLI, Bash, Azure DevOps, Jenkins, GitHub Actions, Docker, Kubernetes) Help shape the tech future of a UK market leader! Backed by a major financial institution with soaring profits - my client is modernising platforms, embracing AI, and driving automation at scale. … We're hiring a Lead Site Reliability Engineer (SRE) to drive reliability, observability, and performance across our Azure cloud infrastructure. You’ll work in a modern engineering environment where we live by "you build it, you run it", focused on automation, scale, and resilience. 🛠️ Tech stack you’ll work with: .NET, C#, ASP.NET Core, SQL Server … PowerShell, Azure CLI, Bash, Azure DevOps, Jenkins, GitHub Actions, Docker, Kubernetes We want to hear from you if: ✅ As a Site Reliability Engineer (SRE) you've delivered scalable systems using .NET, C#, and ASP.NET Core, with real-world experience managing production workloads ✅ You’ve automated operations using PowerShell, Azure CLI, and Bash to reduce toil and boost efficiency …
Manchester, England, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly … impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices … user demands and enhance overall service performance. This role is eligible for inclusion in the Company’s hybrid working from home policy. Preferred skills and experience Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction. Knowledge of …
Stoke-on-Trent, England, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly … impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices … user demands and enhance overall service performance. This role is eligible for inclusion in the Company’s hybrid working from home policy. Preferred skills and experience Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction. Knowledge of …
South East London, England, United Kingdom Hybrid / WFH Options
Unitary
SRE (Unitary AI) Description The company We are a rapidly growing startup developing solutions that blend human expertise and AI agents to handle manual customer and marketplace operations tasks. Our unique approach combines the strengths of human expertise (high accuracy and nuanced decision-making) with the advantages of AI automation (speed and cost efficiency). This cutting-edge technology helps … the beginning of our journey - and we are very excited about our plans for growth over the coming year and beyond! The role We are now looking for a Site Reliability Engineer to ensure our systems run smoothly and reliably at scale. Your expertise in monitoring, observability, and system automation will help maintain the high availability and performance … such as Terraform for scalable system deployment Are familiar with MLOps practices and tools, and monitoring machine learning systems in production This role will report to the VP of Engineering and can be based anywhere within a 3-hour time zone of the UK. Benefits About us The team Unitary is a remote-first team of c. 20 people …
Operations Site Reliability Engineer. Location: United Kingdom-Bristol-Almondsbury-Hempton Court. Time type: Full time. Posted 30+ Days Ago. Job requisition ID: R022662 … Provide feedback and coaching to upstream teams (both internal and vendors) to reduce escalations and to continually improve overall experience for customers. Professional Experience Required A degree in Systems Engineering, Computer Science or related fields with related experience preferred 5+ years of experience administering Linux systems Strong hands-on experience with various Linux distributions 2+ years Operational experience … salary Generous bonus scheme Equity package Competitive company pension Employee stock purchase plan (ESPP) Private Medical Insurance (Individual or family) Life Assurance scheme (up to 4x salary) Ample on-site parking. This role will need to participate in weekend and holiday on-call support as and when required. Broadcom is proud to be an equal opportunity employer. We will …
Role: SRE Lead Location: Birmingham, UK (Hybrid, 2-3 days WFO) Contract: 3 months (Possible extension) Are you a skilled Site Reliability Engineer (SRE) with experience in maintaining scalable and reliable infrastructure? We're looking for a proactive leader with a passion for automation, incident management, and system optimization. Key Skills Required: 5+ years of SRE or similar …
the real estate space. Backed by a major financial institution and with a brand-new, tech-committed CEO at the helm, this is a rare opportunity to lead platform reliability across a business that touches millions. This is not just a hands-on role, it’s a leadership opportunity at the centre of a £multi-million transformation programme. You … ll shape and grow a Site Reliability function from the ground up, beginning with owning the Azure-based App Platform and evolving it into a modern, scalable engineering hub for over 400 IT professionals and 100 software engineers. Our client defines this role as ‘site reliability engineering’ but is understanding and open to you … you do not have to have had a previous leadership/management position. You will however have to have the gravitas, hunger and ability to lead and grow an SRE team. What You’ll Do: Own the operational reliability of a large-scale Azure cloud platform. Drive an automation-first culture using Terraform, Azure CLI, PowerShell and more. Lead incident …
Cheltenham, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
organisations TwinStream was formed to consolidate their collective expertise and experience into one business, providing technical excellence and exceptional service to their clients. We have teams working both on-site with clients and remotely from home. Details: Employment: Contract (outside of IR35) Security Level: Must have live DV clearance About the role: Successful candidates will be working as part … of an on-site team to maintain and support a managed cross-domain service using a wide range of technology, platforms and tools. The team employ site reliability engineering tools and practices to continuously verify and improve the service. Responsibilities: Build and Deploy code from multiple project teams: Maintenance and administration of a CI pipeline building …
the scale it takes for us to feed the nation. The level of data, transactions and variety it involves. Then you'll realise that ours is a modern software engineering environment because it has to be. We've made serious investment into a Tech Academy and into setting standards and principles. We iterate, learn, experiment and push ways of … are looking for an Azure Edge Cloud Engineer to join and represent a team dedicated to delivering and managing the Azure Cloud Platform. Your work will ensure the scalability, reliability, security, and efficiency of the environments hosting Sainsbury's applications. Role Summary: We are seeking a Cloud Platform Engineer with a passion for Azure, Infrastructure as Code and automation … positive about tracking (JIRA), system monitoring, security & auditing Exposure to source and version control tools such as GitHub and preferably experience in GitHub Actions Good understanding of Service Operations, SRE & ITIL responsibilities Experience of Agile Methodologies and Frameworks. An up-to-date understanding of Cloud technology base and a general understanding of other technology bases Proven track record as a …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
WorksHub
the end. We have a mature DevOps culture in place where teams are responsible for the infrastructure and deployment of those applications. We are actively expanding our Manchester-born SRE function, which aims to advance our knowledge and innovation globally in areas such as Observability, Reliability and Availability. We have the autonomy to choose the technologies and processes that … production. Being a critical-path multi-region service means we set the bar high for availability. We serve billions of requests per week with double-digit millisecond response times. Reliability, scalability, and cryptographic agility are critical to us! Subscription Services Responsible for supporting new customer signup and retention flows, billing services and payment integrations. The Subscription team supports high … significant part of the implementation, design, testing, and deployment of services within your team. Leverage, learn and apply cutting-edge technologies on challenging and varied business domains. Apply principled engineering practices including unit testing, integration testing, and continuous integration. Develop your technical understanding to support and build your career. Act as a mentor and an example to others in …
Lead Cloud Engineer As a Lead Cloud Engineer, you’ll have SME-level knowledge across AWS and Platform Engineering disciplines. You’ll build AWS Cloud Solutions to ensure the organisation can take full advantage of Cloud-based technologies. You’ll contribute to standards, guardrails and best practices, and implement improvements to processes and tooling to ensure engineering excellence. … You’ll have a strong understanding of operational requirements, and ensure Scalability, Resiliency, Observability, Security, Cost and Maintainability are at the forefront of all engineering activities. This specific project will involve Real Time Payments value stream, Form 3 gateway set-up and setting up the infrastructure for connectivity. What you’ll bring SME-level knowledge in AWS and Platform … and secure code delivery (i.e. SCA, SAST, DAST) Networks/Security/Middleware & Apps Scripting/Coding (Bash, Python) End-to-end Observability solutions (logging, monitoring, alerting) Knowledge of SRE principles and practices
and tooling (e.g., Terraform) as data sources for observability. Solid programming ability in Golang (preferred) or Python for automation and integration. Strong collaboration skills to work with cross-functional engineering teams. Experience working in Linux-based environments. Bonus/Nice … to-Have Skills: Experience deploying Grafana instances via code (provisioning dashboards, datasources). Familiarity with OpenTelemetry, metric instrumentation, and telemetry pipelines. Background in data center environments, infrastructure monitoring, or SRE practices. Exposure to CI/CD workflows, containers (Podman/Docker), and cloud-native systems.
and optimizing cloud-modernization solutions that drive our business forward Extensive experience in modernizing Java-based monolith applications to a microservices-based architecture on Azure Extensive experience in DevSecOps and SRE Primary Responsibilities: Lead the design and implementation of microservices architecture on Azure. Architect and deploy scalable, reliable, and secure solutions using AKS. Design and manage PostgreSQL databases in a cloud … Conduct assessments of existing applications and recommend modernization strategies. Develop and maintain architectural documentation and guidelines. Ensure compliance with security and governance policies. Provide technical leadership and mentorship to engineering teams. Stay updated with the latest industry trends and technologies. Implement Infrastructure as Code (IaC) using tools like Terraform or ARM templates. Oversee DevOps practices and CI/CD …
range of industries, including power and rail, and also has interests in a number of R&D projects in various scientific sectors. At Camlin, we believe in high-quality engineering and design, allowing us to develop market-leading products and services. We love creating value for our customers by solving difficult problems. Currently, Camlin operates in over 20 countries … The successful candidate must be available for participation in the on-call rotation schedule and occasional travel. What you'll need: 5+ years of experience in a similar role (SRE, Cloud infrastructure/DevOps) An understanding of the SDLC process Familiarity with configuration management tools (Ansible, Puppet) Infrastructure as Code (Terraform) Experience with container technologies (Docker or Podman) Experience with …
Halian Technology is looking for a talented and driven Site Reliability Engineer (SRE) to join our growing technology team. In this role, you'll ensure the reliability, scalability, and performance of our digital platforms that support memorable customer experiences across the hospitality sector. You'll work alongside our engineering, product, and infrastructure teams to build high-availability systems and … automated operations that support the future of digital hospitality. Key Responsibilities: Drive system reliability, availability, and performance through engineering excellence. Design and implement monitoring, alerting, and observability tools using platforms like Datadog. Automate operational tasks using scripting, Infrastructure as Code (IaC), and configuration management tools. Troubleshoot incidents, lead root cause analysis, and improve Mean Time to Resolution (MTTR … infrastructure meets security and compliance standards. Optimise system resources for both performance and cost-effectiveness. Contribute to incident response and participate in on-call rotations. Track and improve key SRE metrics such as error rates, incident count, and monitoring coverage. What You'll Bring: 3+ years of experience in Site Reliability Engineering, DevOps, or equivalent roles. Strong skills …
vouchers for their team and the ability to "work from anywhere" for two weeks of the year Paid one month sabbatical after four years' employment Role Overview Luminance's SRE team combines strong problem solving, infrastructure tooling and wider DevOps practices to provide a service of Luminance's unique software applications. The team plays a crucial role in incident response … and issue resolution, swiftly addressing and resolving service interruptions to maintain the highest level of customer satisfaction. With a focus on automation, scalability, reliability and security, the team enable Luminance to ensure a performant, seamless experience for its users. You will join a small, dynamic team of creative engineers and work together to tackle some of Luminance's greatest …
MCS Group is working with one of their closest clients as they seek to appoint a Site Reliability Engineer to their growing team. An award-winning business which has seen exponential growth over the last 2 years off the back of their transformative technology being utilised by organisations across the UK and Ireland and beyond. They've grown … required. Strong knowledge of Linux, Windows, and IP networking, covering routing, DNS, firewalls, and load balancing. Commercial experience with Docker, Kubernetes, and container orchestration. Familiarity with Elasticsearch. Understanding of SRE principles, DevOps, and DevSecOps methodologies. Strong problem-solving skills, attention to detail, and the ability to work autonomously. Full right to work in Ireland or UK. The client is unable …
Join us as a Senior PostgreSQL SRE at Barclays where you'll effectively monitor and maintain the bank's critical technology infrastructure and resolve more complex technical issues, whilst minimizing disruption to operations. In this role you will assume a key technical leadership role. You will shape the direction of our database administration, ensuring our technological approaches are innovative and … aligned with the Bank's business goals. You will guide high-impact projects to completion, collaborate with management, and implement SRE practices using software engineering and database administration to address infrastructure and operational challenges at scale. To be successful as a Senior PostgreSQL SRE, you should have: Strong experience as a Principal Level Database Administrator, with a focus on … PostgreSQL A proven track record of implementing and leading SRE practices across large organizations or complex teams. Extensive hands-on experience with containers and Kubernetes In-depth experience with DevOps automation tools such as code versioning (Git), JIRA, Ansible, database CI/CD tools and their implementation. Some other highly valued skills may include: Expertise with scripting languages (e.g. …
Join us as a Senior PostgreSQL SRE at Barclays where you'll effectively monitor and maintain the bank’s critical technology infrastructure and resolve more complex technical issues, whilst minimizing disruption to operations. In this role you will assume a key technical leadership role. You will shape the direction of our database administration, ensuring our technological approaches are innovative and … aligned with the Bank’s business goals. You will guide high-impact projects to completion, collaborate with management, and implement SRE practices using software engineering and database administration to address infrastructure and operational challenges at scale. All potential applicants are encouraged to scroll through and read the complete job description before applying. To be successful … as a Senior PostgreSQL SRE, you should have: Strong experience as a Principal Level Database Administrator, with a focus on PostgreSQL A proven track record of implementing and leading SRE practices across large organizations or complex teams. Extensive hands-on experience with containers and Kubernetes In-depth experience with DevOps automation tools …
Sheffield, South Yorkshire, United Kingdom Hybrid / WFH Options
Context Recruitment
Senior Azure Site Reliability Engineer A leading Cloud Consultancy are headhunting for a DevOps Platform Architect to join their impressive Cloud Services team. As a DevOps advocate, you will be empowered to streamline processes through innovative use of code, platforms, and tools. Your team will provide standardized approaches and frameworks, collaborating within the Cloud Services Group to architect … Introduce valuable new technologies and tools. Stay updated with emerging technologies and industry trends. Independently handle tasks and projects. Requirements: Understanding of the software development lifecycle and DevOps/SRE methodologies. Microsoft technology background, especially Azure PaaS. Familiarity with CI/CD implementations and IaC tools (e.g., Terraform, Bicep, ARM). Proficient in multiple programming languages (e.g., .Net (C#), PowerShell …
Leeds, England, United Kingdom Hybrid / WFH Options
MRJ Recruitment
signed off on a £multi-million, 3-year technology & digital programme and this is definitely the time to be joining the journey. About the role As the new DevOps Engineering Manager, you'll be responsible for building, from scratch, a high-performing team of Platform Engineers. Your mission? To orchestrate and evolve the core technology platforms that underpin the … architecture and reducing technical debt. Define SLIs and SLOs across latency, availability, and throughput, aligning internal goals with platform performance. Promote and embed Site Reliability Engineering (SRE) practices to improve stability, monitoring, and response. Manage a growing toolset for orchestration, observability, and automation. Partner closely with Engineering, Delivery, and Architecture teams to ensure seamless integration and … Azure Boards, and Azure Repos. Knowledge of Azure infrastructure as code (IaC) tools like Terraform, ARM templates, or Azure CLI Strong knowledge of modern platform management practices (DevOps, Agile, SRE, ITIL). Deep experience with platform tooling, including IaC, APIs, automation, and observability frameworks. Proven ability to design, build, and scale robust platform services across complex, regulated environments. Experienced in …
Bradford, Yorkshire, United Kingdom Hybrid / WFH Options
Freemans Grattan Holdings (fgh)
capabilities and optimise and enhance our customer journey. Working collaboratively with a team of transformation experts you will have the flexibility to leverage your professional experience to solve computer engineering issues across a variety of technical areas, dependent on where your interests lie. Innovation is key as we look for new ideas which will improve the customer experience and … Essential 5+ years of experience in a DevOps or Site Reliability Engineering role building high-traffic, high-availability systems. Experience with site reliability engineering (SRE) principles and monitoring tools, including New Relic. Experience in website performance monitoring and tuning using tools such as Lighthouse and the ability to troubleshoot performance issues. Proficiency in CI/ …