Site Reliability Engineering Jobs in the UK

51 to 75 of 254 Site Reliability Engineering Jobs in the UK

Head of Product Operations

London, United Kingdom
Rewardgateway
Support teams to input into business reviews Be a visionary Ops champion for our internal teams Skills Bachelor's or Master's degree in a STEM field (Computer Science, Engineering, Mathematics, etc.) or equivalent experience Demonstrable experience in product management or product operations Strong product and technical background with proven ability to communicate effectively with engineers and technical team … management best practices: user research, market insights, goal setting, prioritisation, execution, and leadership Familiarity with monitoring tools, incident management protocols, and collaboration with Site Reliability Engineering (SRE) teams Proven ability to develop relationships and align teams across product, engineering, and leadership to ensure the effective execution of strategic priorities Hands-on experience analysing workflows and implementing … of improvement, develop solutions, and inspire change with autonomy The Interview Process Online interview with the Senior Talent Partner In-person interview with the Director of Product Operations and Engineering team member Online interview with Director of Product Operations and CPO At Reward Gateway Edenred, we are committed to ensuring an inclusive and accessible recruitment process for all candidates.
Employment Type: Permanent
Salary: GBP Annual

Staff Site Reliability Engineer / DevOps

London, United Kingdom
Almedia
Staff Site Reliability Engineer/DevOps London or Remote About you An SRE or DevOps engineer with hands-on experience in high-traffic production systems Strong in Linux, databases (MySQL, Postgres, MongoDB, Redis), and networking fundamentals Comfortable with Kubernetes, CI/CD pipelines, and observability tools like Datadog A self-starter who thrives in scaling environments and can … work independently without PMs Pragmatic, able to balance prevention, maintenance, and firefighting when needed Your mission is to Take ownership of uptime and reliability for a platform serving 50M+ users Build robust monitoring, alerting, and incident response practices Improve CI/CD pipelines and enable safe deployments (blue-green, canary) Partner with engineers across teams to fix pain points … CD best practices Observability tools like Datadog, OpenTelemetry, or ELK stack Nice-to-haves: RabbitMQ, Kafka, Terraform, Ansible, GCP, Datadog What makes this role exciting Be the first senior SRE hire with ownership of reliability across the entire platform Shape infrastructure and processes for a scale-up growing beyond 100 FTE Work on a product serving millions of users
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer

London, United Kingdom
Blip
Overview Blip is a leading tech company focused on software engineering solutions for sports entertainment. We operate at scale. As part of Flutter Entertainment, we play an essential role in the Group's goal of becoming the global leader in online sports betting and iGaming, developing innovative products and platforms for over 14 million monthly customers worldwide. We are … ex. Deciding which technology, or pattern to create or leverage) Experience being "on-call" for a service, and familiarity with incident notification tooling (ex. PagerDuty, Opsgenie) Comprehensive understanding of SRE principles (ex. Working knowledge of the Google SRE book) Demonstrated strength in leading a project in an agile/scrum environment Thrives in a diverse work environment We'd Like … distributed dev environments) Built and maintained a system and culture that supported and implemented SLOs Has shown to be a thought leader contributing to the broader industry conversation about SRE principles and topics (ex. Speaking at conferences) Perks and Benefits This is what you should have. What do we have, you ask? Well, you can check our amazing perks & benefits
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Digital Realty (UK) Limited
Position Title: Site Reliability Engineer, Interconnection Service and Network Delivery Location: Hybrid: Austin, Dallas, Boston, Ashburn, Atlanta, London, or Amsterdam Your role In this role, you will be responsible for deploying and maintaining all Digital Realty interconnection fabric network infrastructure. The ideal candidate can demonstrate a unique blend of network engineering, network operations, and software understanding through … the application of engineering principles. You will focus on delivering operational discipline and embrace key operational principles including automation, agile development, and scripting. What you'll do You will be part of the global Fabric Engineering organization and work in tandem with other teams to build and maintain a global network infrastructure. Ideal candidates for this role will bring … an understanding of carrier-class network infrastructure as well as experience working in a fast-paced development environment. What you'll need 5+ years of operations and engineering experience Bachelor's degree in Computer Science (or equivalent) preferred Strong experience with automation tools (Ansible, Terraform, etc.) Strong experience working with Linux systems and tools Experience with Python (or equivalent high-level language …
Employment Type: Permanent

Site Reliability Engineer (DV Security Clearance)

Remote, UK
Hybrid / WFH Options
CGI
want it to go. *** Applicants must be solely UK nationals and already hold HMG HLC clearance *** Role Location: Gloucester or Manchester We are seeking highly skilled and motivated Site Reliability Engineers to join our team. The ideal candidates will possess a good understanding of engineering principles, and a broad understanding of full-stack software technologies, with hands … and cost optimisation (rightsizing, reserved instances, auto scaling). • Disaster Recovery & Business Continuity Planning Develop and test backup/DR strategies, restore drills, and self-healing infrastructure to ensure reliability and uptime. • Collaboration & Knowledge Sharing Work closely with DevOps, development, security and operations teams; prepare architecture/design documents, network diagrams, runbooks and training materials. Required qualifications to be … encryption, audit logging, network isolation, and compliance frameworks. • Monitoring & Optimization Tools: Familiarity with CloudWatch, Grafana, Datadog, Prometheus, ELK or similar The position requires team members to work from the client site to ensure the reliability and availability of critical systems. Together, as owners, let’s turn meaningful insights into action. Life at CGI is rooted in ownership, teamwork, respect …
Employment Type: Full-time

Site Reliability Engineer

London, United Kingdom
Alokknight
We offer an exciting opportunity to join a world-class network team in a dynamic environment that feels like a start-up. As a Site Reliability Engineer (SRE), you will deploy, manage, troubleshoot, and innovate the tools, services, and components that enable our network engineers to automate and maintain network operations. Your internal customers are your network engineering …
Employment Type: Permanent
Salary: GBP Annual

Solutions Engineer/Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Zefr
the globe. What you'll do: As a Site Reliability Engineer at Zefr, you'll apply your expertise in cloud infrastructure, CI/CD, Observability, and core SRE concepts to deliver high-quality, reliable, and scalable solutions. A significant aspect of this role involves working closely with Zefr's Engineering and Data Science teams ensuring the infrastructure … secure, resilient, scalable, and cost-efficient applications and systems/pipelines in AWS and GCP. Foster and push our DevOps culture and philosophy by encouraging continuous improvement across all engineering teams. Proactively maintain the health of production environments, including monitoring application performance and resource utilization. Participate in 24/7 on-call rotation, respond to system performance issues and … at the application and infrastructure level. Mature our CI/CD workflows and release process. Maintain a forward-thinking approach, actively researching and proposing new solutions. Propose and review Engineering Requests for Comments (RFCs) to drive Engineering architecture and practices. Technology Stack at Zefr: Core Infrastructure & Cloud Platforms: Cloud Providers: Google Cloud Platform (GCP), Amazon Web Services (AWS) …
Employment Type: Permanent
Salary: GBP Annual

Mid/Senior DevOps Engineer

London, United Kingdom
Intelmatix
applications, delivering scalable, secure, and data-driven solutions to global clients. Role Overview: We are looking for a highly motivated Mid/Senior DevOps Engineer to join our Platform Engineering team. This role plays a critical part in shaping and supporting the infrastructure that powers our data and AI-driven platforms. You will work closely with engineers, data scientists … cloud-native solutions, and enabling the deployment of complex applications, including AI/ML models. Key Responsibilities: Maintain and optimize our cloud infrastructure (primarily AWS) with a focus on reliability, scalability, and cost efficiency. Automate infrastructure provisioning using Infrastructure-as-Code (IaC) tools such as Terraform. Build and maintain CI/CD pipelines for application, data, and model deployment … workflows. Collaborate with engineering and data science teams to deploy and monitor machine learning models and analytical services. Implement and enforce security best practices across cloud and network environments. Troubleshoot deployment and performance issues across multiple environments. Set up and maintain observability tools for logging, monitoring, and alerting (e.g., Prometheus, Grafana, Loki). Contribute to internal tooling to streamline …
Employment Type: Permanent
Salary: GBP Annual

Splunk Site Reliability Engineer / Migration Specialist

Birmingham, England, United Kingdom
Prestige Talent Partners
Splunk Site Reliability Engineer/Migration Specialist - Fixed Term Contract 6-12 Months Job Summary: The Splunk SRE/Migration Specialist is responsible for leading and executing the migration of data, dashboards, alerts, and configurations from Splunk systems to Elasticsearch. This role involves deep technical expertise in Splunk architecture, data ingestion, and observability tools, along with strong project … models and recreate in Kibana. Incident Response Ensure the smooth functioning of the Splunk platform across BT by maintaining Splunk’s infrastructure in Production & Non-Production environments. Support Splunk SRE & Application teams in investigating incidents following established procedures. Upgrades: Keep the Splunk components at the latest applicable version and carry out the necessary pre & post upgrade checks accordingly. Change Requests … security measures and ensure compliance with relevant standards and best practices. Skills and Qualifications: Hands-on experience with enterprise-level monitoring tools and applications, and familiarity with DevOps/SRE best practices. Proven experience with Splunk and Elasticsearch (ELK Stack). Familiarity with containerized environments (Docker, Kubernetes). Proficiency in Unix/Linux systems, Networking protocols, and possess strong …

Staff Platform Engineer

Fleet, Hampshire, United Kingdom
Hybrid / WFH Options
RVU Co UK
Staff Platform Engineer Department: Engineering Employment Type: Permanent Location: Fleet Description Hybrid - 2 Days per week in the Fleet office Tempcover Tempcover is at the forefront of the fast-growing world of short-term insurance. Our mission is to make car insurance flexible, quick, and easy for drivers. We've sold millions of policies that have helped drivers get … ownership, empowerment and impact. Each Engineer plays an integral role in the development, delivery, maintenance, and support of our insurance-based systems, both public-facing and internal. The platform engineering team enables our engineers to quickly build and run safe, secure and cost-effective systems in our public cloud. What you'll be doing As a Staff Platform Engineer … you'll be working as part of an agile team that provides services and tools to our internal engineering teams Suggest and drive change across the engineering team and both challenge and improve existing practices and systems Mentor and coach other engineers, helping them grow whilst fostering a strong engineering culture You'll be introducing new technologies …
Employment Type: Permanent
Salary: GBP Annual

Staff Platform Engineer

Hart, Yorkshire, United Kingdom
Hybrid / WFH Options
RVU Co UK
Staff Platform Engineer Department: Engineering Employment Type: Permanent Location: Fleet Description Hybrid - 2 Days per week in the Fleet office Tempcover is at the forefront of the fast-growing world of short-term insurance. Our mission is to make car insurance flexible, quick, and easy for drivers. We've sold millions of policies that have helped drivers get where … ownership, empowerment and impact. Each Engineer plays an integral role in the development, delivery, maintenance, and support of our insurance-based systems, both public-facing and internal. The platform engineering team enables our engineers to quickly build and run safe, secure and cost-effective systems in our public cloud. What you'll be doing As a Staff Platform Engineer … you'll be working as part of an agile team that provides services and tools to our internal engineering teams Suggest and drive change across the engineering team and both challenge and improve existing practices and systems Mentor and coach other engineers, helping them grow whilst fostering a strong engineering culture You'll be introducing new …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer - London

London, United Kingdom
Hybrid / WFH Options
Valarian Technologies Limited
edge software, platforms, and infrastructure. The Role Join us as a Site Reliability Engineer and help us build the future of data sovereignty! We're seeking an SRE passionate about creating high-performance, scalable, and reliable services for our production infrastructure. You'll have a direct impact, improving existing systems and developing innovative solutions to complex challenges. Our … small, collaborative engineering teams own the full lifecycle of their services, from development to production operations. We champion automation and empower you to choose the best tools for the job. If you thrive in a fast-paced environment where you can make a real difference, we want to hear from you! Required skills/expertise: Develop and implement a … and applications to support large concurrent user bases and sustained daily usage. This will involve performance tuning, capacity planning, and optimization of resource utilization. Collaborate closely with the product engineering team to influence the design and implementation of new products and features, ensuring they meet our reliability and scalability standards from the outset. Preferred Qualifications Bachelor's degree …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

Reigate, Surrey, England, United Kingdom
Hybrid / WFH Options
esure Group
driven insights alongside exceptional service, to deliver personalised experiences that meet our customers' ever-changing needs today and in the future. Job Description We are currently recruiting for a Site Reliability Engineer to join our Tech Enablement function. The successful candidate will be responsible for our monitoring estate, and for the continuous improvements … and maintenance of it, and to assist in incident investigation and resolution when required. They also share skills within our Tech Enablement team and should be an evangelist for SRE techniques and goals to the broader IT community. What you’ll do: Deliver proactive and reactive activities to meet SLAs and availability. Partner with development squads pre-launch to embed …
Employment Type: Full-Time
Salary: Competitive salary

Public Cloud Infrastructure Engineer

Leeds, Yorkshire, United Kingdom
Lloyds Banking Group
compute infrastructure, CI/CD and Terraform. You'll help build secure, scalable systems and drive automation across deployment and operations. About you You will work collaboratively with the Engineering and Product Owner leads in building and executing our road maps, whilst participating in the planning and delivery of our goals, driving prioritisation, escalating impediments, acting on learnings and … with Terraform. Implement security guidelines and access controls. Troubleshoot and resolve infrastructure issues. Engage in Agile team ceremonies and continuous improvement efforts. What you'll need Extensive DevOps or Site Reliability Engineering experience. Strong CI/CD pipeline design and implementation skills. Experience with Windows & Linux VM base images. Proficiency in configuration management tools (e.g., Chef).
Employment Type: Permanent
Salary: GBP Annual

Application Support Engineer

Cheltenham, Gloucestershire, South West, United Kingdom
itecopeople
DV-Cleared Application Support Engineer - Contract (Outside IR35) The Role We are seeking a DV-cleared Application Support Engineer to join our client's on-site team in the Cheltenham area. You will help maintain and support a managed cross-domain service, leveraging a broad range of technologies. The role focuses on site reliability engineering practices … to ensure service resilience, continuous improvement, and operational excellence. Location: Cheltenham, Gloucestershire Area (on-site minimum 4 days per week) Rate: £500 - £600 per day Clearance: Active DV clearance required Start: ASAP Duration: 6 months Key Responsibilities Build & Deploy Manage and maintain CI/CD pipelines using Java, Maven, and NPM. Configure and execute automated test suites with Maven … and conduct root cause analysis. Implement proactive changes to improve service stability. Maintenance Automate tasks to reduce manual workload. Conduct OS health checks, patching, and database housekeeping. Support multi-site data centre operations. Key Skills Experience in a managed service environment with strong service delivery focus. Hands-on with Infrastructure as Code (Terraform, Ansible). Application development experience (Java …
Employment Type: Contract
Rate: £500.0 - £600.0 per day

Google Cloud Architect

London, United Kingdom
WeAreTechWomen
Cloud Airgapped solutions. You will build expertise in deploying and operating these solutions at customer sites as well as internal reference implementations. Your expertise in Google Cloud architecture and engineering, combined with your leadership experience in guiding small teams, will ensure the successful delivery of robust and scalable cloud solutions for our enterprise clients. Minimum of 5 years of … Expertise in a wide range of Google Cloud products and services (Engine, App Engine, Cloud Storage, GKE, etc.) and broader IaaS solutions (Kubernetes, systems virtualization, etc.) Experience architecting and engineering technical cloud-based solutions to meet business and non-functional requirements Hands-on experience creating comprehensive technical documentation, including architecture diagrams, design specifications, and operational runbooks Experience implementing foundational … mentorship to junior team members Strong communication skills with the ability to articulate complex technical concepts to both internal and client technical, non-technical, and management stakeholders Experience in site reliability engineering or IT production systems operations including troubleshooting and debugging live incidents Excellent problem-solving abilities with demonstrable examples of implementing technical innovation or process improvements
Employment Type: Permanent
Salary: GBP Annual

Dev Ops Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
C4S Search Ltd
alignment with security baselines Implement and maintain cloud security controls, including identity and access management Key Skills/experience required: 5+ years of professional experience in a DevOps or Site Reliability Engineering role Expert-level experience with Microsoft Azure and Azure DevOps Strong hands-on experience with Kubernetes in production environments Proficient with Helm for Kubernetes application …
Employment Type: Full-Time
Salary: £70,000 - £80,000 per annum

Principal Engineer - CIAM XDP

London, UK
Barclays Bank PLC
transforming and modernising our digital estate to build a market-leading digital offering with customer experience at its heart. This is an exciting and key role, partnering with business aligned engineering and product teams, to ensure a collaborative team culture is at the heart of what we do. To be successful in this role you should have: Strong hands-on experience … and running of ForgeRock COTS based IAM solutions (PingGateway, PingAM, PingIDM, PingDS), including designing and implementing cloud-based, scalable and resilient IAM solutions for large corporate organisations. IAM engineering experience across authentication, authorisation, single sign-on, multi-factor authentication, identity lifecycle management, OAuth 2.0, OpenID Connect, SAML and policy management Expertise with JavaScript, Java, Python, and must be comfortable with … API and microservices development. Strong working knowledge of Site Reliability Engineering principles Experience with Cloud computing (AWS is essential, Azure is a plus) Some other highly desirable skills include: Experience in DevSecOps - knowledge of Product Operating Model Knowledge of Infrastructure as Code tooling (Chef is essential, Ansible is a plus), containerization Knowledge of authentication and biometric system design is highly …

Principal Engineer - CIAM XDP

Middlesex, United Kingdom
Barclays Bank PLC
and modernising our digital estate to build a market-leading digital offering with customer experience at its heart. This is an exciting and key role, partnering with business aligned engineering and product teams, to ensure a collaborative team culture is at the heart of what we do. To be successful in this role you should have: Strong hands-on … running of ForgeRock COTS based IAM solutions (PingGateway, PingAM, PingIDM, PingDS), including designing and implementing cloud-based, scalable and resilient IAM solutions for large corporate organisations. IAM engineering experience across authentication, authorisation, single sign-on, multi-factor authentication, identity lifecycle management, OAuth 2.0, OpenID Connect, SAML and policy management Expertise with JavaScript, Java, Python, and must be comfortable … with API and microservices development. Strong working knowledge of Site Reliability Engineering principles Experience with Cloud computing (AWS is essential, Azure is a plus) Some other highly desirable skills include: Experience in DevSecOps - knowledge of Product Operating Model Knowledge of Infrastructure as Code tooling (Chef is essential, Ansible is a plus), containerization Knowledge of authentication and …
Employment Type: Permanent
Salary: GBP Annual

SRE

Glasgow, Lanarkshire, Scotland, United Kingdom
Hybrid / WFH Options
McGregor Boyall Associates Limited
Reliability Engineer - GCP, Kubernetes, Python, Terraform 6-Month Contract, Glasgow (hybrid working) Up to £320/day PAYE A global financial services organisation is looking for an SRE Engineer. On a 6-month contract basis, you will be working to implement new features, deal with user requests and reduce repeatable tasks, allowing us more time for strategic initiatives. … the push to Google Cloud that is due to go live in the new year. What you will be doing: Delivering services to customers running in Google Cloud Supporting Engineering & Operational tasks, ranging from service delivery, automation, DevOps tasks Onboard new tools to support business needs Participate in design … discussions and code reviews Skills and experience required: Experience working with GCP Strong Kubernetes Knowledge and hands-on experience of IaC with Terraform Scripting skills: Python or Bash Previous SRE experience including knowledge about SLO/SLA/SLI and error budgets is advantageous *Bonus* - Experience with HashiCorp Vault If this is of interest and you have the required skills …
Employment Type: Contract, Work From Home
Rate: Up to £320 per day

Dynatrace Subject Matter Expert - Data Resilience

London, South East, England, United Kingdom
Adecco
observability across complex, hybrid cloud environments. The ideal candidate will have deep expertise in Dynatrace implementation (SaaS and On-Premises), monitoring configuration, and AI-driven insights to support performance, reliability, and business alignment. You will: * Collaborate with Application Stewards and Site Reliability Engineers (SREs) to confirm the list of critical assets in scope for monitoring verification and enhancement. … Resilience. * Play a key part in providing an automatically maintained end-to-end business flow for each important business process within the Dynatrace toolset. * Collaborate with Application Stewards and Site Reliability Engineers (SREs) to ensure alerting configuration is optimal and fit for purpose. * Participate in workshops with third party software suppliers to review observability standards. What You'll Need …
Employment Type: Contractor
Rate: Salary negotiable

Dynatrace Subject Matter Expert - Data Resilience

City of London, London, United Kingdom
Adecco
observability across complex, hybrid cloud environments. The ideal candidate will have deep expertise in Dynatrace implementation (SaaS and On-Premises), monitoring configuration, and AI-driven insights to support performance, reliability, and business alignment. You will: * Collaborate with Application Stewards and Site Reliability Engineers (SREs) to confirm the list of critical assets in scope for monitoring verification and … Resilience. * Play a key part in providing an automatically maintained end-to-end business flow for each important business process within the Dynatrace toolset. * Collaborate with Application Stewards and Site Reliability Engineers (SREs) to ensure alerting configuration is optimal and fit for purpose. * Participate in workshops with third party software suppliers to review observability standards. What You'll …
Employment Type: Contract

Counterparty Risk Analyst - Middle Office

London Area, United Kingdom
Lorien
with SQL and Python Data visualisation skills with Power BI; other automation and metrics knowledge is handy. Proficiency with tools like Jira, Confluence, Excel, and SharePoint Familiarity with Agile, DevOps, and Site Reliability Engineering Excellent communication and stakeholder management skills

Counterparty Risk Analyst - Middle Office

City of London, London, United Kingdom
Lorien
with SQL and Python Data visualisation skills with Power BI; other automation and metrics knowledge is handy. Proficiency with tools like Jira, Confluence, Excel, and SharePoint Familiarity with Agile, DevOps, and Site Reliability Engineering Excellent communication and stakeholder management skills

Counterparty Risk Analyst - Middle Office

London, South East England, United Kingdom
Lorien
with SQL and Python Data visualisation skills with Power BI; other automation and metrics knowledge is handy. Proficiency with tools like Jira, Confluence, Excel, and SharePoint Familiarity with Agile, DevOps, and Site Reliability Engineering Excellent communication and stakeholder management skills

Site Reliability Engineering
10th Percentile: £57,463
25th Percentile: £65,000
Median: £80,000
75th Percentile: £95,000
90th Percentile: £115,000