Cheltenham, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
organisations TwinStream was formed to consolidate their collective expertise and experience into one business, providing technical excellence and exceptional service to their clients. We have teams working both on-site with clients and remotely from home. Details: Employment: Contract (outside of IR35) Security Level: Must have live DV clearance About the role: Successful candidates will be working as part … of an on-site team to maintain and support a managed cross-domain service using a wide range of technology, platforms and tools. The team employ site reliability engineering tools and practices to continuously verify and improve the service. Responsibilities: Build and Deploy code from multiple project teams: Maintenance and administration of a CI pipeline building
with cloud technologies Google Cloud Platform (GCP) or others (AWS, ICP/OCP, Azure) Containers, Orchestration, and Service Mesh: Experience with Docker, Kubernetes, Istio, etc. Experience with DevOps or Site Reliability Engineering About working for us Our focus is to ensure we're inclusive every day, building an organisation that reflects modern society and celebrates diversity in
Wokingham, Berkshire, United Kingdom Hybrid / WFH Options
Nordcloud
the European cloud revolution. We supercharge our customers to innovate in hyperscaler cloud, enabling seamless migration, advanced security, and data-driven success. Currently, we are looking for an Azure Site Reliability Engineer to join our team in the UK. Your daily responsibilities: Architect, implement, and improve existing monitoring and alerting systems Proactively investigate and identify performance anomalies and … solving We encourage you to apply, even if you don't meet all of the requirements. We value your growth potential and enthusiasm! This role is required to be on site in Wokingham twice a week; please do not apply if this is not possible for you. What we offer: Individual training budget and exam fees for certifications Flexible working
Cloud Airgapped solutions. You will build expertise in deploying and operating these solutions at customer sites as well as internal reference implementations. Your expertise in Google Cloud architecture and engineering, combined with your leadership experience in guiding small teams, will ensure the successful delivery of robust and scalable cloud solutions for our enterprise clients. Minimum of 5 years of … Expertise in a wide range of Google Cloud products and services (Engine, App Engine, Cloud Storage, GKE, etc.) and broader IaaS solutions (Kubernetes, systems virtualization, etc.) Experience architecting and engineering technical cloud-based solutions to meet business and non-functional requirements Hands-on experience creating comprehensive technical documentation, including architecture diagrams, design specifications, and operational runbooks Experience implementing foundational … mentorship to junior team members Strong communication skills with the ability to articulate complex technical concepts to both internal and client technical, non-technical, and management stakeholders Experience in site reliability engineering or IT production systems operations including troubleshooting and debugging live incidents Excellent problem-solving abilities with demonstrable examples of implementing technical innovation or process improvements
the core of one of the UK's biggest financial service transformations. Our core technology focus is on Microsoft Azure and Google Cloud Platform. Your role as a Lead SRE is to be a visionary leader who works collaboratively with the Product Owners and Engineering Leads to ensure that we maintain high levels of reliability across all our … Engineering Leads, using data, to balance product improvements covering aspects such as reliability, observability and performance, with new feature development Key Responsibilities You will help improve the SRE framework and principles to strengthen focus, behaviours, and culture You will support the POs and ELs to ensure our products and services are sufficiently resilient and to address any SLA … reliability and performance, you will work with Product Owners and Engineering Leads using data and your experience to prioritise the backlog You will train and mentor the SREs in the latest platform and product features and technology developments and be a go-to person for SREs to seek direction and help What you'll need SRE experience
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
VIQU IT Recruitment
Site Reliability Engineer with a strong focus on leadership and team management. Around 70% of this role is about building, mentoring and directing a high-performing SRE team, setting strategy and driving operational excellence. The remaining 30% will be hands-on involvement in AWS-based platforms, automation and performance tuning. Key Responsibilities Lead and develop a team … of SRE engineers, setting priorities, providing coaching and creating a culture of reliability and continuous improvement Define and own SRE strategy, standards and ways of working across the organisation Collaborate with engineering, operations and product teams to ensure seamless delivery and robust systems Oversee system reliability, availability and performance across large, business-critical platforms Provide technical guidance … GitLab, Concourse) and ensure AWS platforms meet operational best practice Produce regular reporting and communicate clearly with senior stakeholders Key Requirements Strong experience managing or leading engineering/SRE/DevOps teams in a complex environment Track record of mentoring, coaching and growing technical teams Excellent stakeholder engagement skills with the ability to influence at all levels Broad technical
as AWS Lambda, Spring Boot, NodeJS, Python FastAPI, Oracle, PostgreSQL and MongoDB Contributing to DevSecOps delivery pipelines, using tooling such as Atlassian, Jenkins, GitLab, OWASP and AWS services Applying Site Reliability Engineering principles to ensure solutions are resilient, reliable and cost-effective Supporting clients and end users in making technical product decisions by clearly explaining trade-offs … and recommended approaches Participating in a community of engineers who share knowledge, run workshops and contribute to the wider engineering culture Looking beyond day-to-day responsibilities to identify small details, opportunities for improvement and added value for clients What we're looking for: UK Developed Vetting (DV) clearance is essential Hands-on experience in software development and a … strong interest in writing quality code Solid understanding of back-end development using one or more of the following: Java, Python, TypeScript or JavaScript Familiarity with good engineering patterns and practices, and the ability to articulate them clearly Experience working in agile environments (Scrum, Kanban or similar) Enthusiastic about learning, collaborating with diverse teams and solving problems creatively Confident
flexible remote working locations within UK/Europe) Employment type: Permanent Working Hours: Full time (9-6 UK) Salary: Up to £110K + Shares + Benefits TransFICC is hiring a Site Reliability Engineer to provide high-performance services to our customers. We develop an integration service … product that enables our clients to have a flexible, hosted service without requiring their internal resources to respond to connectivity challenges across trading venues. You will be joining our SRE team and contributing to TransFICC's automation culture. We are a multi-disciplinary team covering everything from desktop and laptop support to data centre provisioning of servers and vendor network
Docker. 2. Hands-on experience with Java, Cloud AWS. Required Core Skills: • Strong hands-on experience in Java and CI/CD pipelines. • Hands-on Kubernetes and Docker. • Technical Troubleshooting • Requirements Analysis • Site Reliability Engineering • Software Design • Strategic Thinking • Information Technology Trends Technology Advisory • Change and Transformation • Professional Collaboration • Digital and Technology Communication • Business Acumen • Problem Solving Tools • Risk and … Controls Distributed Systems. • To apply software engineering techniques, automation and best practices in incident response
Stratford-upon-avon, Warwickshire, United Kingdom Hybrid / WFH Options
NFU Mutual
Stratford-upon-Avon Office, 35-hour working week, Mon-Fri between 9am-5pm About the role We have an exciting opportunity for an IT Scrum Master to join the Site Reliability Engineering Team within the IT Division on a 12-Month FTC. As a Scrum Master, you'll be responsible for overseeing the needs of the teams … of our internal corporate customers up and running across the business. You will be accountable for the effective delivery and oversight of key products and services delivered in the SRE team, providing delivery leadership to your workstream, acting as the key point of engagement with several teams and business areas. The Scrum Master is part of the delivery team and … Clear and concise verbal communication skills Understanding of Agile development principles Experienced with ITIL methodology (Desirable) Experienced with tooling such as Dynatrace, Grafana, Nexthink (Desirable) Working experience in an SRE, Observability or Tooling function (Desirable) At NFU Mutual, we support an inclusive workplace and value all the differences that make us unique. We celebrate the creativity and innovation that comes
London (City of London), South East England, United Kingdom
Lorien
with SQL and Python Data Visualisation skills with PowerBI, other Automation and Metrics knowledge handy. Proficiency with tools like Jira, Confluence, Excel, and SharePoint Familiarity with Agile, DevOps, and Site Reliability Engineering Excellent communication and stakeholder management skills
the leading platform for brands navigating virtual worlds, and the reliability and scalability of that platform are paramount to our success. Your Focus As a Senior DevOps/SRE/Platform Engineer, you will be a key technical leader responsible for the reliability, scalability, and security of the entire GEEIQ platform. You'll tackle our biggest infrastructure challenges … GitHub Actions to make them faster, more reliable, and more secure. Champion developer productivity by building tools, automating workflows, and reducing friction in the development lifecycle. Observability & Reliability (SRE) Lead the charge on improving our observability strategy. Design and implement a robust monitoring, logging, and alerting framework using tools like Grafana, Prometheus, and native AWS services. Enhance our incident … and manage security controls related to IAM, network security (VPCs, security groups), vulnerability scanning, and secrets management. Skills, Knowledge and Expertise Experience: Extensive hands-on experience in a DevOps, SRE, or Platform Engineering role, managing production systems in a cloud environment. Deep expertise with AWS and its core services (e.g., EKS, RDS, Lambda, EC2, S3, IAM, VPC). Proven
SRE Manager (Traffic and Secure Networking) London, England, United Kingdom Software and Services Description As the Senior Manager you will help shape the future of load balancing working with teams in Canada and the United States. You will have a proven ability to solve problems whether they require a rapid response or a long term strategy. You will know when … Explore and evaluate new technologies and solutions to push our capabilities forward and solve tomorrow's problems, not just today's Minimum Qualifications Demonstrable traffic management experience Demonstrable software engineering or reliability engineering experience Bachelor of Science in relevant engineering disciplines Preferred Qualifications Track record of building and running high-performance teams Advanced experience with programming … deliver results on time with high quality Extensive experience leading customer facing systems in a high uptime 24/7 environment Passionate about leading a global 24/7 engineering organization A consistent track record of automation Lifelong learner who furthers diversity of thought in their approach to management
Woking, Surrey, United Kingdom Hybrid / WFH Options
Experis
Role Title: Site Reliability Engineer (SRE) Duration: 5-month contract Location: Wokingham (Reading). Hybrid, 60% remote and 40% onsite Clearance required: Active SC is essential Key Skills/requirements Detect and mitigate system issues to ensure high availability. Automate operational tasks to improve efficiency and reduce manual intervention. Prepare disaster recovery plans and ensure business continuity. Monitor … Implement CI/CD pipelines for seamless deployment and release management. Ensure compliance with security standards, governance policies, and regulatory requirements. Required Skills & Experience Expertise in software development and engineering for large-scale distributed systems. Strong proficiency in programming languages such as Golang, Java, or Python. Extensive experience with cloud infrastructure providers (AWS, Azure, or GCP). Deep knowledge
function at Birdie; working with the Platform team, you'll develop and expand on our current Kubernetes micro-services architecture, building and maintaining abstractions and services for our Core engineering teams Coaching fellow Platform engineers and Software engineers, working … with the Platform Lead, fellow Staff engineers and our VP of Eng to develop strategy You'll be championing reliability and stability across the Engineering Organisation. Instilling SRE principles within teams and leaving your mark on our product You'll be a key part of our "shift-left" DevOps success, whether it's security best-practices, CI/… and adapting to the feedback you gain from them, working with GitOps Experience working in a security-conscious engineering organisation, with proficiencies in DevSecOps principles Clear understanding of SRE practices, goals and implementations A deep capacity to get things done What are the benefits? People are our core strength. We are social entrepreneurs, boasting an outstanding culture with strong
sector, our technology is truly flexible and designed to transform any business at scale. We've created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can't match. At ZILO, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious … role, drives our progress and creates real impact. If you're ready to shape the future, let's talk. We are seeking an experienced Site Reliability Engineer (SRE) with deep subject-matter expertise in data processing and reporting. In this role, you will own the reliability, performance, and operational excellence of our real-time and batch data … and trace data to pinpoint failure points across AWS, Flink, Kafka, and Python layers. Lead post-incident reviews: identify root causes, document findings, and drive corrective actions to closure. Reliability & Monitoring Design, implement, and maintain robust observability for data pipelines: dashboards, alerts, distributed tracing. Define SLOs/SLIs for data freshness, throughput, and error rates; continuously monitor and optimize.
disciplinary teams to broaden our scope of work: Cloud Platform team - owns and operates a thin central platform for our AWS estate Developer Experience and Finops team - manages core engineering tooling, proactively works to enhance developer practice & experience and ensures value from our SaaS services Engineering Access Operations team - owns and operates identity and access management for our … tooling and services, supporting business impact and agility We are seeking developers and senior developers to join the following teams: Cloud Platform Team (1 Senior, 1 Mid) - working alongside Site Reliability Engineers you'll develop and manage our core AWS cloud platform Developer Experience and Finops team (1 Senior) - you'll build and maintain automations to effectively manage … our core engineering tooling, and act as a custodian and advocate for great engineering practice across GDS Engineering Access Operations team - (1 Senior, 1 Mid) - you'll build and maintain automation supporting our core identity and access management system based on Microsoft Entra ID. All of the roles will involve working across a range of technologies
mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
William Hill PLC
bets per second, accommodate 20 million users, and process 160 terabytes a day. You can be sure there are many challenges waiting for you. The Leeds-based, highly skilled SRE team are primarily managing the Kubernetes clusters within the organisation for multiple departments, and through a DevOps culture enabling those departments with observability and pipelines for their business applications. Their … job is to guarantee system reliability, performance, and supportability with a strong engineering emphasis on building autonomous solutions that deliver value to end-users early, often, and fast. We are also open to candidates that come from a Software Engineering background - As long as you show the willingness to learn, we are more than happy to invest … Storage Platforms, developing any necessary integration Supporting Incidents - Assist Incident Management in Production all the way through impact assessment, service restoration and post-mortems, including being part of the SRE on call rotation Sharing Knowledge - Enabling development teams within the DevOps Culture, promoting best practice, documenting runbooks, presenting talks, working with production engineering teams Who we are looking for