The SiteReliability Engineering (SRE) team at Pendo is responsible for provisioning and maintaining cloud infrastructure from development through production for all product initiatives, and working with developers and product managers to ensure that our products are not only reliable and performant, but also cost-efficient. Our platform … on-call and incident management functions, supporting a high-throughput platform which processes more than 15 billion events per day. To ensure the reliability of this environment for our customers, SREs work closely with developers and product managers to understand service level objectives, think through failures scenarios, and design … systems which balance cost with reliability objectives. Additionally, SREs collaborate with the Information Security team to ensure that cloud infrastructure is properly secured, and that sufficient controls are in place to meet our compliance goals with respect to industry standards such as SOC 2. Role Responsibilities Write high-quality More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
bet365 Group
A SiteReliabilityEngineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability … of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as Open Telemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration … is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user demands More ❯
Manchester Area, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for A SiteReliabilityEngineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor … the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as Open Telemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation … for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Embarcaderomediagroup
SiteReliability & Platform Engineer to help lead the way. You'll sit at the heart of our engineering operations, bringing together SRE principles and modern platform engineering practices. This includes combining principles of SRE - such as service-level reliability, observability, incident response - with platform engineering practices … ship faster, safer, and more cost-efficiently. What you'll be doing: Designing and operating highly reliable, scalable, and secure Azure-based platforms Applying SRE principles like SLOs, observability, and incident management to drive service reliability Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows Enabling teams through … opportunity for someone passionate about building robust infrastructure and enabling others to move faster and more securely. You might come from a cloud engineering, SRE, or DevOps background - what matters most is your curiosity, systems thinking, and drive to improve operational efficiency. At Sorted, we are committed to fostering an More ❯
re Looking For: Basic Required Qualifications: Bachelor's degree in Computer Science, Information Technology, or a related field. 5+ years of experience as a SiteReliabilityEngineer or equivalent in a similar role. Proficient in application and infrastructure observability, Splunk OpenTelemetry preferred Experienced in production environments running … troubleshooting and problem-solving skills with a knack for identifying and resolving complex technical issues Familiarity working in an Agile environment True understanding of SiteReliability Engineering Ability to build and maintain a system and culture that supports and implements SLOs. Familiar with Docker & Kubernetes, specifically EKS & ECS More ❯
Newcastle Upon Tyne, Tyne And Wear, United Kingdom
Sage City
Job Description We are looking for a SiteReliabilityEngineer to join our SRE Enablement team, a specialised function within Cloud Operations focused on building reusable infrastructure, automation, and tools that enable CloudOps and Engineering teams to operate more efficiently. You will have the opportunity to be … a key driver for SRE adoption within Sage, taking the helm in developing scalable frameworks to improve developer experience, remove toil and ultimately focus on embedding SRE best practices within the wider business. If you have experience working with Terraform and modern CI/CD workflows this could be the … also engage with broader teams to help implement these new approaches. You will have oversight of the entirety of Sage's product-suite and SRE teams as you work closely with them to build tools to make them more successful. Please note this is a hybrid role - you will be More ❯
and ensure Morrisons’ applications and infrastructure are resilient, efficient, and aligned with architectural goals. This is a key role for those passionate about advancing SRE practices at enterprise scale. Responsibilities Act as SME within their Domain teams for advice & guidance in terms of CI/CD, automation and product ways … of working and SRE/Engineering standards Drive the adoption of Engineering standards and Continuous Delivery principles within multiple domains The escalation point for SRE/Engineering ways of working Influence good practices and standards within SDLC throughout the business Influence partners Infrastructure best practices Implementation of least privilege approach … strategy and patterns Engineering Tooling, Patterns, Framework and Standards Proprietary code quality management inclusive of technical debt About you Knowledge In depth understanding of SRE/Engineering, Architecture and Testing practices In depth understanding of the principals of CI/CD within SRE/Engineering In depth understanding of Cloud More ❯
and ensure Morrisons’ applications and infrastructure are resilient, efficient, and aligned with architectural goals. This is a key role for those passionate about advancing SRE practices at enterprise scale. Responsibilities Act as SME within their Domain teams for advice & guidance in terms of CI/CD, automation and product ways … of working and SRE/Engineering standards Drive the adoption of Engineering standards and Continuous Delivery principles within multiple domains The escalation point for SRE/Engineering ways of working Influence good practices and standards within SDLC throughout the business Influence partners Infrastructure best practices Implementation of least privilege approach … strategy and patterns Engineering Tooling, Patterns, Framework and Standards Proprietary code quality management inclusive of technical debt About you Knowledge In depth understanding of SRE/Engineering, Architecture and Testing practices In depth understanding of the principals of CI/CD within SRE/Engineering In depth understanding of Cloud More ❯
Altrincham, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
Leigh, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
Bury, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
Leeds, West Yorkshire, United Kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
Bolton, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
Bury, east anglia, united kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯
Ashton-Under-Lyne, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
SiteReliabilityEngineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, Cloudwatch) and telemetry pipelines. Solid programming skills in Python and/or Go More ❯