We are looking for an experienced Site Reliability Engineer (SRE) to join our Technical Operations team within Microlise. Your key responsibilities will include implementing and supporting the Microlise infrastructure. This involves bringing automation and observability to the core infrastructure by applying development principles. Do you have experience … we are looking for: Experience in TechOps, especially with Infrastructure as Code Familiarity with development technologies like C# and SQL, Git Deep understanding of SRE practices and infrastructure monitoring frameworks Knowledge of diverse monitoring tools and requirements Enthusiasm and ability to learn new technologies Excellent investigation and problem-solving skills
re Looking For: Basic Required Qualifications: Bachelor's degree in Computer Science, Information Technology, or a related field. 5+ years of experience as a Site Reliability Engineer or equivalent in a similar role. Proficient in application and infrastructure observability, Splunk OpenTelemetry preferred Experienced in production environments running … troubleshooting and problem-solving skills with a knack for identifying and resolving complex technical issues Familiarity working in an Agile environment True understanding of Site Reliability Engineering Ability to build and maintain a system and culture that supports and implements SLOs. Familiar with Docker & Kubernetes, specifically EKS & ECS
Site Reliability Engineering Professional Posting Date: 1 May 2025 Unit: Networks Location: Snowhill, Birmingham, United Kingdom Why this job matters Professional Services was formed … this by ensuring we have the right people to achieve our high ambitions. We are seeking a passionate Site Reliability Engineer (SRE) to join the unit. This role will specialise in engineering with a curious approach towards automation. Candidates will be required to leverage their knowledge in engineering … new technologies, so a desire to learn is equally as valuable as experience. As part of this role, you'll directly contribute to improving reliability, scalability, and performance of services by driving automation, monitoring, and operational excellence across a number of environments. What you'll be doing Integration: Collaborate
Newcastle Upon Tyne, Tyne And Wear, United Kingdom
Sage City
Job Description We are looking for a Site Reliability Engineer to join our SRE Enablement team, a specialised function within Cloud Operations focused on building reusable infrastructure, automation, and tools that enable CloudOps and Engineering teams to operate more efficiently. You will have the opportunity to be … a key driver for SRE adoption within Sage, taking the helm in developing scalable frameworks to improve developer experience, remove toil and ultimately focus on embedding SRE best practices within the wider business. If you have experience working with Terraform and modern CI/CD workflows this could be the … also engage with broader teams to help implement these new approaches. You will have oversight of the entirety of Sage's product-suite and SRE teams as you work closely with them to build tools to make them more successful. Please note this is a hybrid role - you will be
and ensure Morrisons’ applications and infrastructure are resilient, efficient, and aligned with architectural goals. This is a key role for those passionate about advancing SRE practices at enterprise scale. Responsibilities Act as SME within their Domain teams for advice & guidance in terms of CI/CD, automation and product ways … of working and SRE/Engineering standards Drive the adoption of Engineering standards and Continuous Delivery principles within multiple domains The escalation point for SRE/Engineering ways of working Influence good practices and standards within SDLC throughout the business Influence partners Infrastructure best practices Implementation of least privilege approach … strategy and patterns Engineering Tooling, Patterns, Framework and Standards Proprietary code quality management inclusive of technical debt About you Knowledge In-depth understanding of SRE/Engineering, Architecture and Testing practices In-depth understanding of the principles of CI/CD within SRE/Engineering In-depth understanding of Cloud
eDV Site Reliability Engineer Looking for an eDV SRE. Someone with a defence industry specialism with a passion … for creating efficient and secure cloud infrastructure. You will play a critical part in transforming and enhancing both internal and external operations through effective SRE practices. Core Responsibilities Infrastructure Excellence: Design, manage, and evolve our cloud-based infrastructure to support high-traffic applications and seamless service delivery. Secure Deployment: Develop
As a Senior Site Reliability Engineer at Convera, your role is pivotal in ensuring the stability and resilience of our systems. You'll spearhead our incident management strategy, swiftly identifying and mitigating risks to uphold our service reliability. You will be responsible for: Taking the lead on … architecture, deployment processes, and observability practices. Elevating the customer experience as the ultimate benchmark of our reliability standards. Sharing industry best practices in SRE, ensuring our team remains at the forefront of innovation. Facilitating blameless post-mortems, instituting actionable alerts, and streamlining incident management through automation. You should apply … industries. Familiarity with the Grafana observability stack. Experience in Chaos Engineering methodologies. Your expertise will be instrumental in fortifying our infrastructure and delivering exceptional reliability to our customers. About Convera Convera is the largest non-bank B2B cross-border payments company in the world. Formerly Western Union Business Solutions
Leigh, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
Leeds, West Yorkshire, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
Bolton, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
Altrincham, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
Bury, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
London, South East England, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
London (West End), South East England, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
Ashton-Under-Lyne, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go
London, South East England, United Kingdom Hybrid / WFH Options
MarkJames Search
Job Title: Site Reliability Engineering (SRE) Lead – Observability Location: Stratford, London (Hybrid – 2 days per week onsite) Contract Length: 6 months Rate: £450–£500 per day (Inside IR35) Industry: Financial Services A leading Financial Services organisation in London is seeking a Site Reliability Engineering (SRE) Lead … a hybrid role requiring two days per week onsite at their Stratford, London offices. The role sits Inside IR35. Key Responsibilities: Lead the SRE Observability team and champion observability practices across multiple product groups. Provide thought leadership from the Cognizant delivery team on all things SRE. Leverage hands-on … creation and QA of project-level Observability Plans. Input into and assure the quality of testing strategies and results. Requirements Proven experience in an SRE role with a strong focus on Observability. Expert-level proficiency with DevOps tools including GitHub, GitHub Actions, Jenkins, Nexus, CloudFormation/Terraform, and CodeQL. Extensive
Insight Global is looking for an Operations Site Reliability Engineer to help with global operational support for a leading infrastructure software product company’s customer-facing SaaS products. You will be part of a … team of engineers that demonstrates superb technical competency, operates mission-critical infrastructure and ensures the highest levels of availability (24x7x365), performance and security. This SRE would be part of the critical operations function that is responsible for the monitoring, availability and performance of production services. They would be driving automation
Crawley, England, United Kingdom Hybrid / WFH Options
James Chase
Are you an Azure DevOps/SRE looking for your next opportunity? Are you passionate about ensuring application reliability and performance? Do you thrive in a collaborative, high-impact environment? If yes, this could be your next big opportunity! Our client, a leading provider of financial services, are looking … a permanent basis. Responsibilities: Managing incidents and post-mortems for on-premises and cloud applications. Monitoring performance using modern tools and implementing automation. Driving SRE and DevOps best practices. Supporting releases with minimal downtime. Key Skills & Experience: Experience in SRE, IT operations, software development, or DevOps. Familiarity with CI/… KQL, and incident management. Hands-on experience with YAML pipelines. Experience with Bicep, SolarWinds, Terraform and PowerShell. Want to be part of a growing SRE team driving automation and reliability? Click Apply now or send your CV to chinmaye.ramnath@james-chase.com *This role is hybrid working with one day
The SRE Manager is responsible for leading the Site Reliability Engineering function across Europe, ensuring the reliability, scalability, and performance of critical infrastructure and services. This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader to … impact team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring a regional SRE team focused on continuous improvement and innovation. Key Responsibilities: Technical Leadership Develop deep expertise in the Titanium trading platform to lead and support critical business … ensuring priorities align with business goals and resource capacity. Operational Excellence Champion initiatives that enhance system availability, scalability, and performance. Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery). Cross-Functional Collaboration Partner with Software Engineering, Infrastructure, Operations, Security
Southampton, Hampshire, United Kingdom Hybrid / WFH Options
NICE
production environment by monitoring availability and taking a holistic view of system health Build software and systems to manage platform infrastructure and applications Improve reliability, quality, and time-to-market of our suite of software solutions Measure and optimize system performance, with an eye toward pushing our capabilities forward … Participate in system design consulting, platform management, and capacity planning Create sustainable systems and services through automation and uplifts Balance feature development speed and reliability with well-defined service level objectives Have you got what it takes? 3-6 years of working experience in a similar role, with a … Python, Go, Java, C#) and experience with scripting languages (e.g., Bash, PowerShell). Deep understanding of cloud computing platforms (e.g., AWS), the working and reliability constraints of some of the prominent services (e.g., EC2, ECS, Lambda, DynamoDB etc) Experience with infrastructure as code tools such as CloudFormation, Terraform. Deep
This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader to support platforms worldwide. We are looking for SRE talent with experience in an On-Prem/Datacenter environment. The ideal candidate will bring strong technical leadership, experience in … high-impact team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring an SRE team focused on continuous improvement and innovation. Key Responsibilities: Technical Leadership Develop deep expertise in the Titanium trading platform to lead and support critical business … ensuring priorities align with business goals and resource capacity. Operational Excellence Champion initiatives that enhance system availability, scalability, and performance. Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery). Cross-Functional Collaboration Partner with Software Engineering, Infrastructure, Operations, Security
Cheltenham, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
consolidate their collective expertise and experience into one business, providing technical excellence and exceptional service to their clients. We have teams working both on-site with clients and remotely from home. Location: Hybrid working in Cheltenham with possible 24/7 call out when on rota Security Clearance: Must … practices. Experience building and maintaining robust CI/CD pipelines. Proven experience deploying full-stack solutions to cloud infrastructure. Comprehensive experience in implementing Service Reliability processes. Understanding of agile software development principles and practices, with the ability to collaborate in a fast-paced, evolving environment. Knowledge of or understanding