Our partner, an innovative PaaS company specializing in remote monitoring and network management solutions, is looking for a Site Reliability Engineer to help ensure the reliability, scalability, and performance of critical infrastructure and applications. In this role, you’ll build and maintain highly available systems, support and optimize … job fairs Experience and Education: Bachelor's or higher degree in Computer Science, Information Systems, Information Technology, or a related technical field/experience. 7+ years of experience in Site Reliability Engineering, DevOps, Infrastructure, or related roles. Deep understanding of AWS and its various modules and services. Strong background in Linux administration and troubleshooting. Proven experience in implementing …
Site Reliability Engineer (SRE) - Crypto High-Frequency Trading, Slough
Client: Selby Jennings
Location: Slough, United Kingdom
Job Category: Other
EU work permit required: Yes
Posted: 06.06.2025
Expiry Date: 21.07.2025
Job Description: We are looking for a Site Reliability Engineer (SRE) to help … role ensures our trading systems remain highly available, scalable, and robust, supporting a fast-paced environment. Responsibilities: Develop scalable tools for automation, deployment, and infrastructure management. Enhance system performance, reliability, and efficiency through automation. Manage AWS infrastructure, ensuring smooth configuration and deployment. Implement observability tools for monitoring and debugging. Ensure fault tolerance, redundancy, and high availability of trading systems. … IaC) tools like Terraform or Ansible. Experience in low-latency or high-performance environments. Proactive problem-solving skills and team collaboration. We seek talented engineers who are passionate about automation and reliability, thrive in a high-performance environment, and work across development, trading, and infrastructure teams to optimize system performance.
We are looking for a Site Reliability Engineer (SRE) to help design and build the automation, configuration, and deployment tooling that underpins our high-frequency trading (HFT) platform. This role is at the heart of ensuring our trading systems remain highly available, scalable, and robust, supporting the fast-paced and demanding nature of our environment. What You … ll Be Doing Developing scalable production tools to automate deployment, monitoring, and infrastructure management. Improving system performance, reliability, and efficiency through automation and tooling. Managing AWS-based infrastructure, ensuring seamless configuration and deployment. Implementing observability tools to enhance monitoring, debugging, and performance insights. Ensuring fault tolerance, redundancy, and high availability across critical trading systems. Supporting infrastructure for C++ and …
London (City of London), South East England, United Kingdom
RedCat Digital
Working for an industry-leading, high-growth SaaS business with some of the biggest brand names in the world as clients, the Senior Site Reliability Engineer (SRE) will join the global SRE team, working closely with software engineers to build, maintain, and scale resilient systems and provide first-line operational support. You’ll be part of a … mission in close collaboration with the DevOps team. Maintaining and enhancing Engineering Operational Documentation for supported products. Providing expertise to build and maintain product operational documentation and setting up product SRE practices. Working in close collaboration with SRE team members and Engineering teams based around the world. Helping build a strong culture of reliability and performance in their services. … The Senior Site Reliability Engineer will have: Strong experience as an SRE, DevOps Engineer, or production engineer. Experience in Infrastructure as Code (IaC) using Terraform. Experience in building declarative continuous integration pipelines in Jenkins or CircleCI. Experience with platforms like Kubernetes, containers, and public clouds (GCP or AWS). Experience with deployment and monitoring of highly scalable …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
play a critical role in ensuring system reliability, scalability, and performance across both AWS and Azure environments. This is your opportunity to lead cloud-native transformation and embed SRE best practices into engineering at scale. What you’ll be doing as their Site Reliability Engineer: You’ll be the go-to expert for designing and maintaining … CI/CD pipelines to reduce toil and accelerate deployment frequency. Build observability into everything: own monitoring, alerting, and incident response to minimize MTTR and improve system health. Champion SRE culture and reliability-focused engineering: help shape sustainable engineering practices, SLAs, SLOs, and error budgets. Contribute across the stack with flexibility in tooling: experience with Python, Go, or TypeScript … dental insurance. 25 days annual leave + bank holidays. R&D and personal training budgets. And much more. This is an incredibly rare chance for a seasoned, high-performing SRE to leave your mark on high-impact transformation projects in a business that’s truly committed to doing things the right way.
Site Reliability Engineer - Healthcare Technology
UK | Hybrid | Full-time | Permanent
We're working with a leading healthtech company to find a Site Reliability Engineer to support and optimise the platforms … behind critical clinical systems. This is a hybrid role offering flexibility, technical challenge, and the chance to make a direct impact on healthcare delivery. You'll join a collaborative SRE team focused on maintaining cloud and on-premise environments, improving deployment pipelines, reducing manual work, and supporting project delivery. You'll work closely with internal teams across software development, support … and delivery. Key technologies include: Azure, AWS, and GCP; Kubernetes, Terraform, and Azure DevOps; Linux and Windows Server. We're looking for enthusiastic people with experience in SRE or DevOps roles, particularly in environments using containerised and cloud-based applications. Strong communication skills and the ability to work across teams are essential. Applicants must have the right to live and work in …
Prism Digital
Lead Site Reliability Engineer | Azure, .NET, Kubernetes/Docker | Build the Future of PropTech
Our client, the UK's largest property services group, is on a transformative mission to … the real estate space. Backed by a major financial institution and with a brand-new, tech-committed CEO at the helm, this is a rare opportunity to lead platform reliability across a business that touches millions. This is not just a hands-on role; it’s a leadership opportunity at the centre of a multi-million-pound transformation programme. You … you do not have to have held a previous leadership/management position. You will, however, need the gravitas, hunger, and ability to lead and grow an SRE team. What You’ll Do: Own the operational reliability of a large-scale Azure cloud platform. Drive an automation-first culture using Terraform, Azure CLI, PowerShell, and more. Lead incident …
Head of Site Reliability Engineering (SRE), Slough
Client: O Partners
Location: Slough, United Kingdom
Job Category: Other
EU work permit required: Yes
Posted: 31.05.2025
Expiry Date: 15.07.2025
Job Description: Head of Site Reliability Engineering (SRE) Are … you ready to lead a global SRE and Production Engineering function for a business-critical suite of platforms used by leading players in financial services? My client is hiring a Head of Production Engineering & SRE to drive the reliability, scalability, and performance of infrastructure that supports mission-critical, client-facing applications across global markets. Experience required: 10+ years of … experience in engineering, with 5+ years in a leadership role in SRE, DevOps, or Production Engineering. Proven track record managing reliable, scalable systems in a high-compliance environment (e.g., FinTech, HealthTech). Strong understanding of the modern software development lifecycle, CI/CD, IaC, and cloud-native technologies. Expertise in Kubernetes, AWS (or Azure/GCP), GitOps workflows, observability tools, and …
London (City of London), South East England, United Kingdom
O Partners
Head of Site Reliability Engineering (SRE) Are you ready to lead a global SRE and Production Engineering function for a business-critical suite of platforms used by leading players in financial services? My client is hiring a Head of Production Engineering & SRE to drive the reliability, scalability, and performance of infrastructure that supports mission-critical, client-facing … applications across global markets. Experience required: 10+ years of experience in engineering, with 5+ years in a leadership role in SRE, DevOps, or Production Engineering. Proven track record managing reliable, scalable systems in a high-compliance environment (e.g., FinTech, HealthTech). Strong understanding of the modern software development lifecycle, CI/CD, IaC, and cloud-native technologies. Expertise in Kubernetes, AWS … observability tools, and automation frameworks. Excellent leadership, communication, and stakeholder management skills. Experience in financial services, especially asset management or investor services. Why Join? Shape and scale a modern SRE function with true global impact. Work in a fast-moving, high-accountability, engineering-led culture. Lead on strategy, talent, and technology across global teams. Competitive package, flexible working, and strong …
on our PSL and will contact one of those if we need to. About the role: You will be responsible for managing a small team of experienced DevOps/SRE Engineers to support, maintain, and consistently improve the development estate so the development teams can focus and innovate constantly to drive Zen's business and the teams' priorities. Reporting to … the engineering manager, you must have an excellent track record in a DevOps or, preferably, SRE role and have a genuine interest in developing your line management experience. Zen are committed to investing in and developing you - we believe that leaders have a huge impact on our people and play a vital part in our success. As a service provider … work with exceptional, experienced Lead DevOps engineers and developers, using frameworks, standards, and defined operational processes to make the unfamiliar comfortable and familiar. Over time we'll increasingly leverage SRE principles to further optimize and enhance capabilities - it's a great opportunity to get in at the start and shape this journey. You'll own the relationship with the operations …
Crawley, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Job Opportunity: Site Reliability Engineer (SRE) Are you an Azure DevOps/SRE professional looking for your next opportunity? Do you have a passion for ensuring application reliability and performance? Do you thrive in a collaborative, high-impact environment? If yes, this could be your next big opportunity! Our client, a leading provider of financial services … to join their team on a permanent basis. Responsibilities: Managing incidents and post-mortems for on-premises and cloud applications. Monitoring performance using modern tools and implementing automation. Driving SRE and DevOps best practices. Supporting releases with minimal downtime. Key Skills & Experience: Experience in SRE, IT operations, software development, or DevOps. Familiarity with CI/CD, IaC, Agile, and ITIL … frameworks. Proficiency in Azure Monitor, Application Insights, KQL, and incident management. Hands-on experience with YAML pipelines. Experience with Bicep, SolarWinds, Terraform, and PowerShell. Interested in joining a growing SRE team focused on automation and reliability? Click Apply now or send your CV to email@domain.com. This role offers hybrid working with one day a week in the …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Junior Site Reliability Engineer, Slough
Client: Trust In SODA
Location: Slough, United Kingdom
Job Category: Other
EU work permit required: Yes
Posted: 16.06.2025
Expiry Date: 31.07.2025
Job Description: Are you interested in a Junior Site Reliability Engineer role? Join an exciting InsureTech … days annual leave, and flexible working arrangements. Our client is a rapidly scaling InsureTech business making a significant impact in the Premium Finance space. They are looking for a Junior SRE to help scale their infrastructure using cutting-edge serverless technologies. The ideal candidate will have experience with: Programming/Scripting (.NET, C#, PowerShell); Infrastructure as Code (Terraform, Ansible); CI/CD tools …
London (City of London), South East England, United Kingdom
TRIA
Lead Site Reliability Engineer | Central London (Hybrid) | Up to £95k + Car Allowance & Bonus
TRIA are working with a leading hospitality client to find a Lead SRE; the client is investing heavily in the performance, stability, and reliability of its digital platforms. This is a hands-on leadership role - you won’t just guide others, you’ll … uptime The stack includes Kubernetes, Terraform, AWS, Python, and modern CI/CD tools, and it's evolving. If you're confident in a crisis, understand what good SRE practice looks like, and want to leave systems in a better place than you found them, please apply to be considered and learn more! What you’ll bring: Experience in … high-traffic digital or eCommerce platforms. 5+ years in SRE/DevOps roles; strong background in incident response. Observability, automation, and infrastructure-as-code expertise. Leadership skills - mentoring others or leading from the front.
Crawley, England, United Kingdom Hybrid / WFH Options
Manor Royal Business District
Vacancy Name: Site Reliability Engineer
Vacancy No: VN1607
Employment Type: Full-Time
Primary Work Location: People's Partnership - Manhattan Building, Crawley
Description: Site Reliability Engineer About People’s Partnership: At the heart of our not-for-profit organisation is a commitment and a motivation to make the future-saving experience a simple one for … to reduce toil, and improve availability, reliability, security, and velocity. Maintain effective feedback loops so that findings can be prioritised and acted upon in a timely fashion. Follow SRE and DevOps core principles to drive adoption and utilisation. What we’re looking for: Strong background in one or more of the following areas: SRE/application support/IT …/software development/DevOps. Experience working within both Agile and ITIL frameworks. Experience working with DevOps principles and concepts such as CI/CD and IaC. Experience of SRE environments and processes, specifically in the areas of availability, incident management, and monitoring. Knowledge of scripting languages and desired state configuration tools such as Bicep, Terraform, and PowerShell. Experience using …
London (City of London), South East England, United Kingdom
WALT Labs
enabling large corporations to manage complex infrastructure projects, we provide exceptional service while staying at the forefront of cloud technology advancements. Role Description: This is a full-time role, on site a minimum of 3 days a week in King's Cross, London. We are seeking a skilled Site Reliability Engineer with a strong focus on Google Cloud Platform … and respond to cloud incidents using incident.io, ensuring timely resolution. Use Jira to log, track, and prioritize support tickets and workflow tasks. Monitor and maintain cloud infrastructure for performance, reliability, and security. Collaborate with teams to identify and implement solutions to technical challenges. Assist in deploying, configuring, and optimising GCP resources. Create and maintain documentation for troubleshooting processes and …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Site Reliability Engineer (SRE) Lead – Observability
Location: London (Hybrid, 2 days on site per week)
Contract
Role Overview: Join a high-impact team where you'll lead and shape the SRE and Observability function for a major transformation programme. This role goes beyond traditional SRE – you’ll … champion best practices across product teams, drive observability strategy, and work hands-on with cutting-edge tools like Datadog and AWS. Key Responsibilities: Lead the SRE function and promote observability-first thinking across development and operations teams. Define and implement the observability roadmap across product domains in collaboration with the client. Be hands-on with Datadog for infrastructure and application … and improvements across observability platforms. Partner with engineering squads to deliver on observability requirements in an agile, demand-led way. Core Skills & Experience: Proven experience as a hands-on SRE. Deep understanding of observability and monitoring practices. Practical experience with Datadog (or similar observability platforms). Strong DevOps toolchain knowledge: GitHub, GitHub Actions, Jenkins, CodeQL, Nexus, CloudFormation, Terraform. Solid …