Site Reliability Engineer (Datadog), London. JLL is seeking a Site Reliability Engineer to support and administer the Datadog monitoring platform. The role focuses on ensuring the reliability, scalability, and efficiency of Datadog for monitoring and AIOps within the organization. The primary goal is … to maximize the availability and performance of applications, infrastructure, and network services, while improving overall system stability. The engineer will collaborate with multiple internal and external teams, requiring critical thinking, adaptability, and the ability to manage multiple priorities. Responsibilities: Establish monitoring solutions to track system health and performance, setting … Configure AIOps capabilities for noise reduction, event correlation, and root cause analysis. Experience & Education: At least 5 years of experience as an Observability or Site Reliability Engineer supporting networks, infrastructure, and applications. Experience developing Ansible scripts for automation. Experience supporting and administering AIOps platforms (e.g., Watchdog, Moogsoft …
Senior Site Reliability Engineer - Reuters. The Reuters Professional DevOps team is a global squad with members from over five countries. Our work reflects on which is a source of real-time, nonpartisan information on world events, trends and culture. The DevOps team takes a factory approach to … mitigates attacks and helps our customers stay informed wherever they are. Intrigued by a challenge? The Reuters Professional DevOps team is looking for an experienced engineer who is passionate about automation and scalability to work from our London office. About the Role: As a Senior Site Reliability Engineer at Reuters, you will: Work with a global team, responsible for the infrastructure powering and other products. Architect, diagram, document and implement highly scalable solutions for our clients that are resilient, cost-effective, and secure. Plan and implement AWS Cloud infrastructure in Terraform and other IaC products …
London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Senior Site Reliability Engineer (SRE) - London (City of London) Location: Central London (Hybrid - 1-2 days per week) Salary: £80,000 - £100,000 + benefits Why Apply? This is a fantastic opportunity for a seasoned Senior Site Reliability Engineer to take a lead role … Terraform/OpenTofu, Ansible, and scripting languages such as PowerShell or Python. Mentor junior team members and promote best practices. Requirements: Extensive experience in SRE or DevOps in high-availability, cloud-native environments. Strong AWS expertise (EKS, MSK, RDS, VPC, encryption, IAM). Experience with Kubernetes and Argo CD in …
London, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
Job Description: Site Reliability Engineer (SRE) - Kubernetes, Observability, Prometheus, Dynatrace, OpenTelemetry. Role Overview: This is a fantastic opportunity with a consulting company seeking to fill multiple SRE roles. You will play a key role in managing client platforms with a strong emphasis on observability and Kubernetes expertise. … visits will be required for meetings, which will be fully funded. Requirements: Minimum of 2 years' commercial experience in a Platform/DevOps/SRE role. At least 6 months' experience specifically as a Site Reliability Engineer (SRE). Solid experience with Observability tools such as Prometheus and … Strong exposure to Kubernetes. Must have resided in the UK for over 5 years to obtain Security Clearance. Salary & Benefits: The salary for the SRE roles is negotiable based on experience, with an expected starting point of £55,000 basic, along with an excellent benefits package and comprehensive training opportunities.
Job Description Job Title: Senior Site Reliability Engineer (SRE) Location: London, UK – Onsite (5 days/week) Employment Type: Permanent Salary: Up to £80,000 per annum (Gross) About the Role: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to … join our London-based team. This role is ideal for someone passionate about service reliability, scalability, and performance. As an SRE, you will collaborate with development and operations teams to automate infrastructure, enhance observability, and reduce manual processes (toil) to improve overall system health. Key Responsibilities: Design, build, and … Qualifications: Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience). 8+ years of relevant experience in SRE, DevOps, or Infrastructure Engineering roles.
London, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
Job Description Job Title: Senior Site Reliability Engineer (SRE) Location: Central London (Hybrid - c. 1-2 days per week) Salary: £80,000 - £100,000 + benefits Why Apply? This is a fantastic opportunity for a seasoned Senior Site Reliability Engineer to take a lead … most innovative businesses in their market. Working with cutting-edge technology, this role offers high-impact challenges, meaningful collaboration, and excellent career progression. Senior SRE Responsibilities: Manage and optimise cloud infrastructure to ensure scalability, high availability, and security. Design and implement robust CI/CD pipelines for efficient product delivery. … tools like GitLab CI, Terraform/OpenTofu, Ansible, and scripting such as PowerShell or Python. Champion infrastructure best practices and mentor junior team members. Senior SRE Requirements: Extensive experience in SRE or DevOps roles within high-availability, cloud- environments. Strong expertise with AWS (including EKS, MSK, RDS, VPC design, encryption, and …
Site Reliability Engineer - Healthcare Technology Location: Northampton, United Kingdom Job Type: Full-time, Permanent Work Arrangement: Hybrid (office and remote) Client: iO Associates - UK/EU Job Category: Other EU Work Permit Required: Yes Job Views: 2 Posted Date: 06.06.2025 Expiry Date: 21.07.2025 Job Description: Site … clinical systems. This role offers flexibility, technical challenges, and the opportunity to make a significant impact on healthcare delivery. You will join a collaborative SRE team focused on maintaining cloud and on-premise environments, improving deployment pipelines, reducing manual work, and supporting project delivery. You will work closely with internal … teams across software development, support, and delivery. Key technologies include: Linux and Windows Server. Applicants should have experience in SRE or DevOps roles, especially in environments using containerised and cloud-based applications. Strong communication skills and the ability to work across teams are essential. Applicants must have the right to …
break into the world of corporate and government consultancy. We’re recruiting for a proactive and technically skilled Site Reliability Engineer (SRE) with a strong automation mindset and previous DevOps experience to join our customers team. This is a hands-on role, supporting critical customer platforms and … tools. Troubleshoot and resolve complex system and application issues in production environments. Design, implement, and maintain automation processes to improve operational efficiency and system reliability. Collaborate closely with development and operations teams to ensure seamless platform integration and performance. Maintain and support automation products used by our customers. Evaluate …
Site Reliability Engineer - Multi Cloud, Glasgow Client: iO Associates - UK/EU Location: Glasgow, United Kingdom Job Category: Other EU work permit required: Yes Job Views: 2 Posted: 06.06.2025 Expiry Date: 21.07.2025 Job Description: Site Reliability Engineer - Healthcare Technology UK | Hybrid | Full-time | Permanent … is a hybrid role offering flexibility, technical challenge, and the chance to make a direct impact on healthcare delivery. You'll join a collaborative SRE team focused on maintaining cloud and on-premise environments, improving deployment pipelines, reducing manual work, and supporting project delivery. You'll work closely with internal … teams across software development, support, and delivery. Key technologies include: Linux and Windows Server. We're looking for enthusiastic people with experience in SRE or DevOps roles, particularly in environments using containerised and cloud-based applications. Strong communication skills and the ability to work across teams are essential. Applicants must …
Wakefield, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Site Reliability Engineer - Multi Cloud, Wakefield Client: iO Associates - UK/EU Location: Wakefield, United Kingdom Job Category: Other EU work permit required: Yes Job Views: 2 Posted: 06.06.2025 Expiry Date: 21.07.2025 Job Description: Site Reliability Engineer - Healthcare Technology Location: UK | Hybrid | Full-time … systems. This hybrid role offers flexibility, technical challenges, and the opportunity to make a direct impact on healthcare delivery. You will join a collaborative SRE team focused on maintaining cloud and on-premise environments, improving deployment pipelines, reducing manual work, and supporting project delivery. You will work closely with internal … teams across software development, support, and delivery. Key technologies include: Linux and Windows Server. We are looking for enthusiastic individuals with experience in SRE or DevOps roles, particularly in environments using containerised and cloud-based applications. Strong communication skills and the ability to work across teams are essential. Applicants must …
London, England, United Kingdom Hybrid / WFH Options
DeepL
say about life at DeepL on LinkedIn, Instagram and our Blog. Meet the team behind this journey: This exciting opportunity is open within our SRE & Platform Unit. The SRE & Platform Unit is responsible for delivering a seamless, Kubernetes-based platform that supports hybrid deployment across self-hosted and cloud environments. Consisting … operate, and scale applications reliably. The unit is composed of two specialized teams: one focused on Platform Engineering and Kubernetes, and the other on SRE and Cloud Infrastructure. Together, they manage core services from the compute platform over databases and other tools to incident response. Anything from CI to production … champion technical strategy that supports both innovation and stability at scale. Qualities we look for: Proven experience in a leadership role managing Platform/SRE/DevOps teams. Hands-on engineering background running on-premise or cloud-based open source solutions. Strong understanding of open source solutions such as Flatcar …
Liverpool, England, United Kingdom Hybrid / WFH Options
Bellrock Group
Site Reliability Engineer - Liverpool (Hybrid Working) As a Site Reliability Engineer at Concerto (part of Bellrock Group), you will play a pivotal role in ensuring the reliability, performance, and scalability of our Intelligent Assets Management SaaS platform. You will lead the improvement of … across our systems, empowering the engineering team to release features faster and more safely. Your hands-on experience and strategic thinking will help embed SRE principles throughout the team, improving customer experience, system health and developer productivity. You’ll work across internal environments and customer-facing systems, shaping operational excellence … and reliability at every level. Key responsibilities: Act as a technical leader, mentoring engineers and playing a key role in shaping the platform roadmap. Hands-on development of CI/CD pipelines using GitHub Actions and Octopus Deploy to optimise release quality and efficiency, moving away from TeamCity. Lead …
Site Reliability Engineer (SRE) Remote (UK) £85,000 – £105,000 (DoE) We’re a growing FinTech scale-up and we’re on the lookout for an experienced Site Reliability Engineer to join our remote-first engineering team. Things are moving fast here, and as … we continue to grow, reliability, automation, and scalability have never been more important to us. You will be our first SRE, so a strong background in implementing SRE best practices would be ideal. You will know what good looks like and strive to continuously improve automation, availability and resilience. … tooling using AWS, Terraform, Docker, and CI/CD pipelines. Supporting and evolving our container-based architecture (we use ECS and Fargate). Driving SRE best practices: SLIs/SLOs, error budgets, reducing toil, and improving observability. Using (and hopefully enjoying!) tools like Datadog, Prometheus, Grafana, and Nix to support …
Job Description Job Title: Site Reliability Engineer (SRE) – High-Frequency Trading Infrastructure Location: Onsite – New York City, London, or Singapore. Our client, a leading high-frequency trading firm, is seeking a Site Reliability Engineer (SRE) to architect and build next- production tools and infrastructure … , scalability, and performance in one of the most competitive and technologically advanced industries. About the Role: This opportunity is ideal for an experienced SRE who thrives in production-critical environments. The successful candidate will join a high-caliber team of engineers and work on automating, scaling, and securing systems … that drive global trading operations. Key Responsibilities: Design and develop scalable production tools for deployment, monitoring, and infrastructure automation. Ensure the reliability and efficiency of trading systems through proactive automation and tooling. Collaborate with developers and traders to support the live trading environment. Manage and optimize configuration and deployment …
Site Reliability Engineer, Burton upon Trent Client: Halian Location: Burton upon Trent, United Kingdom Job Category: Other EU work permit required: Yes Job Views … Posted: 10.06.2025 Expiry Date: 25.07.2025 Job Description: Halian Technology is looking for a talented and driven Site Reliability Engineer (SRE) to join our growing technology team. In this role, you’ll ensure the reliability, scalability, and performance of our digital platforms that support memorable customer … standards. Optimise system resources for both performance and cost-effectiveness. Contribute to incident response and participate in on-call rotations. Track and improve key SRE metrics such as error rates, incident count, and monitoring coverage. What You’ll Bring: 3+ years of experience in Site Reliability Engineering, DevOps …
make a meaningful impact. See more about our culture on . Role Summary: We are seeking a Lead Site Reliability Engineer (SRE) to drive our infrastructure team in their mission to build a reliable, fault-tolerant and scalable infrastructure. You will be responsible for ensuring the reliability … environments and improving how our customers interact with our core products. Reporting line: Head of Engineering Location: What you will do: As a Lead Site Reliability Engineer, you balance team supervision, project management, and day-to-day operations on production systems with long-term software engineering improvements to … across the team • Contribute to open-source projects, research publications, blog articles and conferences About you • 10+ years of experience in a DevOps/SRE role. • Experience with building and leading high-performing teams. • Experience with cloud computing and highly available distributed systems • Exposure to site reliability issues …
the job poster from Xcede. A technology-focused, multi-strat investment firm, operating at the cutting edge of their industry, is looking for a Site Reliability Engineer to join their highly skilled, innovative team. Essential skills: Strong proficiency in Python for infrastructure and automation. Hands-on experience in SRE, DevOps or production engineering roles. Deep understanding of monitoring, incident response workflows, and system architecture. Productive approach to improving systems and reducing technical debt. Strong … operations, deployments, monitoring and incident management, as well as owning the observability stack (metrics, logs, traces and alerting). You will also: apply core SRE principles (SLIs, SLOs, error budgets) to enhance system reliability; build, document, and improve high-performance system designs; lead incident response and implement improvements; collaborate …
line: “Application Support Request”. Role: Senior Site Reliability Engineer Location: London Job Type: Permanent Are you looking to take your SRE skills to the next level? We’ve got a great opportunity for you: Senior Site Reliability Engineer. Careers at TCS: It means … react to them. Partner across teams to make performance, scalability, and user experience part of the whole engineering mindset. The Role: As a Senior Site Reliability Engineer, you will be playing a key role in operational support, integration of applications and building and maintaining infrastructure. Your responsibilities … on-call responsibilities. Your Profile: Essential skills/knowledge/experience: Working knowledge and prior hands-on experience using AWS services at the DevOps Engineer level. Previous experience with incidents, change and problem management. Strong background in setup and operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk …
London (City of London), South East England, United Kingdom
Tata Consultancy Services
line: “Application Support Request”. Role: Senior Site Reliability Engineer Location: London Job Type: Permanent Are you looking to take your SRE skills to the next level? We’ve got a great opportunity for you: Senior Site Reliability Engineer. Careers at TCS: It means … react to them. Partner across teams to make performance, scalability, and user experience part of the whole engineering mindset. The Role: As a Senior Site Reliability Engineer, you will be playing a key role in operational support, integration of applications and building and maintaining infrastructure. Your responsibilities … on-call responsibilities. Your Profile: Essential skills/knowledge/experience: Working knowledge and prior hands-on experience using AWS services at the DevOps Engineer level. Previous experience with incidents, change and problem management. Strong background in setup and operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk …
Swindon, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Site Reliability Engineer, Swindon, Wiltshire Client: Harrington Starr Location: Swindon, Wiltshire, United Kingdom Job Category: Other EU work permit required: Yes Job Views: 8 Posted: 04.06.2025 Expiry Date: 19.07.2025 Job … Description: Site Reliability Engineer – Fintech Up to £85,000 | Fully Remote (UK Only) We’re working with a forward-thinking technology company that’s helping to transform how global financial transactions are monitored and managed. Their platform is used by some of the world’s leading financial … and reducing operational noise. Working with AWS (EKS, EC2, Lambda, RDS), Terraform, and CI/CD tools. What They’re Looking For: Experience in SRE or DevOps roles in a production environment. Strong knowledge of observability tools, especially Prometheus in AWS. Experience with tracing, metrics, and logs to support development …
Senior Site Reliability Engineer, Production Engineering. Please note that we have a hybrid approach to work and would like to find someone who can come into our offices in London at least one day a week. Who We Are: Cisco ThousandEyes is a leading Digital Experience Assurance … within Cisco’s Networking, Security, Collaboration, and Observability portfolios. About The Role: We are seeking a skilled Senior Site Reliability Engineer (SRE) in Production Engineering with a strong background in SaaS and operations. You will design and manage large-scale, highly available distributed systems in the cloud … collaborating directly with application development teams to enhance the reliability, performance, and security of our platform. What You’ll Do: Collaborate with software engineers to optimize architecture and services for availability, latency, performance, and reliability using cloud-native tools. Design and implement scalable operations tooling to support platform …
Sheffield, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Site Reliability Engineer, Sheffield, South Yorkshire Client: Durlston Partners Location: Sheffield, South Yorkshire, United Kingdom Job Category: Other EU work permit required: Yes Job Views: 4 Posted: 31.05.2025 Expiry Date: 15.07.2025 Job Description: Senior Site Reliability Engineer | Remote (EU/UK) | High-Performance Trading. A leading trading firm operating at scale in the digital asset space is hiring a Senior Site Reliability Engineer to help scale, secure, and optimise its global trading infrastructure. This …