City of London, London, United Kingdom Hybrid / WFH Options
Explore Group
Site Reliability Engineer (Hybrid – London) | RegTech Innovator | AWS, Terraform, Kubernetes Location: London (Hybrid – 2-3 days in office) Are you passionate about scalable infrastructure and modern DevOps practices? Want … to make a tangible impact in a fast-growing RegTech company that’s transforming how businesses navigate regulatory compliance? Join us as a Site Reliability Engineer (SRE) and help build and operate the infrastructure that powers cutting-edge compliance solutions used by global financial institutions. What You'll Do Maintain and improve our AWS-based infrastructure using … Docker, Kubernetes (EKS) CI/CD: GitHub Actions, Argo CD, Helm Monitoring: Prometheus, Grafana, CloudWatch, OpenTelemetry Languages: Python, Bash, Go (bonus) What We're Looking For Strong experience in SRE, DevOps, or Production Engineering roles Proven hands-on skills with AWS, Terraform, and Kubernetes Experience with production support, incident management, and RCA practices Comfortable working in a fast-paced startup
City of London, London, United Kingdom Hybrid / WFH Options
Stott & May Professional Search Limited
Senior Site Reliability Engineer Start: ASAP Duration: 6-12 months Location: hybrid, London (Tuesdays, Thursdays WFH) Pay: negotiable, inside IR35 We're looking for an experienced DevOps Engineer to join our team on a contract basis, with a focus on AWS infrastructure, observability tooling, and CI/CD automation. This is a hands-on role supporting … Python, Bash, Go or SQL - Work with Git-based workflows for infrastructure as code - Troubleshoot Kubernetes workloads and containerised services - Participate in an on-call rotation to ensure system reliability Your Profile Essential: - Solid hands-on AWS experience in a DevOps setting - Background in incident, change, and problem management - Strong with Prometheus, Grafana, Splunk, and PromQL - Proficient in scripting
Site Reliability Engineer, City of London Location: City of London, United Kingdom Job Category: Information Technology EU work permit required: Yes Job Reference: BBBH64028_1750084692 Job Views: 6 Posted: 16.06.2025 Expiry Date: 31.07.2025 Job Description: Site Reliability Engineer Whitehall Resources require a Site Reliability Engineer to work with a key client on a 6 month initial contract. *This role will involve on site work in London 3 days per week. *Inside IR35. *This role will require some on-call work. Site Reliability Engineer The Role As a Site Reliability/DevOps Engineer, you will play a critical role in managing cloud infrastructure, ensuring the reliability of production systems, and improving end-to-end deployment pipelines. This role combines deep operational responsibilities with a strong focus on automation, observability, and continuous improvement. You will be responsible for maintaining high system availability, enabling rapid delivery through CI/
City of London, England, United Kingdom Hybrid / WFH Options
Annapurna
Site Reliability Engineer Location: London … Hybrid (3 days office) Salary Range: Up to £140,000 Annapurna is working on behalf of a pioneering technology company to recruit a Site Reliability Engineer (SRE). This is a unique opportunity to play a vital role in developing cutting-edge AI systems that power autonomous vehicle technology. What to Expect: The SRE will be instrumental … in ensuring the stability, resilience, and efficiency of complex autonomous systems. This is a role for someone who thrives on innovation, loves solving infrastructure and reliability challenges, and wants to play a significant role in shaping the future of AI-driven mobility. Key responsibilities include: Ensuring smooth and continuous operation of autonomous vehicle systems in real-world environments. Developing
for a Site Reliability Engineer to join their highly skilled, innovative team. Essential skills: Strong proficiency in Python for infrastructure and automation Hands-on experience in SRE, DevOps or production engineering roles Deep understanding of monitoring, incident response workflows, and system architecture Productive approach to improving systems and reducing technical debt Strong collaboration and communication skills – working … design and implement automation for operations, deployments, monitoring and incident management, as well as owning the observability stack (metrics, logs, traces and alerting). You will also: apply core SRE principles (SLIs, SLOs, error budgets) to enhance system reliability; build, document, and improve high-performance system designs; lead incident response and implement improvements; collaborate closely with quant developers/
with the subject line: “Application Support Request”. Role: Senior Site Reliability Engineer Location: London Job Type: Permanent Are you looking to take your SRE skills to the next level? We’ve got a great opportunity for you – Senior Site Reliability Engineer Careers at TCS: It means more TCS is a purpose-led transformation … to prevent problems, not just react to them. Partner across teams to make performance, scalability, and user experience part of the whole engineering mindset. The Role As a Senior Site Reliability Engineer, you will be playing a key role in operational support, integration of applications and building and maintaining infrastructure. Your responsibilities: Effectively monitor a wide range … to incidents, and usually taking on-call responsibilities. Your Profile Essential skills/knowledge/experience: Working knowledge and prior hands-on experience using AWS services at the DevOps Engineer level. Previous experience with incidents, change and problem management. Strong background in setup and operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL. Proficient
Working for an industry-leading, high-growth SaaS business with some of the biggest brand names in the world as clients, the Senior Site Reliability Engineer (SRE) will join the global SRE team, working closely with software engineers to build, maintain, and scale resilient systems and provide first line operational support. You’ll be part of a … mission in close collaboration with DevOps team Maintaining and enhancing Engineering Operational Documentation for supported products Providing expertise to build and maintain products operational documentation and setting up product SRE practices Working in close collaboration with SRE team members and Engineering teams based around the world Helping build a strong culture of reliability and performance in their services. … The Senior Site Reliability Engineer will have: Strong experience in SRE, DevOps Engineer or production engineer Experience in Infrastructure as code (IaC) using Terraform Experience in building continuous integration declarative pipelines in Jenkins or CircleCI Experience with platforms like Kubernetes, Containers and public clouds (GCP or AWS) Experience with deployment and monitoring of highly scalable
Head of Site Reliability Engineering (SRE) Are you ready to lead a global SRE and Production Engineering function for a business-critical suite of platforms used by leading players in financial services? My client is hiring a Head of Production Engineering & SRE to drive the reliability, scalability, and performance of infrastructure that supports mission-critical, client-facing … applications across global markets. Experience required: 10+ years of experience in engineering, with 5+ years in a leadership role in SRE, DevOps, or Production Engineering. Proven track record managing reliable, scalable systems in a high-compliance environment (e.g., FinTech, HealthTech). Strong understanding of modern software development lifecycle, CI/CD, IaC, and cloud-native technologies. Expertise in Kubernetes, AWS … observability tools, and automation frameworks. Excellent leadership, communication, and stakeholder management skills. Experience in financial services, especially asset management or investor services. Why Join? Shape and scale a modern SRE function with true global impact Work in a fast-moving, high-accountability, engineering-led culture Lead on strategy, talent, and technology across global teams Competitive package, flexible working, and strong
City of London, London, United Kingdom Hybrid / WFH Options
Unitary
SRE (Unitary AI) Description The company We are a rapidly growing startup developing solutions that blend human expertise and AI agents to handle manual customer and marketplace operations tasks. Our unique approach combines the strengths of human expertise (high accuracy and nuanced decision-making) with the advantages of AI automation (speed and cost efficiency). This cutting-edge technology helps … the beginning of our journey - and we are very excited about our plans for growth over the coming year and beyond! The role We are now looking for a Site Reliability Engineer to ensure our systems run smoothly and reliably at scale. Your expertise in monitoring, observability, and system automation will help maintain the high availability and … people who are happy to get stuck into whatever needs doing, and are ready to learn and grow with the company. For this particular role, we need a collaborative engineer who excels at working across teams and can translate complex technical concepts into actionable solutions. You should be comfortable balancing your time between fixing urgent issues and investing in
Lead Site Reliability Engineer Central London (Hybrid) Up to £95k + Car Allowance & Bonus TRIA are working with a leading hospitality client for a Lead SRE, where they are investing heavily in the performance, stability, and reliability of its digital platforms. This is a hands-on leadership role - you won’t just guide others, you’ll … uptime The stack includes Kubernetes, Terraform, AWS, Python, and modern CI/CD tools, and it's evolving. If you're confident in a crisis, understand what a good SRE practice looks like, and want to leave systems in a better place than you found them, please apply to be considered and learn more! What you’ll bring: Experience in … high-traffic digital or eCommerce platforms 5+ years in SRE/DevOps roles; strong background in incident response Observability, automation, and infrastructure as code expertise Leadership skills - mentoring others or leading from the front
City of London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Junior Site Reliability Engineer, London (City of London) Client: Trust In SODA Location: London (City of London), United Kingdom Job Category: Other EU work permit required: Yes Job Views: 3 Posted: 16.06.2025 Expiry Date: 31.07.2025 Job Description: Are you interested in a role as a Junior Site Reliability Engineer? Join a rapidly scaling InsureTech company making significant impacts in the Premium Finance Space. This role involves a major Site Reliability scale-out using cutting-edge serverless technology. We are looking for a candidate with leadership qualities and experience in: Programming/Scripting (.Net, C#, PowerShell) Infrastructure as Code (Terraform, Ansible) CI/
enabling large corporations to manage complex infrastructure projects, we provide exceptional service while staying at the forefront of cloud technology advancements. Role Description This is a full-time on-site role 3 days a week minimum in Kings Cross London. We are seeking a skilled Site Reliability Engineer with a strong focus on Google Cloud Platform … and respond to cloud incidents using incident.io, ensuring timely resolution. Use JIRA to log, track, and prioritize support tickets and workflow tasks. Monitor and maintain cloud infrastructure for performance, reliability, and security. Collaborate with teams to identify and implement solutions to technical challenges. Assist in deploying, configuring, and optimising GCP resources. Create and maintain documentation for troubleshooting processes and
Site Reliability Engineer, London (City of London) Client: Caspian One Location: London (City of London), United Kingdom Job Category: Other EU work permit required: Yes Job Views: 3 Posted: 16.06.2025 Expiry Date: 31.07.2025 Job Description: Overview This role is critical
City of London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Job Category: Other EU work permit required: Yes Job Views: 3 Posted: 16.06.2025 Expiry Date: 31.07.2025 Job Description: Site Reliability Engineer (SRE) Lead – Observability Location: London (Hybrid, 2 days on site per week) Contract Role Overview: Join a high-impact team where you'll lead and shape the SRE and Observability … function for a major transformation programme. This role goes beyond traditional SRE – you’ll champion best practices across product teams, drive observability strategy, and work hands-on with cutting-edge tools like Datadog and AWS. Key Responsibilities: Lead the SRE function and promote observability-first thinking across development and operations teams. Define and implement the observability roadmap across product domains … and improvements across observability platforms. Partner with engineering squads to deliver on observability requirements in an agile, demand-led way. Core Skills & Experience: Proven experience as a hands-on SRE Engineer. Deep understanding of observability and monitoring practices. Practical experience with Datadog (or similar observability platforms). Strong DevOps toolchain knowledge: GitHub, GitHub Actions, Jenkins, CodeQL, Nexus, CloudFormation, Terraform. Solid
City of London, London, United Kingdom Hybrid / WFH Options
Cpl
Site Reliability Engineer (SRE) Lead – Observability Rate: £450-£475 per day (Inside IR35) Location: London (Hybrid, 2 days on site per week) Contract Role Overview: Join a high-impact team where you'll lead and shape the SRE and Observability function for a major transformation programme. This role goes beyond traditional SRE – you’ll champion best … practices across product teams, drive observability strategy, and work hands-on with cutting-edge tools like Datadog and AWS. Key Responsibilities: Lead the SRE function and promote observability-first thinking across development and operations teams. Define and implement the observability roadmap across product domains in collaboration with the client. Be hands-on with Datadog for infrastructure and application-level monitoring. … and improvements across observability platforms. Partner with engineering squads to deliver on observability requirements in an agile, demand-led way. Core Skills & Experience: Proven experience as a hands-on SRE Engineer. Deep understanding of observability and monitoring practices. Practical experience with Datadog (or similar observability platforms). Strong DevOps toolchain knowledge: GitHub, GitHub Actions, Jenkins, CodeQL, Nexus, CloudFormation, Terraform. Solid
Sr. Site Reliability Engineer (Kubernetes), London (City of London) Client: VeeAR Projects Inc. Location: London (City of London), United Kingdom Job Category: Other EU work permit required: Yes Job Views: 3 Posted: 16.06.2025 Expiry Date: 31.07.2025 Job Description: Minimum Qualifications
City of London, London, United Kingdom Hybrid / WFH Options
RP International
Site Reliability Engineer | Inside IR35 | Remote - UK | 6 Month Contract Our client, a respected multinational consultancy, is hiring a Site Reliability Engineer with expertise in GCP and DevOps tools for a new project in the Communication Sector. Duration: 6 Months + Extensions Location: Remote (Ideally UK Based) Rate: £300-350 p/
Site Reliability Engineer at IOVENDO City Of London, England, United Kingdom Posted: 2 days ago We are working with a London-based insurance company seeking an individual experienced with Azure and GCP. The candidate will assess the resilience and reliability of migrating applications, recommend design improvements, and assist with implementation. The ideal candidate should have: Experience in reliability roles, producing detailed reports on issues, gaps, and risks. Solutions … Mid-Senior level Employment type: Contract Job function: Information Technology Industries: Data Infrastructure and Analytics
Site Reliability Engineer We are working with a London-based insurance company who are looking for someone with experience across Azure and GCP. They are migrating a number of applications to Azure and GCP and need someone who can provide an assessment of the resilience and reliability of their solution, where necessary implement changes to the … and assist with the delivery of any remedies. You will need to possess a background within the insurance market as well as the following: Previous experience working in a reliability role, producing detailed reports on issues, gaps and areas of potential risk. Strong solutions-driven background making recommendations along with expected timeframes to deliver any required improvements. Experience of
Responsibilities (Text Only) Collaborate closely with the existing SRE teams to build and enhance tooling and automation solutions for faster resolution of issues impacting SLOs and to prevent incidents when possible. Work with customers to understand their supportability pain points and SLO attainment challenges, then formulate strategies to address recurring issues sustainably. Serve as the single point of contact for … Product teams to incorporate supportability considerations into feature development. Qualifications (Text Only) In-depth technical experience in software engineering, network engineering, or systems administration. Operational experience in improving Service Reliability, Availability, and Performance. Ability to navigate ambiguity in a fast-paced environment. Systematic problem-solving skills, effective communication, and curiosity. Expertise in analyzing, troubleshooting, and automating root cause analysis
City of London, England, United Kingdom Hybrid / WFH Options
Hunter Bond
Job title: Production Support & Platform Engineer/Application SRE – Integration & Project Management Client: FinTech Salary: £90,000-£170,000 Location: London/Hybrid Skills: Linux, SQL, Python, FIX, SRE, Infrastructure, Project Management, Coordination, Trading The role: My client are looking for a talented individual with a broad skillset to perform a unique role, bridging the gap between Production and … automation for project tracking dashboards This role could suit an individual from a number of backgrounds: Production/Integration Engineer with some Project Management experience Infrastructure/Application SRE with a good understanding of Trade Flow and some Project Management experience An individual who has worked their way up to Management within Production Support/Integration or Infrastructure who