betting products. Platform Engineer Responsibilities Build and maintain robust, secure and scalable cloud and on-prem infrastructure Drive improvements across observability, monitoring, and alerting Collaborate with development, architecture and SRE teams to support modern platform capabilities Contribute to infrastructure decision-making, balancing long-term goals with practical delivery Play an active role in incident response, root cause analysis, and continuous …
are a significant plus: Kubernetes knowledge and operating experience. Experience with big data stack components like Hadoop, Spark, Kafka, NiFi. Experience with data science/data analysis. Knowledge of SRE/DevOps stacks - monitoring/system management tools (Prometheus, Ansible, ELK). Version control using git. A day in your life as a ClickHouse Solutions Engineer may include any or all …
Crawley, England, United Kingdom Hybrid / WFH Options
James Chase
Engineer with a passion for leadership and AWS innovation? We’re partnering with a high-growth technology company that is seeking a Site Reliability Engineering (SRE) Team Lead/Technical Lead to join their world-class engineering function. This is not your average technical leadership role: you’ll be driving strategic reliability initiatives, shaping … practices, and leading a team of talented SREs committed to automation, scalability, and operational excellence. What You’ll Be Doing: Lead, coach, and grow a high-performing DevOps/SRE team. Define and execute the SRE strategy to support scalability, performance, and resilience across critical systems. Own and evolve the AWS infrastructure – think EC2, RDS, ECS, Fargate, IAM, VPC and … and Datadog. Act as a technical mentor and thought leader within both your team and the broader engineering organisation. What We’re Looking For: Proven leadership experience within SRE, DevOps, or Infrastructure teams. Hands-on mastery of AWS services and cloud-native design patterns (microservices, containers, serverless). Proficient in Ansible (Terraform knowledge is a strong advantage). Strategic …
and future states of the organisation and make faster, more informed decisions. The company is headquartered in London, with offices in Philadelphia, The Hague, Toronto, and Sydney. Role: Principal Site Reliability Engineer You will be a senior technical leader focused on scaling and hardening our AWS- and Kubernetes-based infrastructure. You will collaborate across product, platform, and operations … expertise, excellent communication skills, and a collaborative spirit. Responsibilities: Define and enforce SLOs, SLIs, and error budgets across critical services Develop and implement cloud infrastructure and tooling strategies Enhance SRE practices across the organization Implement robust observability metrics, logs, and traces using our observability tools Guide the team in building automated, self-healing systems Own and evolve incident response processes … security, DevOps, and software teams to ensure compliance and operational excellence Evaluate and adopt tools and practices to improve platform performance and reliability Desired Skills & Experience: Experience leading SRE transformations Hands-on expertise with Kubernetes (EKS preferred) in production Strong experience with AWS core services (EC2, EKS, RDS, S3, ALB/NLB, IAM, CloudWatch, etc.) Proficiency in Infrastructure as …
London, England, United Kingdom Hybrid / WFH Options
Orgvue
future states of the organisation and make faster, more informed decisions. The company is headquartered in London, with offices in Philadelphia, The Hague, Toronto, and Sydney. As a Principal Site Reliability Engineer, you will be a senior technical leader focused on scaling and hardening our AWS- and Kubernetes-based infrastructure. You will work across product, platform, and operations … you will: Define and enforce SLOs, SLIs, and error budgets across critical services Craft and implement a cloud infrastructure and tooling strategy Work across our organization to level up SRE practices Help implement robust observability metrics, logs & traces using our observability tools Guide the team in building automated, self-healing systems Own and evolve our incident response processes, including on … compliance, scalability, and operational excellence Evaluate and introduce tools, patterns, and practices that improve the performance and reliability of our SaaS platform Desired Skills & Experience: Demonstrable experience leading SRE transformations Deep hands-on expertise with Kubernetes (EKS preferred) in production environments Strong experience with AWS core services (EC2, EKS, RDS, S3, ALB/NLB, IAM, CloudWatch, etc.) Expertise in …
Edinburgh, Scotland, United Kingdom Hybrid / WFH Options
Canonical
software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers, and industry leaders in many sectors. The company is a pioneer of global distributed … times yearly in person, in interesting locations around the world, to align on strategy and execution. The company is founder led, profitable and growing. We are hiring a Senior Site Reliability Engineer. Next-gen operations at scale, with pure Python infra-as-code, from bare metal to containers and applications. Our goal is to perfect enterprise infrastructure devops. … s possible with automation by embracing a universal operator pattern and model-driven operations. To succeed in this role you need to believe in automation as a pure software engineering problem, not a hack-it-till-it-works-for-me problem. You need to be interested in the scientific approach to operations at scale, driven by metrics and code …
Honolulu, Hawaii, United States Hybrid / WFH Options
OMW Consulting
Role - Site Reliability Engineer Location - Honolulu - Hybrid - 1-2 days a week on site Security … clearance - Minimum Secret - need this ahead of applying Salary - $150k-$200k + Equity I am partnered with a leading defense tech scale-up who are looking to add an SRE to their team based in Hawaii. This role is hybrid with an expectation of 1-2 days on site in Honolulu; however, there are some weeks when you will … not need to go on site at all. Due to the nature of the client you must hold an active Secret clearance as a minimum ahead of applying for this position. To be considered for this position you must have experience with the following: Experience with Security Clearance and DoD IT Environment: You hold an active security clearance, are …
Global Site Reliability Engineer Location: London About Us Founded in 2013, GSR is a leading market maker and programmatic trading firm in the fast-evolving world of cryptocurrency trading. With over 200 employees across seven countries, we provide billions of dollars in liquidity daily to cryptocurrency protocols and exchanges. We build long-term relationships with crypto communities and … at GSR is an opportunity to be deeply embedded in every major sector of the cryptocurrency ecosystem. About the Role We are seeking a Site Reliability Engineer (SRE) to design, optimize, and support highly available systems across our global trading infrastructure. As part of GSR's SRE team, you will manage a multi-regional cloud environment while integrating … work across all layers of infrastructure, including: Networking & Exchange Connectivity Linux Systems & Kubernetes Administration Microservice Orchestration & Observability Disaster Recovery & Security Optimization Your mission is to improve latency, scalability, and reliability, ensuring GSR remains a best-in-class market maker. We value engineers who drive automation, reduce friction, and enhance developer velocity through better tooling, CI/CD, and infrastructure …
Reading, England, United Kingdom Hybrid / WFH Options
Oracle
Who are we? We are a world-class team of high-calibre security tool services Site Reliability Engineers. We are an inclusive and diverse team with a full spectrum of experience distributed globally. We have the resources of a large enterprise and the energy of a start-up, working on a critical greenfield software assurance project collaboratively with … our cloud and mobile engineering teams. The Software Assurance organisation has the mission to make application security and software assurance, at scale, a reality. We are a dedicated team, leveraging each other’s insights and abilities to produce cutting-edge solutions to difficult problems through automation and CI/CD. Join us to grow your career and create the … scale together. Work You’ll Do: Learn and shape the newest industry trends and technologies Communicate and coordinate with external teams for release management, product management and engineering requirements within a globally distributed team Design, develop, implement and operate a third-party artifact repository Evaluate and improve the security of the repository Performance tune software application security …
London, England, United Kingdom Hybrid / WFH Options
GSR
at GSR is an opportunity to be deeply embedded in every major sector of the cryptocurrency ecosystem. About the Role We are seeking a Site Reliability Engineer (SRE) to design, optimize, and support highly available systems across our global trading infrastructure. As part of GSR’s SRE team, you will manage a multi-regional cloud environment while integrating … IaC). You will work across all layers of infrastructure, including: Networking & Exchange Connectivity Microservice Orchestration & Observability Disaster Recovery & Security Optimization Your mission is to improve latency, scalability, and reliability, ensuring GSR remains a best-in-class market maker. We value engineers who drive automation, reduce friction, and enhance developer velocity through better tooling, CI/CD, and infrastructure …
London, England, United Kingdom Hybrid / WFH Options
Client Server
Lead Site Reliability Engineer SRE Java - FinTech. Lead Site Reliability Engineer/SRE (Java), London/WFH, to £130k. Are you a Site Reliability Engineering technologist with a Java software engineering background seeking a role where you can make the technology choices, influence strategy and remain hands-on … has been consistently voted as one of the UK's top employers. As a Lead Site Reliability Engineer you will focus on improving and raising the bar for SRE operations across the firm. You will establish SLOs, leveraging public cloud, containerisation, reliability testing and observability, liaising with business stakeholders to establish the product roadmap and providing technical leadership to …
London, England, United Kingdom Hybrid / WFH Options
xAI
Site Reliability Engineer (SRE) - grok.com & API London, UK About xAI xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We …
MCS Group is working with one of their closest clients as they seek to appoint a Site Reliability Engineer to their growing team. An award-winning business which has seen exponential growth over the last 2 years off the back of their transformative technology being utilised by organisations across the UK and Ireland and beyond. They've grown … required. Strong knowledge of Linux, Windows, and IP networking, covering routing, DNS, firewalls, and load balancing. Commercial experience with Docker, Kubernetes, and container orchestration. Familiarity with Elasticsearch. Understanding of SRE principles, DevOps, and DevSecOps methodologies. Strong problem-solving skills, attention to detail, and the ability to work autonomously. Full right to work in Ireland or UK. The client is unable …
Altinity is looking for a great Cloud Service Site Reliability Engineer to work on ClickHouse, the hottest analytic database on the planet. ClickHouse now has more contributors than ElasticSearch, previously the biggest open-source analytic project on GitHub. We're looking to hire even more. Altinity is a distributed company that values employees, open-source, and doing the right … things for customers. As a Cloud Service Site Reliability Engineer you will be helping us build out Altinity.Cloud, an enterprise-ready cloud service for managed ClickHouse. Here's how to tell if you fit: You are interested in all things cloud and understand the plumbing that makes cloud applications work. You know how to deploy and operate public … facing, container-based services. You work easily with remote engineers. You have outstanding skills in site operation, including: Proven operational skills on Kubernetes and public clouds including AWS Native fluency in Golang with Python a plus Outstanding knowledge of networking (including DNS, load-balancers, peering), storage, and compute Experienced at automating service deployment using CI/CD pipelines and …
London, England, United Kingdom Hybrid / WFH Options
IG Group
Lead an engineering-first culture, embedding reliability into the heart of our IT department! As SRE Team Leader, you'll establish SLOs as foundational principles, leveraging public cloud, containerization, reliability testing, and observability to drive continuous improvement in the customer experience. Be at the forefront of building resilient … people all over the world. Join us for an exciting future and let’s innovate together! Your role in the Team The Site Reliability Engineer Team Lead (SRE Team Lead) will manage a team of SREs to proactively ensure the stability, resilience and scale of our services through automation, testing and engineering. The SRE Lead will work … to ensure technical solutions are aligned to IG's architectural principles, designs, and NFRs, that deliver value to our customers as well as ensure consistent monitoring, logging and alerting. The SRE Lead reports to the Head of Platform Engineering and is responsible for building capability and maturing operational ways of working across multiple cross-function delivery teams, with a focus …
City of London, London, United Kingdom Hybrid / WFH Options
Ocean Red Partners
Platform Engineering Lead Overview 🏢 Company | Global Financial Services 👤 Position | Platform Operations Lead 🎯 Impact | Reliability, Resilience, Observability, and Automation 📏 Size | 1000+ employees globally cross-asset trading 🌍 Location | London 💻 Hybrid | 2-3 days in the office 💰 Offer | Up to £120,000 + bonus + package Why this role? This is a standout opportunity to own platform operations across a global … improve service-level performance (SLIs/SLOs), while mentoring engineers and embedding continuous learning across the squad. What you’ll be doing Leading a platform ops squad to deliver site reliability across trading-critical systems Defining DR and incident response strategy (think playbooks, post-mortems, mitigation) Building automation for config, monitoring, and CI/CD pipelines using Terraform … teams on access control, compliance, and DR Coaching junior engineers and acting as technical authority for platform ops What makes you a great fit You’re a Platform/Site Reliability Engineer or Operations Lead with: Strong leadership and mentoring experience Fluency in scripting (Python or PowerShell) and automation best practices Confidence in managing CI/CD workflows …
Full time - London or Paris Hybrid or remote from UK/EU + equity As a Senior Site Reliability Engineer working on Blockchain Protocols at Kiln, you will join our Infrastructure Team, composed of 10 Engineers, to build the future of our Validator product and deploy new blockchain protocols. You will report to our Head of Infrastructure and … Solidity, Foundry, OpenZeppelin Requirements: 5+ years of experience in Software or Infrastructure, within a high-standard engineering environment, preferably FinTech or Crypto. Proven experience as a Senior SRE with a very strong focus on Kubernetes. Proficiency with IaC (Terraform/Terragrunt) and infrastructure automation (Helm, GitOps). Familiar with Prometheus and PromQL. Familiar with infrastructure and data security …
Sheffield, England, United Kingdom Hybrid / WFH Options
context recruitment
Senior Azure Site Reliability Engineer A leading Cloud Consultancy are headhunting for a DevOps Platform Architect to join their impressive Cloud Services team. As a DevOps advocate, you will be empowered to streamline processes through innovative use of code, platforms, and tools. Your team will provide standardized approaches and frameworks, collaborating within the Cloud Services Group to architect, build … Introduce valuable new technologies and tools. Stay updated with emerging technologies and industry trends. Independently handle tasks and projects. Requirements: Understanding of the software development lifecycle and DevOps/SRE methodologies. Microsoft technology background, especially Azure PaaS. Familiarity with CI/CD implementations and IaC tools (e.g., Terraform, Bicep, ARM). Proficient in multiple programming languages (e.g., .NET (C#), PowerShell …
London, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
your chance to shape strategy, lead teams, and work on some of the most complex cloud environments in the industry. As a Cloud Architect, you will: Lead and mentor engineering teams, fostering best practices in cloud architecture and automation. Design and implement scalable, secure, and resilient cloud solutions across AWS, Azure, or GCP. Define and execute Infrastructure as Code … cloud security best practices, access management, and governance models. Drive DevSecOps adoption. Work with senior stakeholders to align cloud strategy. Optimize cloud environments using FinOps, modern operations tooling, and SRE methodologies. The ideal Cloud Architect will have: A strong background in cloud architecture and enterprise-scale transformation projects. Expertise in AWS, Azure, or GCP cloud services. Leadership experience, with a … GKE, OpenShift), and CI/CD pipelines. Strong understanding of cloud networking, API integration, and security controls. Experience working in Agile, DevSecOps, and Site Reliability Engineering (SRE) environments. Cloud certifications in AWS, Azure, or GCP (highly desirable). The Cloud Architect Package: £100,000 - £118,000 base Performance bonus Pension & private medical care Hybrid working (London-based …
Leeds, England, United Kingdom Hybrid / WFH Options
CybeRim
pipelines and integrating security tooling. Scripting This is a fantastic opportunity for a senior-level Azure DevOps engineer who thrives working within infrastructure as code and has a software engineering mentality rather than that of a systems administration/support background. You will work closely with the Head of Platform to come in and join a team of around … working very closely with engineering and product to move the entire platform forwards in its journey to being tech led. Please apply now. Seniority level: Mid-Senior level. Employment type: Full-time. Job function: Engineering and Information Technology. Industries: Software Development.
London, England, United Kingdom Hybrid / WFH Options
Deutsche Bank
their trading and risk applications. In this role, you will ensure the reliability, performance, and scalability of real-time trading systems by applying Site Reliability Engineering (SRE) principles. You will engage directly with Traders, Strats, and Development teams to optimize trading workflows, troubleshoot complex issues, and drive continuous improvement in both processes and the environment. The role … you will also ensure applications are maintained and monitored to provide a stable environment for all users. You will help develop and mentor junior team members to foster an engineering culture that seeks to automate and reduce manual effort to minimize risk and costs. What we’ll offer you A healthy, engaged and well-supported workforce is better equipped … your supported platforms Driving cost optimisation and capacity resource management to ensure efficient use of resources and cost-effective solutions Delivering both business and technology related benefits by aligning reliability engineering practices with business goals and partnering with developers to design and deploy scalable fault-tolerant solutions to meet evolving business needs Your skills and experience Substantial experience …
Bath, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
signed off on a multi-million-pound, three-year technology & digital programme and this is definitely the time to be joining the journey. About the role As the new DevOps Engineering Manager, you'll be responsible for building a high-performing team of Platform Engineers from scratch. Your mission? To orchestrate and evolve the core technology platforms that underpin the … architecture and reducing technical debt. Define SLIs and SLOs across latency, availability, and throughput, aligning internal goals with platform performance. Promote and embed Site Reliability Engineering (SRE) practices to improve stability, monitoring, and response. Manage a growing toolset for orchestration, observability, and automation. Partner closely with Engineering, Delivery, and Architecture teams to ensure seamless integration and … strong internal and external stakeholder relationships, advocating for platform goals and aligning with business objectives. What we’re looking for: Strong knowledge of modern platform management practices (DevOps, Agile, SRE, ITIL). Experience with Azure cloud services, including resource management, networking, and compute. Proficiency in Azure DevOps, including CI/CD pipelines, Azure Boards, and Azure Repos. Knowledge of Azure …
Cheltenham, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
signed off on a multi-million-pound, three-year technology & digital programme and this is definitely the time to be joining the journey. About the role As the new DevOps Engineering Manager, you'll be responsible for building a high-performing team of Platform Engineers from scratch. Your mission? To orchestrate and evolve the core technology platforms that underpin the … architecture and reducing technical debt. Define SLIs and SLOs across latency, availability, and throughput, aligning internal goals with platform performance. Promote and embed Site Reliability Engineering (SRE) practices to improve stability, monitoring, and response. Manage a growing toolset for orchestration, observability, and automation. Partner closely with Engineering, Delivery, and Architecture teams to ensure seamless integration and … strong internal and external stakeholder relationships, advocating for platform goals and aligning with business objectives. What we’re looking for: Strong knowledge of modern platform management practices (DevOps, Agile, SRE, ITIL). Experience with Azure cloud services, including resource management, networking, and compute. Proficiency in Azure DevOps, including CI/CD pipelines, Azure Boards, and Azure Repos. Knowledge of Azure …