for our next-gen ledger infrastructure Scale multi-region Kubernetes environments across cloud & on-prem Harden distributed systems (Kafka, Redis, CockroachDB) for global banking workloads Lead our AI-powered SRE approach: observability, remediation, and auto-response Enforce zero-trust, multi-tenant security and compliance (SOC2, ISO 27001) Define IaC foundations (Terraform, GitOps, Helm) What We're Looking For: Expert with … Kubernetes and Distributed Systems Experience building production infrastructure at scale (multi-region, high-availability) Extensive experience building both on-prem and cloud infrastructure at scale from scratch. Strong SRE mindset: SLOs, SLIs, incident response AI-curious or AI-native: excited to build agent-powered ops Passion for open-source, clean architecture Software development experience (bonus points for Go (Golang)) Bonus: background …
for our next-gen ledger infrastructure Scale multi-region Kubernetes environments across cloud & on-prem Harden distributed systems (Kafka, Redis, CockroachDB) for global banking workloads Lead our AI-powered SRE approach: observability, remediation, and auto-response Enforce zero-trust, multi-tenant security and compliance (SOC2, ISO 27001) Define IaC foundations (Terraform, GitOps, Helm) What We're Looking For: Expert with … Kubernetes and Distributed Systems Experience building production infrastructure at scale (multi-region, high-availability) Extensive experience building both on-prem and cloud infrastructure at scale from scratch. Strong SRE mindset: SLOs, SLIs, incident response AI-curious or AI-native: excited to build agent-powered ops Experience working in a fast-paced, early-stage environment. Someone who is currently hands-on (not someone …
Recognised as one of Europe's fastest-growing e-commerce companies, this company has secured significant investment and is ready for growth! As a key player in their dynamic engineering team (40+), you'll be at the helm of DevOps operations, including software delivery pipelines, optimising performance and stability, and driving continuous improvement across the platform. What will make …
London, England, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment
Recognised as one of Europe's fastest-growing e-commerce companies, this company has secured significant investment and is ready for growth! As a key player in their dynamic engineering team (40+), you'll be at the helm of DevOps operations, including software delivery pipelines, optimising performance and stability, and driving continuous improvement across the platform. What will make …
Birmingham, West Midlands, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment Limited
Recognised as one of Europe's fastest-growing e-commerce companies, this company has secured significant investment and is ready for growth! As a key player in their dynamic engineering team (40+), you'll be at the helm of DevOps operations, including software delivery pipelines, optimising performance and stability, and driving continuous improvement across the platform. What will make …
Hamilton Barnes is currently representing a major vehicle manufacturer that is actively seeking a Site Reliability Engineer for an initial 6-month contract with the possibility of extension. This position has on-site commitments of 2-3 days per week in Gaydon. If you are interested in learning more, we encourage you to apply today! Responsibilities: Build … software and systems to manage platform infrastructure and applications Provide primary operational support and engineering for multiple large distributed software applications Work alongside developers supporting CI/CD and release cycles. Skills/Must have: Experience with a cloud platform, preferably Google Cloud Platform Experience in Kubernetes Hands-on Python coding Salary: £500 per day …
The Job: Job Title: Site Reliability Engineer Industry: SaaS Working Set-Up: Remote-first set-up Salary: £45,000-£55,000 per annum Interview process: 2-3 stages The Role: One of our long-standing global SaaS clients is making a key hire in the form of a Site Reliability Engineer. This business, with a multi … been impressive, and there's never been a more exciting time to get in on the action! In this role, you'll play a key role in ensuring the reliability, performance, and scalability of their platform. You'll support internal and external stakeholders … and clients to drive improvement and innovation, helping to move their platform forwards by introducing new processes and technologies. This is an incredibly exciting opportunity for a mid-level SRE to join a global company that puts its employees' growth and development at the heart of what it does! The Person: 4+ years' experience in a similar role Experience working …
flexible remote working locations within UK/Europe) Employment type: Permanent Working Hours: Full time (9-6 UK) Salary: Up to £110K + Shares + Benefits TransFICC is hiring a Site Reliability Engineer to provide high-performance services to our customers. We develop an integration service … product that enables our clients to have a flexible, hosted service without requiring their internal resources to respond to connectivity challenges across trading venues. You will be joining our SRE team and contributing to TransFICC's automation culture. We are a multi-disciplinary team covering everything from desktop and laptop support to data centre provisioning of servers and vendor network …
to create and sustain a high-performance culture in every area. We have ambitious plans to build an outstanding operation that can compete at the highest level. From exceptional Engineering and Design talent to a world-class race team, supported by specialists in off-track roles - we are assembling the expertise needed to drive this operation forward and compete … level. Being a part of this team will accelerate your career. Take a closer look at the role: Job Description: We have an opportunity for a talented DevOps/SRE Engineer to join the TWG Cadillac Formula 1 Team as part of the Event IT Team. In your role as a DevOps/SRE Engineer, you will be at the … forefront of developing our technological advantage by maintaining the reliability, scalability, and performance of our cloud and on-premises infrastructure. You will collaborate with software engineers, data scientists, and race strategists to streamline application deployments, monitor system performance, and troubleshoot advanced operational issues. Your work will directly impact the team's race performance by ensuring smooth data flow and …
Localize is seeking a Platform Reliability Engineer to join our growing engineering team. As Localize expands, the scalability, reliability, and performance of our infrastructure and applications have become paramount. This role is dedicated to overseeing and managing all aspects of Localize's technical infrastructure, databases, software tools, and to implementing systems for effective monitoring, alerting, and maintenance. … You will be responsible for the scalability, stability, reliability, and performance of the Localize platform. This role will also support DevOps and enhance systems used by the engineering team to improve productivity. Key Responsibilities: Oversee and manage Localize's infrastructure across AWS and Cloudflare. Ensure the scalability, reliability, performance, and security of Localize's data stores, specifically … new technologies and systems to improve the efficiency and capabilities of our infrastructure. Must-Have Skills: 5+ years engineering experience. At least 2 years of experience in an SRE and/or DevOps role. Expertise in managing and optimizing infrastructure in AWS. Redis and MongoDB, including configuration, monitoring, optimization, and management of backups. Experience with manual or automated deployment …
Are you a passionate Software Engineer looking for an exciting new challenge? Join this team and transition into maintaining and enhancing the reliability of one of the world's largest platforms. In this role, you will utilise your expertise in Golang coding to develop robust applications, ensuring the systems remain resilient, scalable, and efficient. If you thrive in fast … presence and commitment to innovation, you will have the opportunity to work on projects that reach millions of users, making a real difference in the tech world. As a Site Reliability Engineer, you will be responsible for designing, developing, and maintaining systems and applications using Golang. You will monitor and optimise system performance with tools such as Grafana … Prometheus, New Relic, and Splunk. Your role will involve identifying and resolving reliability issues, automating processes, and ensuring the seamless operation of the platform. If you have a passion for technology and a drive to ensure excellence, we would love to hear from you …
brands as they embark on a major digital & AI transformation focussed on growth and experience. As part of this, an opportunity has become available for an experienced Head of Engineering to lead a high-performing and talented team of 150+ engineers. Your mission: enable engineering excellence, drive speed and quality through DevOps and modern architecture, and partner … a major digital transformation. You'll play a central role in shaping how we build, scale, and operate our platforms - balancing innovation, resilience, and business outcomes and combining software engineering, DevOps, and platform thinking to deliver exceptional customer experiences at scale. WHAT EXPERIENCE WILL … YOU OFFER? Proven experience leading large-scale engineering teams (100+) in high-growth, product-led organisations. Deep knowledge of software development languages, cloud platforms (AWS preferred), DevOps/SRE, CI/CD, TDD. Experience and/or interest in emerging AI/Data Science & Automation solutions. Exceptional people leadership: you build empowered, high-trust teams with a strong engineering culture. …
and operational excellence, supporting a global client base of financial institutions. You’ll define and implement a modern operations roadmap, driving automation, CI/CD, managed services, and platform reliability, while enabling high-performing engineering and delivery teams. Key Responsibilities Lead DevOps and infrastructure strategy across cloud/on-prem environments Oversee CI/CD, automation, and platform … reliability Align operations with business goals, client delivery … and engineering standards Support ISO27001/SOC2 compliance and secure operational models Drive continuous improvement through KPIs and operational metrics Build and lead a multidisciplinary operations team (DevOps, SRE, Infra) Working predominantly with AWS Requirements Proven experience in Ops/Platform/DevOps leadership within tech or software Deep knowledge of DevOps tools, infrastructure-as-code, and cloud architecture …
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Motability Operations
goals within the team Ensure implementation and solutions are aligned to MO's tech stack and design patterns/architectural principles Help the team employ strategies to avoid over-engineering Identify & help mitigate technical risk Help make technical decisions in the team, and help share decisions wider where appropriate Ensure the quality of the code & safety/security of … work produced by the team Champion engineering best practices and mentor team members, fostering knowledge … sharing Seek opportunities for enhancing developer experience Help reduce any "toil" (manual, repetitive, and error-prone work that does not add business value) through automation etc. Ensure delivery of SRE principles such as automation of operations, observability and reliability measurement Promote innovation within the team Provide input into wider (cross team) initiatives, and contribute to Engineering team goals …
Employment Type: Permanent, Part Time, Work From Home
with SQL and Python Data visualisation skills with Power BI; other automation and metrics knowledge is handy. Proficiency with tools like Jira, Confluence, Excel, and SharePoint Familiarity with Agile, DevOps, and Site Reliability Engineering Excellent communication and stakeholder management skills …
by up to 70%. We seek a Quality and Support Strategist who ensures that the Coralogix Alerting and Incident Management Platform and Process exceed quality and reliability standards, establish a competitive edge, and prevent failures, profit loss, or work stoppages. You will be responsible for enhancing customer experience by ensuring efficient and effective alert management resolution … reducing engineering interruptions, and boosting product awareness. This role involves developing a robust knowledge base, identifying common usage issues, and creating solutions that establish the Alerting and Incident Management Platform's capabilities in terms of performance, pains, and business use cases we deliver. Key Responsibilities: Improve Customer Satisfaction Improve turnaround time to resolve customer satisfaction. Work closely with engineering … Understanding of SLO/SLA management and implementations Knowledge of industry standard incident management frameworks and best practices Familiarity with automated remediation and runbook automation Experience with DevOps and SRE practices Cultural Fit We're seeking candidates who are hungry, humble, and smart. Coralogix fosters a culture of innovation and continuous learning, where team members are encouraged to challenge the …
Responsibilities Manage Data Center Portfolio and Strategy Source and evaluate colocation providers based on robustness, delivery timelines, commercial terms, geography, availability and compliance. Conduct vendor due diligence, including on-site audits with specialist engineers. Manage customer data center facility reviews for large AI cluster deployments. Relationship Management & Compliance Manage all data center colocation vendor relationships. Lead commercial negotiations, securing … cloud compute and data center colocation infrastructure standards, especially for high-density deployments (e.g. GPU clusters, HPC or AI workloads). 2+ years of experience in software engineering, SRE, DevOps, system administration, or HPC infrastructure. Experience collaborating with network engineering teams. Technical degree in engineering, computer science, or a related field. Benefits Competitive total compensation package (cash …
Description The Role We are looking for a Lead Cloud Developer to join our growing engineering organisation developing a wide range of market-leading InsurTech solutions. You will be working in flexible agile squads delivering value on multiple greenfield workstreams in the delivery family to deliver core foundational functionality that will be used by multiple SaaS product offerings across … key role in designing and creating new features and enhancing existing code whilst ensuring the multiple microservices that the team is responsible for continue to meet high levels of reliability, maintainability, usability, and performance. Although experience in Angular is not mandatory, a willingness to upskill and play a full-stack role in the team is also required. The Responsibilities … Experience with the software development ecosystem (IDEs, version control, test automation/CI, etc.). • Strong appreciation of building flexible cross-functional full-stack squads with shift-left DevOps, SRE and QA culture. Other highly desirable, but not essential, skills are: • Strong appreciation of DevOps principles, with the ability to create automated processes to continuously deliver SaaS products on a …
people based on an evaluation of their potential and support them throughout their time at Cloudflare. Come join us! Available Locations: Lisbon or London About the Department Cloudflare's Engineering Team builds and runs the software that handles large amounts of requests on the Internet today. We also build and run the internal tools and platforms that run that … software. Individual engineering teams are typically responsible for large areas with considerable impact, and able to execute autonomously within that space to deliver value to their customers - be they internal or external. Many of Cloudflare's critical internal services run on Kubernetes. These services include those responsible for Cloudflare's control plane and APIs, data analytics and other internal … into smaller pieces, provide options, talk through trade-offs and drive the effort to solve the problem Bonus Points Experience operating Kubernetes on-premise at scale in capacities including SRE, systems design or architecture Providing guidance and building platforms across multiple zones and regions as foundation for other teams to build distributed highly-available applications Worked in a platform engineering …