Nottingham, Nottinghamshire, United Kingdom Hybrid / WFH Options
Capital One (Europe) plc
Nottingham, Nottinghamshire Senior Software Development Engineer - Site Reliability About the Role We're looking for a Senior Engineer to join our Site Reliability Engineering (SRE) team. This role is ideal for a skilled Java engineer with a passion for understanding how complex systems work, analysing performance, and applying engineering solutions to make them more … efficient, stable, and scalable. You'll lead on planning and implementing key SRE initiatives, optimise and automate how our systems operate, and improve observability through better monitoring and logging. You'll also work closely with your peers to drive consistency and high standards across SRE and the wider engineering community, so a real enthusiasm for influencing others and leading … to reduce operational overheads through observability and service automation. Drive engineering best practice (e.g., Operational Excellence, Security, Quality, Resilience etc.) and set standards across the team and wider SRE community. Innovate within your team and contribute within your technical domain. Deliver key pieces of intent from inception through to design and hands-on delivery, in collaboration with your SREM. …
this business-critical environment. This senior leadership role will report to the Head of FIC Production & Reliability Engineering. You will spearhead the Site Reliability Engineering (SRE) function across FIC Technology - defining and executing the strategic vision - while remaining embedded in managing, and accountable for, day-to-day production stability in the Rates & Credit business. You will … be responsible for the resilience, scalability, and performance of mission-critical trading and risk systems - particularly within the Rates & Credit business - while also influencing SRE practices across FX, Repo, Emerging Markets, Listed Derivatives and other Core Platforms. You will work closely with senior business and technology stakeholders to shape the future of Production Engineering, while remaining deeply engaged in … the option to purchase additional days The opportunity to support a wide-ranging CSR programme + 2 days' volunteering leave per year Your key responsibilities Define and drive the SRE strategy across FIC Technology, aligning reliability goals with business priorities and regulatory expectations Lead the transformation of production support into a proactive, data-driven engineering discipline focused on …
Leeds, West Yorkshire, United Kingdom Hybrid / WFH Options
VIQU IT
Lead Site Reliability Engineer Hybrid/Remote – Once a month requirement in Leeds. Up to £80,000 per annum plus car allowance plus bonus. VIQU have partnered with a leading company within the supply chain industry who are seeking a Lead Site Reliability Engineer (AWS) to join and mentor their growing team. This position will lead … the organisation's cloud infrastructure. This role is mostly remote, with monthly travel required to Leeds. Responsibilities of the Lead Site Reliability Engineer: Lead a team of four SREs, helping to maintain the stability of cloud platforms. Take on hands-on technical responsibilities within AWS, utilising a range of cloud technologies (CI/CD, Container Orchestration, IaaS, Scripting … the Lead Site Reliability Engineer: Must have at least a year's experience in managing technical teams, and over five years of experience in a hands-on, technical SRE/DevOps Engineer role. Experience with CI/CD tools (Jenkins and Concourse CI ideally). Must hold experience within AWS and hold relevant AWS certifications (SA1, DOP-C02 …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Embarcaderomediagroup
we're looking for a Site Reliability & Platform Engineer to help lead the way. You'll sit at the heart of our engineering operations, bringing together SRE principles and modern platform engineering practices. This includes combining principles of SRE - such as service-level reliability, observability, incident response - with platform engineering practices like GitOps, Infrastructure … enablement, to help development teams ship faster, safer, and more cost-efficiently. What you'll be doing: Designing and operating highly reliable, scalable, and secure Azure-based platforms Applying SRE principles like SLOs, observability, and incident management to drive service reliability Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows Enabling teams through platform tools, reusable Terraform modules … This is a great opportunity for someone passionate about building robust infrastructure and enabling others to move faster and more securely. You might come from a cloud engineering, SRE, or DevOps background - what matters most is your curiosity, systems thinking, and drive to improve operational efficiency. At Sorted, we are committed to fostering an inclusive environment where people from …
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
Senior Site Reliability Engineer London - Hybrid £80,000 - £90,000 + 38 Days Holiday + Private Healthcare + Life Assurance + Flexible Working + Pension Excellent opportunity for a Site Reliability Engineer to join a forward-thinking and high-growth technology company offering a Hybrid work environment, great benefits, and opportunities for further progression! This company operates … performance. With a strong culture rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal candidate … and conduct chaos engineering experiments *Monitor and maintain Kafka clusters for performance and reliability *Respond to and resolve application-level production incidents The Person: *5+ years in SRE, DevOps, or infrastructure engineering *Strong experience with AWS, EKS/Kubernetes, and Terraform *Familiar with Kafka and observability tools like Datadog or Grafana *Able to troubleshoot issues across infrastructure …
Senior Site Reliability Engineer London - Hybrid £80,000 - £90,000 + 38 Days Holiday + Private Healthcare + Life Assurance + Flexible Working + Pension Excellent opportunity for a Site Reliability Engineer to join a forward-thinking and high-growth technology company offering a Hybrid work environment, great benefits, and opportunities for further progression! This company … With a strong culture rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal … and conduct chaos engineering experiments Monitor and maintain Kafka clusters for performance and reliability Respond to and resolve application-level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure …
City of London, London, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment
Senior Site Reliability Engineer London - Hybrid £80,000 - £90,000 + 38 Days Holiday + Private Healthcare + Life Assurance + Flexible Working + Pension Excellent opportunity for a Site Reliability Engineer to join a forward-thinking and high-growth technology company offering a Hybrid work environment, great benefits, and opportunities for further progression! This company … With a strong culture rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal … and conduct chaos engineering experiments *Monitor and maintain Kafka clusters for performance and reliability *Respond to and resolve application-level production incidents The Person: *5+ years in SRE, DevOps, or infrastructure engineering *Strong experience with AWS, EKS/Kubernetes, and Terraform *Familiar with Kafka and observability tools like Datadog or Grafana *Able to troubleshoot issues across infrastructure …
Employment Type: Permanent
Salary: £80000 - £90000/annum 38 Days Holiday, Healthcare, Pension
We are seeking an exceptional technology leader to oversee our global site reliability engineering (SRE), DevOps, and Platform Engineering teams. This hands-on engineering leadership role requires someone who can both provide technical vision and build strong stakeholder relationships across the organization. The ideal candidate will bring a combination of deep technical expertise, strategic thinking … Leadership: Serve as a hands-on technical leader who can architect, design, and guide the implementation of highly resilient systems Build a compelling vision and strategic roadmap for our SRE, DevOps, and Platform Engineering functions Establish and evangelize engineering best practices across teams and the wider organization Drive technical innovation while ensuring operational excellence Provide architectural guidance to … capabilities, and constraints Required Skills & Experience: Extensive experience in engineering leadership roles Strong hands-on technical background in cloud platforms, containerization, and modern DevOps practices Demonstrated experience leading SRE, DevOps, or Platform Engineering teams Deep understanding of system architecture, resilience patterns, and high-availability design Experience developing strategic roadmaps and executing technical vision Proven ability to build and …
Senior Site Reliability Engineer At UnlikelyAI, we are building the future of AI: one that is reliable, accurate and transparent. Our neurosymbolic technology harnesses the power of LLMs and generative AI, and combines it with classical symbolic technology to produce hallucination-resistant artificial intelligence for high-trust applications. To support our rapidly increasing commercial momentum, we're looking … for an experienced and pragmatic site reliability engineer to join our exceptional team. This role is ideal for someone who has successfully scaled systems from prototype to production and enjoys working in cross-functional teams to champion cloud-native engineering. We are looking for someone with the experience and expertise to define, and own, our approach to building … for reliability and security as first-class citizens. This is a strategically important role for our technology team, as we rapidly approach entering full production in multiple projects. You'll work on a range of customer-facing and internal infrastructure projects, applying your engineering skills to solve complex reliability and scalability challenges. Your ability to build robust …
customer's systems are built and maintained. This role blends operational product support with software engineering to create applications to understand the overall health of our systems. The SRE team sits within a wider programme at the core of the customer mission. The role holder: As an SRE, fundamentally you will be doing work that has historically been done … expertise to substitute automation for human labour, with the objective of limiting traditional manual operations work (incident tickets, on-call etc.) to no more than half of the SRE team's time (and aiming for considerably less). You will have an enthusiasm to learn and experiment, to develop tools to understand application health and improve their reliability … enable them to be scalable and resilient to failure, and how to get the best out of the infrastructure they are deployed to. Participating in the wider DevOps/SRE community within the organisation. Competencies It is desirable for you to have experience in the areas below. However, more valued for this role is that you have excitement and enthusiasm …
Site Reliability Engineer with a strong focus on leadership and team management. Around 70% of this role is about building, mentoring and directing a high-performing SRE team, setting strategy and driving operational excellence. The remaining 30% will be hands-on involvement in AWS-based platforms, automation and performance tuning. Key Responsibilities Lead and develop a team … of SRE engineers, setting priorities, providing coaching and creating a culture of reliability and continuous improvement Define and own SRE strategy, standards and ways of working across the organisation Collaborate with engineering, operations and product teams to ensure seamless delivery and robust systems Oversee system reliability, availability and performance across large, business-critical platforms Provide technical guidance … GitLab, Concourse) and ensure AWS platforms meet operational best practice Produce regular reporting and communicate clearly with senior stakeholders Key Requirements Strong experience managing or leading engineering/SRE/DevOps teams in a complex environment Track record of mentoring, coaching and growing technical teams Excellent stakeholder engagement skills with the ability to influence at all levels Broad technical …
Morley, Leeds, West Yorkshire, England, United Kingdom Hybrid / WFH Options
VIQU IT Recruitment
Site Reliability Engineer with a strong focus on leadership and team management. Around 70% of this role is about building, mentoring and directing a high-performing SRE team, setting strategy and driving operational excellence. The remaining 30% will be hands-on involvement in AWS-based platforms, automation and performance tuning. Key Responsibilities Lead and develop a team … of SRE engineers, setting priorities, providing coaching and creating a culture of reliability and continuous improvement Define and own SRE strategy, standards and ways of working across the organisation Collaborate with engineering, operations and product teams to ensure seamless delivery and robust systems Oversee system reliability, availability and performance across large, business-critical platforms Provide technical guidance … GitLab, Concourse) and ensure AWS platforms meet operational best practice Produce regular reporting and communicate clearly with senior stakeholders Key Requirements Strong experience managing or leading engineering/SRE/DevOps teams in a complex environment Track record of mentoring, coaching and growing technical teams Excellent stakeholder engagement skills with the ability to influence at all levels Broad technical …
Morley, West Yorkshire, United Kingdom Hybrid / WFH Options
VIQU IT
Site Reliability Engineer with a strong focus on leadership and team management. Around 70% of this role is about building, mentoring and directing a high-performing SRE team, setting strategy and driving operational excellence. The remaining 30% will be hands-on involvement in AWS-based platforms, automation and performance tuning. Key Responsibilities Lead and develop a team … of SRE engineers, setting priorities, providing coaching and creating a culture of reliability and continuous improvement Define and own SRE strategy, standards and ways of working across the organisation Collaborate with engineering, operations and product teams to ensure seamless delivery and robust systems Oversee system reliability, availability and performance across large, business-critical platforms Provide technical guidance … GitLab, Concourse) and ensure AWS platforms meet operational best practice Produce regular reporting and communicate clearly with senior stakeholders Key Requirements Strong experience managing or leading engineering/SRE/DevOps teams in a complex environment Track record of mentoring, coaching and growing technical teams Excellent stakeholder engagement skills with the ability to influence at all levels Broad technical …
Wokingham, Berkshire, England, United Kingdom Hybrid / WFH Options
eTeam Inc
We are a Global Recruitment specialist that provides support to clients across EMEA, APAC, US and Canada. We have an excellent job opportunity for you. Role Title: Site Reliability Engineering - Need Active SC Clearance Location: Wokingham (Reading) | Hybrid, 60% remote and 40% onsite Duration: 27/02/2026 Rate: 402 GBP/Day (Inside IR35) Role … Implement CI/CD pipelines for seamless deployment and release management. Ensure compliance with security standards, governance policies, and regulatory requirements. Required Skills & Experience Expertise in software development and engineering for large-scale distributed systems. Strong proficiency in programming languages such as Golang, Java, or Python. Extensive experience with cloud infrastructure providers (AWS, Azure, or GCP). Deep knowledge …
# Site Reliability Engineer Remote - APAC/Engineering The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare, or … radical responsibility If this sounds like an environment that you believe could work for you then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways to improve, as we will look … we expect this role to be an advocate of continuous improvement Reliability of our new global Tyk Cloud platform Automation of operations and support Writing and maintaining documentation on SRE processes and policies Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service Assisting in penetration testing for Cloud through liaising with …
we are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability … and SOPs Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure Other duties as needed About You 5+ years' experience in Site Reliability Engineer roles Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools (Prometheus/Grafana, New Relic, Splunk, Datadog) Comprehensive experience with AWS (Amazon Web …
are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability … and SOPs Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure Other duties as needed About You 7+ years' experience in Site Reliability Engineer roles 3+ years' experience with an object-oriented language (preferably Java, .NET or C++) Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools …
City Of Westminster, London, United Kingdom Hybrid / WFH Options
Track24 Limited
or New Relic to gain monitoring and performance insights. Incident Management: Establish and oversee monitoring and incident management processes to ensure system reliability. Site Reliability Engineering (SRE): Perform SRE duties to ensure system availability, performance, and scalability. Application Support: Work closely with application teams to support application deployment and performance monitoring. We use AWS internally, however are …
everyone can do their best work. Whether you're building on our platform, supporting our customers, or shaping our story: You can just ship things. About the Role: As SRE Manager, you will lead the creation and operation of a 24/7 Site Reliability Engineering function for Vercel. Your primary goal is to act as the … If you're located beyond that distance, the role is fully remote. For location-specific details, please connect with our recruiting team. What You Will Do: Build & nurture the SRE team at Vercel, holding a high bar for technical work and teamwork. Build rapport with each member of the team and support them as they level up their skills. Define … directly with executive leadership to communicate risks and opportunities and influence cross-engineering prioritization. Partner more specifically with CDN and Compute engineering teams to define and manage SRE-driven project initiatives that improve the robustness and operational efficiency of the company's most critical serving systems. About You: At least 5 years' experience in an SRE role, or …
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
to gemstone supplies. They have a presence in London, Hong Kong and Amsterdam, as well as in Mumbai, and now in New York in 2001. About the role: As the SRE Manager, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and services through both direct technical contribution and team building and … tooling. Drive automation initiatives to streamline operational workflows and improve efficiency. Develop and maintain tools, scripts, and dashboards to monitor system health, performance, and reliability. Build a first-class SRE team. Through a combination of leading by example, coaching and mentoring, mould the team you would want to have around you. Provide leadership and guidance to the SRE team, fostering a … culture of collaboration, innovation, and continuous improvement. RESPONSIBILITIES: Proven experience in a senior or lead SRE role, with a strong track record of building and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with …
Salford, Manchester, United Kingdom Hybrid / WFH Options
Lloyds Bank plc
drive continuous improvement as we transition to cloud-native technologies. You'll challenge the status quo and push boundaries by working closely with the DevOps COE and the wider engineering community. Join us as an innovator as we enter the next phase of our transformation journey. We're looking for passionate and curious technology specialists with innovative minds who … and compliance principles into architecture and development, ensuring alignment with regulatory and risk frameworks. DevOps & Quality Engineering - Practical experience with DevOps or Site Reliability Engineering (SRE), including automation, CI/CD, and quality assurance practices. Leadership & Mentorship - Leads cross-functional teams, drives delivery, coaches others, and fosters a culture of continuous improvement and development. Business Acumen … future trends, drives change initiatives, and shapes technology roadmaps to deliver long-term value. It would be great if you had any of the following Infrastructure as Code & Cloud Engineering - Hands-on experience with tools like Terraform, Chef, Puppet, and Ansible, combined with exposure to cloud platforms such as GCP, AWS, Azure, or ICP/OCP. CI/CD …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Lloyds Bank plc
drive continuous improvement as we transition to cloud-native technologies. You'll challenge the status quo and push boundaries by working closely with the DevOps COE and the wider engineering community. Join us as an innovator as we enter the next phase of our transformation journey. We're looking for passionate and curious technology specialists with innovative minds who … and compliance principles into architecture and development, ensuring alignment with regulatory and risk frameworks. DevOps & Quality Engineering - Practical experience with DevOps or Site Reliability Engineering (SRE), including automation, CI/CD, and quality assurance practices. Leadership & Mentorship - Leads cross-functional teams, drives delivery, coaches others, and fosters a culture of continuous improvement and development. Business Acumen … future trends, drives change initiatives, and shapes technology roadmaps to deliver long-term value. It would be great if you had any of the following Infrastructure as Code & Cloud Engineering - Hands-on experience with tools like Terraform, Chef, Puppet, and Ansible, combined with exposure to cloud platforms such as GCP, AWS, Azure, or ICP/OCP. CI/CD …
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Lorien
Junior Site Reliability Engineer Hybrid - Manchester x2 days a week Salary up to £45,000 + Bonus The Company: Lorien Global are supporting a growing business based in Manchester City Centre as they expand their Support Services team. With an exciting pipeline of work ahead, they're looking to hire an experienced Junior Site Reliability Engineer …
Southampton, Hampshire, South East, United Kingdom Hybrid / WFH Options
Ordnance Survey Limited
hear from you. Essential Criteria Good knowledge of Azure Cloud hosting technologies Experience with PostgreSQL databases (including PostGIS spatial extension) Good understanding of Site Reliability Engineering (SRE) and software engineering best practices Experience investigating the root cause of failures to understand why they have occurred and propose/enact solutions, and work with external suppliers if …