The Site Reliability Engineering (SRE) team at Pendo is responsible for provisioning and maintaining cloud infrastructure from development through production for all product initiatives, and for working with developers and product managers to ensure that our products are not only reliable and performant, but also cost-efficient. Our … on-call and incident management functions, supporting a high-throughput platform which processes more than 15 billion events per day. To ensure the reliability of this environment for our customers, SREs work closely with developers and product managers to understand service level objectives, think through failure scenarios, and design … systems which balance cost with reliability objectives. Additionally, SREs collaborate with the Information Security team to ensure that cloud infrastructure is properly secured, and that sufficient controls are in place to meet our compliance goals with respect to industry standards such as SOC 2. Role Responsibilities: Write high-quality …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom
Ranger Technical Resources
Site Reliability Engineer #2494. Position Summary: Our partner, an innovative PaaS company specializing in remote monitoring and network management solutions, is looking for a Site Reliability Engineer to help ensure the reliability, scalability, and performance of critical infrastructure and applications. In this role, you’ll build … Bachelor's or higher degree in Computer Science, Information Systems, Information Technology, or a related technical field/experience. 7+ years of experience in Site Reliability Engineering, DevOps, Infrastructure, or related roles. Deep understanding of AWS and its various modules and services. Strong background in Linux administration …
Senior Recovery Lead and Head of Service Reliability. Brand: HSBC. Area of Interest: Technology. Location: Sheffield, GB, S1 4NB. Work style: Office Worker. Date: 21 May 2025. Join a digital first bank that's powered by people. Our technology team builds innovative digital solutions rapidly and at scale to … solutions. Beyond recovery, this leader will also own the strategic and tactical roadmap for building reliable, self-healing systems through collaboration with Problem Management, SRE, and Platform teams. Job Responsibilities: Incident Recovery Leadership: Lead a global, follow-the-sun team that acts as technical escalation during major incidents. Partner with … diagnosis and resolution, reducing TTR. Bring calm, coordination, and engineering clarity during high-pressure recovery efforts. Systemic Cause Elimination: Collaborate with Problem Managers, SRE, and Platform Engineering teams to identify and eliminate systemic causes of incidents. Remediation Plans: Own and drive long-term plans including automation, reliability …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Stealth iT Consulting
Site Reliability Engineer (SRE). Global Digital Consultancy. Salary: Up to £55k + benefits. Sponsorship won't be provided for this opportunity. Hybrid remote – Occasional travel to Manchester, London, or Glasgow. A leading global consultancy, with ambitious plans to grow its Digital teams throughout 2025, is seeking a Site Reliability Engineer (SRE) to support multiple new and ongoing projects. Important: This role may require occasional out-of-hours support or on call based on client needs. Please apply only if you are comfortable and available for this. Desired Skills and Experience: Active SC Clearance or eligible for … SC Clearance. Strong understanding of the SRE mindset and principles, including the creation and management of Service Level Indicators (SLIs) and Service Level Objectives (SLOs), ensuring reliability and performance. An understanding of Microservices & container orchestration. Strong Observability & Monitoring experience (preferably tools such as Dynatrace, Prometheus or OpenTelemetry). Experience delivering …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Uniting Cloud
Site Reliability Engineer (SRE). Remote (UK). £85,000 – £105,000 (DoE). We’re a growing FinTech scale-up and we’re on the lookout for an experienced Site Reliability Engineer to join our remote-first engineering team. Things are moving fast here, and as we … continue to grow; reliability, automation, and scalability have never been more important to us. You will be our first SRE, so a strong background in implementing SRE best practices would be ideal. You will know what good looks like and strive to continuously improve automation, availability and resilience. This … tooling using AWS, Terraform, Docker, and CI/CD pipelines. Supporting and evolving our container-based architecture (we use ECS and Fargate). Driving SRE best practices: SLIs/SLOs, error budgets, reducing toil, and improving observability. Using (and hopefully enjoying!) tools like Datadog, Prometheus, Grafana, and Nix to support …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Halian
Halian Technology is seeking an experienced Site Reliability Engineer for a full-time opportunity within our client’s Platform Engineering team, based remotely in the U.S. We’re looking for a technically skilled and automation-driven individual with strong experience in cloud infrastructure and observability tools to … help scale our client’s services to millions of endpoints globally. This is an exciting opportunity to work at the core of platform reliability and infrastructure automation within a fast-growing SaaS company. Key Responsibilities: Diagnose and resolve complex application and infrastructure issues across distributed systems. Participate in 24x7 … using tools like New Relic, Datadog, or Splunk. Influence design decisions to ensure scalable, secure architecture and high availability. Key Requirements: 5+ years in Site Reliability Engineering and/or DevOps roles. Strong Linux administration and scripting skills. Hands-on experience with AWS core services (EC2, ECS …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
eMFusion Global
is outside IR35 and runs through 2026. Key Requirements: Strong software engineering experience, ideally in Java (Spring Boot) and Python. Proven background in SRE practices, including platform reliability, monitoring, and incident response. Ability to debug and resolve issues directly in production code. Solid experience with Kubernetes, AWS, CI … collaboration. Long-term contract through 2026. Opportunity to have real impact on system reliability, performance, and code quality. This is not a passive SRE role: you’ll be embedded in engineering teams, taking full ownership of production issues and resolving them at the code level. If you're … a hybrid between a developer and an SRE, we'd love to hear from you. Apply now or get in touch to find out more.
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Harrington Starr
Site Reliability Engineer – Fintech. Up to £85,000 | Fully Remote (UK Only). We’re working with a forward-thinking technology company that’s helping to transform how global financial transactions are monitored and managed. Their platform is used by some of the world’s leading financial institutions to … streamline international payments and ensure compliance at scale - all through smart automation and modern cloud-native infrastructure. They’re looking to bring on a Site Reliability Engineer with deep experience in observability. If you’ve worked with tools like Prometheus in AWS, supported development teams with tracing … and reducing operational noise. Working with AWS (EKS, EC2, Lambda, RDS), Terraform, and CI/CD tools. What They’re Looking For: Experience in SRE or DevOps roles in a production environment. Strong knowledge of observability tools, especially Prometheus in AWS. Experience with tracing, metrics, and logs to support development …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Durlston Partners
Senior Site Reliability Engineer | Remote (EU/UK) | High-Performance Trading. A leading trading firm operating at scale in the digital asset space is hiring a Senior Site Reliability Engineer to help scale, secure, and optimise its global trading infrastructure. This is a remote-first role …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Propel
dynamic, VC-backed startup that is revolutionising risk and compliance management in electronic communications using AI and ML! We're looking for a Senior Site Reliability Engineer who is deeply passionate about technology and looking to play a pivotal role in ensuring the availability, security, and efficiency of …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Digital Waffle
years. What you’ll do: Implement, test and deploy Azure Data Factory (ADF) pipeline definitions within version control to customer environments. Work with our Site Reliability Engineering team to ensure your solutions are observable, reliable and performant. Work with our software implementation consultants (SICs) to define and … verify specification documents for ETL process. Work with customer IT to test customer data source endpoints to ensure they meet specification. Work with our Engineering teams to ensure end-to-end capability for integrated data. Support cutover to production systems (can be outside normal working hours). Identify improvements …
Sheffield, Yorkshire, United Kingdom Hybrid / WFH Options
Reach Studios Limited
the development team. Advising on and managing cloud infrastructure (AWS, Azure etc.). What You'll Need: Must-haves: Comprehensive experience in a DevOps or SRE role, ideally in a multi-project environment. Deep experience with web stacks: Nginx/Apache, PHP-FPM, MySQL, Redis, Varnish, Elasticsearch. Proven expertise in managing …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom
Oracle
Oracle Database Administrator, UK. Requirement: British Passport holder and willingness to undergo UK Gov Security clearance. Are you a seasoned Site Reliability Engineer or Cloud DevOps guru? Are you a backup, restore and recovery expert? If you are, we are looking for you to join our exciting growing … Terraform, Infrastructure as Code, Oracle Database, Linux, RMAN, Exadata, Zero Data Loss Recovery Appliance. Additionally, we are looking for motivated individuals that have: prior SRE experience managing production cloud services; prior experience in releasing and maintaining cloud services; excellent verbal and written communication; production experience managing systems or database environment …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom
Your Next Hire
financial services sector. Their mission is to enable businesses to operate more efficiently through their automation platform, and they are investing heavily in platform engineering to streamline development and improve the overall engineering experience. They are looking … for a senior platform engineer who is passionate about developer experience, self-serve tooling, and platform enablement. This is not a traditional DevOps or SRE role; your focus will be on building scalable internal platforms that unblock development teams, reduce cognitive load, and provide a seamless developer experience. You will … more efficiently. What you’ll be doing: Building and enhancing internal developer platforms to simplify onboarding and deployment processes. Creating golden paths for common engineering needs, such as AWS database provisioning, security best practices, and infrastructure templates. Improving pipelines and automation to remove bottlenecks in the development lifecycle. Treating …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Christopher Ali
core technology stacks working across multiple languages and frameworks. Skills Required: * Experience with Rails (or similar framework) * Experience in operating large systems (following an SRE approach) * API experience, both creating and integrating with. * Experience in testing large complex applications using a test driven approach * Previous experience designing, building and maintaining …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Climate Policy Radar
support deploying and leveraging open source large language models. Building infrastructure and tooling for enabling internal teams, by scaling our data infrastructure, optimizing for reliability and cost, or improving our search service. Tech Stack: Platform: AWS, Pulumi, Docker, Prefect, GitHub Actions, Grafana Cloud monitoring. Data Science: Python, PyTorch, Pandas … classification, generative AI. Experience working with machine learning models in production systems. Experience using and maintaining cloud infrastructure. Experience with DevOps/infrastructure/SRE, tools used for automation, CI/CD, infra-as-code, containerisation, orchestration. Experience with system design, working on system architecture or making technical decisions, whether … individually or with a team. Extensive knowledge of different data stores and formats. Solid understanding of software engineering fundamentals, version control, observability, unit and integration testing. Our ideal candidate will champion engineering excellence, open source, enabling internal users and creating delightful user experiences. We are looking for candidates …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom
Ansys
used by internal teams and customers. Partner with developers to embed IaC into CI/CD pipelines and release workflows. Drive improvements in environment reliability, configuration consistency, and infrastructure observability. Contribute to infrastructure design discussions with a focus on scalability, maintainability, and operational simplicity. Support secure identity and access … Traefik in containerized environments. Lead incident response and root cause analysis, ensuring learnings feed back into IaC practices. You Have: BS in Computer Science, Engineering, or a related field (MS or PhD a plus). 3+ years of hands-on experience with Terraform, including module creation, remote state management … ingress management or reverse proxy in Kubernetes or container environments. Deep familiarity with Linux systems administration. Experience designing IaC standards and promoting DevOps/SRE best practices across teams. Exposure to policy-as-code (e.g., Sentinel, Open Policy Agent). Why Join Us: At Ansys, you won’t just support …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom
SoftServe
WE ARE: SoftServe is a leading provider of digital business solutions, digital advisory, and digital engineering services, with a team of over 12,000 professionals across 14 countries. Headquartered in the US, SoftServe serves clients primarily in North America and Europe, helping them navigate complex challenges and transform through … Development, Network & Edge, Security, Resiliency, etc.). Articulate the vision for modern engineering (e.g., agile, cloud-native, DevOps) and operations (e.g., observability, automated response, SRE, etc.) and a path toward a target operating model (people, process, and tools). SoftServe is an Equal Opportunity Employer. All qualified applicants …
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Stealth iT Consulting
AI Solutions, with offices in London, Manchester, Newcastle and Glasgow (offering a 'flexi working model' of remote first or hybrid + ad hoc client site travel when required). Clients include Government, Finance, Retail & Energy sectors (among other private sectors). They are looking for a Senior Consultant - DevOps … and improve speed, productivity, and quality, and implement product-centric operating models. Solid understanding of hybrid/multi-cloud environments, DevOps, CI/CD, SRE, DevSecOps models, DevX, build and deployment pipelines, observability, and ITIL. Proven experience leading/managing/mentoring a team of DevOps/SRE/Platform …
Head of Service Management. This individual will be responsible for embedding global service management practices across all CTO-owned platforms and services, ensuring high reliability, operational rigor, and alignment to enterprise service standards. This role is also a critical partner to Product Management teams, enabling fast-paced innovation by … escalations and reduced TTR. Problem Management - Track and remediate systemic issues tied to CTO platforms. Change Management - Improve change success and production hygiene. Service Reliability - Align platform resilience initiatives and participate in scenario planning. Service Level Management - Monitor and report on SLOs, SLAs, and performance indicators. CSDM/CMDB … CTO interests in global service forums and ensure alignment to enterprise service management strategy. Qualification and Skills: Experience in Technology Operations, Service Management, or SRE, preferably within infrastructure or developer tooling environments. Proven ability to operationalize global service management frameworks across engineering teams. Deep understanding of ITSM processes, including …