Fancy being our next SRE Superstar? Site Reliability Engineer (SRE) Sunderland (Hybrid) Full-time Alright, listen up! Here at Tombola, we're not just about bingo - we're about brilliant tech, seamless experiences, and keeping millions of players happy. And to do that, we need a Site Reliability Engineer who's as excited about rock-solid … working hand-in-hand with our dev, infra, and security teams, making sure we balance exciting new features with unbeatable stability. What you'll be getting up to: System Reliability & Availability Hero: You'll be the guardian of our uptime, making sure our critical systems are always available and hitting those all-important SLAs. You'll also be … tech and better ways of doing things, constantly pushing us to improve system reliability, performance, and efficiency. Sound like a bit of you? If you're an experienced SRE with a passion for building reliable, scalable, and efficient systems, and you love working in a fun, collaborative environment, then we want to hear from you! Ready to join the …
customer's systems are built and maintained. This role blends operational product support with software engineering to create applications to understand the overall health of our systems. The SRE team sits within a wider programme at the core of the customer mission. The role holder: As an SRE, fundamentally you will be doing work that has historically been done … expertise to substitute automation for human labour, with the objective of limiting traditional manual operations work (incident tickets, on-call etc.) to no more than half of the SRE team's time (and aiming for considerably less). You will have an enthusiasm to learn and experiment, to develop tools to understand application health and improve their reliability … enable them to be scalable and resilient to failure, and how to get the best out of the infrastructure they are deployed to. Participating in the wider DevOps/SRE community within the organisation. Competencies It is desirable for you to have experience in the areas below. However, more valued for this role is that you have excitement and enthusiasm …
Senior Site Reliability Engineer - Reuters The Reuters Professional DevOps team is a global squad with members from over five countries. Our work reflects on which is a source of real-time, nonpartisan information on world events, trends and culture. The DevOps team takes a factory approach to infrastructure, by designing and developing repeatable cloud-native patterns and applying … Professional DevOps Team is looking for an experienced engineer who's passionate about automation and scalability to work from our London office. About the Role: As a Senior Site Reliability Engineer at Reuters, you will: Work with a global team, responsible for the infrastructure powering and other products Architect, diagram, document and implement highly scalable solutions for … forward until the adoption of chosen solutions reaches 100% Communicate clearly, frequently, and take pleasure in simplifying technical concepts for non-technical audiences About You: As our Senior Site Reliability Engineer, you are likely to have: Essential Skills & Experience Comfortable with various flavors of (U|L)inux and ready to discuss implementations of reg(ex|ular expressions …
Morley, Leeds, West Yorkshire, England, United Kingdom Hybrid / WFH Options
VIQU IT Recruitment
Site Reliability Engineer with a strong focus on leadership and team management. Around 70% of this role is about building, mentoring and directing a high-performing SRE team, setting strategy and driving operational excellence. The remaining 30% will be hands-on involvement in AWS-based platforms, automation and performance tuning. Key Responsibilities Lead and develop a team … of SRE engineers, setting priorities, providing coaching and creating a culture of reliability and continuous improvement Define and own SRE strategy, standards and ways of working across the organisation Collaborate with engineering, operations and product teams to ensure seamless delivery and robust systems Oversee system reliability, availability and performance across large, business-critical platforms Provide technical guidance … GitLab, Concourse) and ensure AWS platforms meet operational best practice Produce regular reporting and communicate clearly with senior stakeholders Key Requirements Strong experience managing or leading engineering/SRE/DevOps teams in a complex environment Track record of mentoring, coaching and growing technical teams Excellent stakeholder engagement skills with the ability to influence at all levels Broad technical …
Lead Site Reliability Engineer (Lead SRE) Ready to keep things running smoothly? Join our tombola team! At tombola, we pride ourselves on building our own exceptional games and platforms in-house. That means keeping everything running flawlessly is paramount! We're seeking a Lead Site Reliability Engineer (SRE) to join us and help ensure our critical … systems and services are always reliable, available, and performing at their best. What will you be doing? As an SRE, you'll be instrumental in implementing automation, monitoring, and incident response strategies to minimize downtime and optimize our operations. You'll collaborate closely with our development, infrastructure, and security teams, balancing exciting new feature delivery with rock-solid system … with our broader business objectives. Collaborating with other teams and departments to achieve shared success. Partnering with our People Partner for tech to build robust team management practices. System Reliability and Availability Ensure system uptime: Monitor and maintain the availability and reliability of critical systems and services, meeting all uptime SLAs (Service Level Agreements). Incident management: Quickly …
live and transferrable DV Clearance Are you passionate about reliability, automation, and supporting mission-critical systems? Join this global defence organisation as a Site Reliability Engineer (SRE) and help shape the future of one of the UK's most vital national security platforms. You'll be joining a growing SRE team at the heart of the customer's mission, focused on ensuring performance, availability, and scalability - while driving continuous improvement and innovation. About the Role As an SRE, you'll combine your operational expertise with software engineering skills to minimise manual effort and drive automation across complex systems. This role is perfect for someone who thrives on solving hard problems, automating the mundane, and building intelligent … overtime. Proactively enhance system availability, performance, and resilience. Develop tools and solutions to automate repetitive tasks and reduce operational toil. Collaborate with development teams to embed best practices and SRE principles. Deploy and manage monitoring systems to provide intelligent observability. Engage with the wider DevOps/SRE community within the organisation. Ideal Skills & Experience We're more interested in your …
Founded in 2001, Resident Advisor (RA) is one of the world's longest-running music media brands and a cornerstone of the dance, electronic and DJ ecosystem. The site's audience of over 6 million monthly users is drawn in by a combination of news, editorial, club listings and ticketing, RA-branded events at venues and festivals worldwide, original … films and a weekly mix series that has run for 18 years. We're looking for a Senior Site Reliability Engineer passionate about electronic music to join our Core Platform team. This role is office based (minimum 3 days/week in-office), and offers flexibility to work hybridly. You'll help scale our high-traffic infrastructure that … MSSQL databases, ElasticSearch, Redis, and Kafka running on AWS EKS (Kubernetes), managed via Terraform with CI/CD pipelines and DataDog monitoring. Your responsibilities include improving infrastructure performance and reliability, driving modernization and cost optimization, developing shared components (i.e. auth systems, GraphQL gateways), enhancing developer experience, maintaining E2E testing systems, and creating internal tooling. This is an opportunity to …
A Developer possesses a unique skill set that synergises well with Site Reliability Engineering (SRE). With a strong foundation in Golang development, valuable expertise is brought to the table, enabling contributions to innovative solutions for complex monitoring, automation, and capacity management challenges. As a Site Reliability Engineer, you can shape the way this company … Development and Platform teams to optimise system performance for this industry leader! In the development of reliable and scalable systems, you are responsible for creating software by applying sound engineering principles, best practices, and leveraging technologies including your expertise in contemporary monitoring tools and programming. Experience in modern monitoring tools such as Splunk, Nagios, or … Grafana is a significant advantage! However, proficiency in programming languages such as Golang, Python, or JavaScript is essential! If you are a Golang Engineer looking to transition into the SRE world, or vice versa, this is an opportunity you won't want to miss …
Wokingham, Berkshire, England, United Kingdom Hybrid / WFH Options
eTeam Inc
We are a Global Recruitment specialist that provides support to clients across EMEA, APAC, US and Canada. We have an excellent job opportunity for you. Role Title: Site Reliability Engineering - Need Active SC Clearance Location: Wokingham (Reading) | Hybrid, 60% remote and 40% onsite Duration: 27/02/2026 Rate: 402 GBP/Day (Inside IR35) Role … Implement CI/CD pipelines for seamless deployment and release management. Ensure compliance with security standards, governance policies, and regulatory requirements. Required Skills & Experience Expertise in software development and engineering for large-scale distributed systems. Strong proficiency in programming languages such as Golang, Java, or Python. Extensive experience with cloud infrastructure providers (AWS, Azure, or GCP). Deep knowledge …
has helped build some of the world's largest companies. Our team in London is growing and we're looking for talented people to join us on our journey Engineering at Duffel We're building tools to simplify travel distribution, search and booking. What does this actually mean? It's one common and seamless API. This brings huge technical … experience to go with it. The tools used on the team include Elixir, Phoenix, Kubernetes and Google Cloud Platform. Site Reliability Engineering at Duffel As an SRE at Duffel, you'll be part of a small team within engineering that is responsible for the reliability, performance, and resilience of our infrastructure and applications. You will … be working closely with engineering teams to understand their needs and help meet the demands of our product as we scale globally. What we're looking for - An infrastructure and systems engineering generalist who is comfortable diving deep into the weeds on different issues. Some recent examples include: - A configuration issue between Google's Load Balancer and the …
Site Reliability Engineer - Microsoft Admin (Windows Server, IIS, MS SQL Server) Team Summary The Reliability Engineer (SRE) is a member of a cross-functional Operations & Infrastructure team responsible for running our Visa Spend Clarity for Enterprises production infrastructure and ensuring the highest levels of availability, performance, and operational excellence. What a Site Reliability Engineer does … at Visa: The SRE is responsible for finding the right way to run robust applications in our environments. In this role, you will balance engineering improvements, systems operations, and contributions to strategic initiatives. You will work closely with all members of the Technology Group to improve the reliability, availability, performance, monitoring, and operations of Visa Spend Clarity for …
Site Reliability Engineer Remote - APAC/Engineering The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare, or … radical responsibility If this sounds like an environment that you believe could work for you then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways to improve, as we will look … we expect this role to be an advocate of continuous improvement Reliability of our new global Tyk Cloud platform Automation of operations and support Writing and maintaining documentation on SRE processes and policies Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service Assisting in penetration testing for Cloud through liaising with …
collaborative innovation. Our group drives competitive advantage by enhancing our consumer experiences, enabling business growth, and advancing operational excellence. The Database Reliability Engineering (DBRE) team helps elevate SRE practices as they apply to Database Management technology and services at TWDC, promoting and onboarding new technologies, solving complex problems and integrating with next generation digital platforms. Database Reliability Engineers (DBRE) use a software engineering approach to architect, design, automate, monitor, and build applications at scale. This includes operating and engineering software with close business segment alignment to deliver platforms through efficient, effective and resilient architectures. DBREs are talented engineers who are focused on improving quality through a data driven approach: instrumentation, automation, and functional …/unit testing. The Database Reliability Engineering (DBRE) team is a group of highly trained professional database engineers who build, deploy and operate database platforms in an SRE/DevOps manner. This team is responsible for operating the following platforms: MySQL, PostgreSQL, Oracle, NoSQL (MongoDB, Cassandra) and Snowflake for TWDC. These workloads are running in all major CSPs …
sector, our technology is truly flexible and designed to transform any business at scale. We've created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can't match. At ZILO, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious … impact. If you're ready to shape the future, let's talk. About the Role We're looking for a Senior Site Reliability Engineer to join our SRE team. This is a hybrid role that blends deep platform engineering with application-level troubleshooting. You'll be responsible for the stability, performance, and resilience of our cloud … code Resolve incidents and support root causes (Java and GoLang services) Contribute to postmortems and reliability engineering initiatives Who You Are Essential Experience 5+ years in an SRE, DevOps, or infrastructure role Deep hands-on experience with AWS, EKS/Kubernetes, and Terraform Working knowledge of Kafka tuning, monitoring, and operational troubleshooting Strong familiarity to be able to …
Site Reliability Engineer Locations: IND-BLR-Divyasree Technopolis Time type: Full time Posted: Yesterday Job requisition ID: R About LSEG: The London Stock Exchange Group (LSEG) is a global financial markets infrastructure and data provider headquartered in London, UK. Established in 2007, though its core institution - the … on SQL Server and SSIS today, we're actively exploring cloud-native platforms - your voice will help guide that transition. Collaborative Environment: Work multi-functionally with guides in data engineering, DevOps, and analytics in a culture that values curiosity, accountability, and continuous improvement. Tech that Matters: You'll support systems that drive real-time business decisions, impact thousands of …
Core, BCG X, and CT worldwide. This role is also accountable for embedding security within DevSecOps practices, enforcing automation at scale, and applying Site Reliability Engineering (SRE) principles across all security services. The role requires strong partnership with ISRM, with a focus on balancing and prioritizing security requirements, automation opportunities, user experience needs, and broader business outcomes. … that support modern work scenarios, remote access, zero-trust networking, and AI/ML workloads. Leverage automation frameworks and IaC to improve scalability and reduce manual intervention. Operational Security, SRE & Assurance: Ensure security platforms are resilient, continuously monitored, and designed for 24x7 support and incident response readiness. Embed security telemetry and observability to enable proactive threat detection and automated response. … Apply SRE principles to improve reliability, performance, and maintainability of security services. Lead platform health, patching automation, and vulnerability remediation workflows. Define service level objectives (SLOs) and key performance indicators (KPIs) for all security services. Compliance, Governance & Risk Management: Ensure alignment with global compliance requirements such as ISO 27001, NIST, SOC 2, GDPR, and others. Partner with governance, legal …
strong background in DevOps design and transformation, cloud-native engineering, and modern DevOps tooling. The ideal candidate will also bring expertise in Site Reliability Engineering (SRE) principles and practices, with a focus on building scalable, reliable, and resilient systems. Key Responsibilities: • Architect and implement scalable, secure, and high-performance DevOps solutions. • Lead DevOps transformation initiatives across … enterprise environments. • Design and implement cloud-native solutions on Azure, AWS, or GCP. • Apply SRE principles to ensure system reliability, availability, and performance. • Build and maintain CI/CD pipelines and infrastructure as code (IaC). • Evaluate and integrate modern DevOps tools and practices. • Collaborate with cross-functional teams to align DevOps and SRE strategies with business goals. • Mentor … and lead DevOps teams, fostering a culture of innovation and continuous improvement. • Leverage AI and machine learning to optimize DevOps and SRE processes. • Ensure compliance, security, and operational excellence in all DevOps practices. Required Qualifications: • 15+ years of experience in IT, with a strong focus on DevOps and cloud architecture. • Proven experience in DevOps design and transformation across multiple projects. …
Job Title: Cloud Engineer/SRE - Golang & GitHub Location: Remote - UK, London Salary/Rate: Up to £690 a day Inside IR35 Start Date: August 2025 Job Type: 12 Month Contract Company Introduction: We are seeking a highly skilled Cloud Engineer/SRE with Development experience in Go and GitHub to join our client in the Global Analytical Risk sector. … We are seeking a highly skilled and motivated Cloud Engineer/SRE to join our newly formed Enterprise GitHub Operations & Tooling team. This is a foundational role where you will be instrumental in designing, building, and managing the core services and tooling that underpin our extensive use of GitHub Enterprise. You will be responsible for developing code and solutions that … managing) GitHub Actions (designing complex workflows, custom actions) GitHub Enterprise, Organization and Repository settings. Operations/Infrastructure Background: Proven experience in an operations, site reliability engineering (SRE), or infrastructure engineering role, with a strong appreciation for automation and stability. Modern SDLC Practices: Familiarity with: Dependency management. Security remediation processes and secure coding practices. Testing frameworks and …
Senior Site Reliability Engineer Central London (Hybrid) Up to £100k + Car Allowance & Bonus TRIA are working with a leading hospitality client to hire a Senior SRE, where they are investing heavily in the performance, stability, and reliability of its digital platforms. This is a hands-on leadership role - you won't just guide others, you'll … Improving alerting, monitoring, and system-level metrics Driving better SLOs, SLIs, and overall uptime What you'll bring: Experience in high-traffic digital or eCommerce platforms 5+ years in SRE/DevOps roles; strong background in incident response Observability, automation, and infrastructure as code expertise Leadership skills - mentoring others or leading from the front The stack includes Kubernetes, Terraform, AWS … Python, and modern CI/CD tools, and it's evolving. If you understand what a good SRE practice looks like, and want to leave systems in a better place than you found them, please apply to be considered and learn more …
we are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability … and SOPs Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure Other duties as needed About You 5+ years' experience in Site Reliability Engineer roles Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools (Prometheus/Grafana, New Relic, Splunk, DataDog) Comprehensive experience with AWS (Amazon Web …
plan. CI/CD Pipeline Management: Manage and optimise CI/CD pipelines using tools like GitHub Actions, Travis, and other automation frameworks. Site Reliability Engineering (SRE): Perform SRE duties to ensure system availability, performance, and scalability. Application Support: Work closely with application teams to support application deployment and performance monitoring. Cloud Administration: Administer and optimise … orchestration tools. Proficiency in monitoring tools such as DataDog, Splunk, or New Relic. Strong understanding of CI/CD pipelines and automation tools. Experience with incident management and SRE best practices. Excellent problem-solving skills and the ability to work collaboratively across teams. We are looking to find individuals keen to join our scaling team - our tech has real …
are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability … and SOPs Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure Other duties as needed About You 7+ years' experience in Site Reliability Engineer roles 3+ years' experience with an object-oriented language (preferably Java, .NET or C++) Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools …