Site Reliability Engineer Job Vacancies

101 to 125 of 145 Site Reliability Engineer Jobs

Site Reliability Engineer - AI Platform

Berlin, Germany
N26 GmbH
About the opportunity We are seeking a Site Reliability Engineer to join the Platform Engineering domain in the AI Platform team. The mission of Platform Engineering is to provide trusted, performant, self-service platforms that empower product teams to build 'the bank the world loves to use.' The AI Platform team contributes to this mission by creating …
Employment Type: Permanent
Salary: EUR Annual

Senior Site Reliability Engineer - AI Platform

Berlin, Germany
N26 GmbH
About the opportunity We are seeking a Senior Site Reliability Engineer to join the Platform Engineering Domain in the AI Platform Team. The mission of Platform Engineering is to provide trusted, performant, self-service platforms that empower product teams to build 'the bank the world loves to use.' The AI Platform team contributes to this mission by …
Employment Type: Permanent
Salary: EUR Annual

Cloud Engineer / SRE - Go & GitHub

London, United Kingdom
Square One Resources
Job Title: Cloud Engineer/SRE - Golang & GitHub Location: Remote - UK, London Salary/Rate: Up to £690 a day Inside IR35 Start Date: August 2025 Job Type: 12 Month Contract Company Introduction: We are seeking a highly skilled Cloud Engineer/SRE with development experience in Go and GitHub to join our client in the Global Analytical … Risk sector. We are seeking a highly skilled and motivated Cloud Engineer/SRE to join our newly formed Enterprise GitHub Operations & Tooling team. This is a foundational role where you will be instrumental in designing, building, and managing the core services and tooling that underpin our extensive use of GitHub Enterprise. You will be responsible for developing code … deploying, managing) GitHub Actions (designing complex workflows, custom actions) GitHub Enterprise, Organization and Repository settings. Operations/Infrastructure Background: Proven experience in an operations, site reliability engineering (SRE), or infrastructure engineering role, with a strong appreciation for automation and stability. Modern SDLC Practices: Familiarity with: Dependency management. Security remediation processes and secure coding practices. Testing frameworks and methodologies. …
Employment Type: Contract
Rate: Up to £690 per day

Senior Site Reliability Engineer (SRE) - C13 - London

London, United Kingdom
Hybrid / WFH Options
citi.com
We are seeking an exceptional technology leader to oversee our global site reliability engineering (SRE), DevOps, and Platform Engineering teams. This hands-on engineering leadership role requires someone who can both provide technical vision and build strong stakeholder relationships across the organization. The ideal candidate will bring a combination of deep technical expertise, strategic thinking, and people leadership … Leadership: Serve as a hands-on technical leader who can architect, design, and guide the implementation of highly resilient systems Build a compelling vision and strategic roadmap for our SRE, DevOps, and Platform Engineering functions Establish and evangelize engineering best practices across teams and the wider organization Drive technical innovation while ensuring operational excellence Provide architectural guidance to ensure systems … initiatives, capabilities, and constraints Required Skills & Experience: Extensive experience in engineering leadership roles Strong hands-on technical background in cloud platforms, containerization, and modern DevOps practices Demonstrated experience leading SRE, DevOps, or Platform Engineering teams Deep understanding of system architecture, resilience patterns, and high-availability design Experience developing strategic roadmaps and executing technical vision Proven ability to build and maintain …
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer (UK)

Edinburgh, United Kingdom
Devopshunt
deployments as well as accurate health monitoring through all our clients, both new and old. The person in this role will join the Site Reliability Engineering team (SRE). The main role of the SRE team is to facilitate the scalability of Dayshape and allow us to meet the demands of an increasing client base. What you'll … do Lead initiatives to enhance Dayshape's ability to scale our cloud platform Maintain and improve our cloud estate in Azure Improve SRE and other teams' working lives through automation of manual tasks Lead in making the deployment of Dayshape more scalable Increase our knowledge sharing of SRE across the organisation Improve the observability of Dayshape through reporting and tool …
Employment Type: Permanent
Salary: GBP Annual

Manager, Cloud Site Reliability Engineering

United Kingdom
Barracuda Networks
cloud environments Reliability Engineering: Lead initiatives to improve system reliability, establish SLOs, and implement monitoring and alerting strategies Team Leadership: Build, mentor, and grow a high-performing SRE team while fostering a culture of innovation and continuous improvement Incident Management: Establish and optimize incident response processes, lead major incident reviews, and drive systematic improvements Automation Development: Spearhead automation … operations and improve system reliability Performance Optimization: Lead projects to optimize system performance, capacity planning, and cost efficiency Cross-team Collaboration: Work closely with development teams to implement SRE best practices and drive operational excellence Technical Strategy: Develop and execute technical roadmaps aligned with business goals and scaling requirements Security Integration: Ensure security best practices are embedded in infrastructure … service providers Operational Excellence: Drive continuous improvement in operational processes, tooling, and methodologies What you bring to the role: Technical Leadership Experience: 5+ years of experience leading and managing SRE/DevOps teams, with a proven track record of improving system reliability and performance Architectural Vision: Deep understanding of distributed systems, cloud platforms (AWS/GCP/Azure), and …
Employment Type: Permanent
Salary: GBP Annual

Senior Cloud / SRE Engineer

London, UK
Disability Solutions
Site Reliability Engineering/DevOps Engineer Are you enthusiastic about designing and managing cloud platforms? Do you find satisfaction in ensuring the reliability and performance of complex systems? About Team: The LexisNexis Intellectual Property (IP) division provides international patent content and a suite of online and analytic tools that meet the evolving needs of the intellectual … area or product line. It contributes directly to project plans, schedules, and methodologies for implementing cross-functional software assets and infrastructure. Responsibilities include cloud platform design across multiple systems, SRE activities, mentoring less-experienced team members, and collaborating with users, customers, and stakeholders to translate their requirements into effective solutions. Additionally, it focuses on fostering a culture of innovation and … and orchestration tools (e.g., Docker, Kubernetes/EKS). Proficiency in scripting languages (e.g., Python, Bash, TypeScript, PowerShell). Knowledge of networking concepts and security best practices. Familiarity with SRE activities and best practices. Familiarity with DevOps practices and tools. Experience with monitoring and logging tools (e.g., DataDog, Coralogix, AWS CloudWatch, Azure Monitor). Excellent problem-solving and stakeholder management …

Remote Senior Site Reliability Engineer Manager (Remote)

Cambourne, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Remotestar
to gemstone supplies. They have a presence in London, Hong Kong, Amsterdam, as well as in Mumbai and now in New York in 2001. About the role: As the SRE Manager, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and services through both direct technical contribution along with team building and … tooling. Drive automation initiatives to streamline operational workflows and improve efficiency. Develop and maintain tools, scripts, and dashboards to monitor system health, performance, and reliability. Build a first-class SRE team. Through a combination of leading by example, coaching and mentoring, mould the team you would want to have around you. Provide leadership and guidance to the SRE team, fostering a … culture of collaboration, innovation, and continuous improvement. RESPONSIBILITIES: Proven experience in a senior or lead SRE role, with a strong track record of building and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Duffel
developer experience to go with it. The tools used on the team include Elixir, Phoenix, Kubernetes and Google Cloud Platform. Site Reliability Engineering at Duffel As an SRE at Duffel, you'll be part of a small team within engineering that is responsible for the reliability, performance, and resilience of our infrastructure and applications. You will be … silently drop spans. - An enthusiasm for both software development and systems engineering. - A high bar for code and configuration quality and readability. - A good understanding of current observability and reliability practices. - Experienced and comfortable in running incident response. - Big picture thinking - you can make trade-offs on technical work streams against business impact. - Fantastic communication skills. You're able … We manage a data pipeline using Pub/Sub, Airbyte, and dbt. Our Current Focus We're currently driving a big shift in how we think about and monitor reliability across the engineering organisation, with a focus on early detection of customer-impacting issues. We're extending and standardising our use of OpenTelemetry, and introducing Honeycomb as the single …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer (SRE) - Front-end/React Specialist

London, United Kingdom
Hybrid / WFH Options
ZILO
impact. We value continuous learning, personal growth, and providing our team with resources to succeed. Ready to shape the future? Let's talk. We're looking for a seasoned SRE with a front-end focus, expert in React applications, to join our SRE team. In this role, you'll ensure the reliability, performance, and operability of our React-based … invalidation, HTTP caching headers) to reduce latency and origin load. Collaborate with UX teams to balance feature richness with performance targets. Collaboration & Knowledge Sharing Serve as the React/SRE subject-matter expert: mentor engineers on best practices for building resilient front-ends. Produce and maintain runbooks, debugging guides, and incident-playbooks specific to client-side failures. Partner closely with … wider backend SRE, DevOps, and product teams to ensure end-to-end reliability. Enhanced leave - 38 days inclusive of 8 UK Public Holidays. Private Health Care including family cover. Life Assurance - 5x salary. Flexible working - work from home and/or in our London Office. Employee Assistance Program. Company Pension (Salary Sacrifice options available). Access to training and development. …
Employment Type: Permanent
Salary: GBP Annual

Principal Site Reliability Engineer

Wokingham, Berkshire, South East, United Kingdom
LA International Computer Consultants Ltd
Our client is looking for a number of Principal Site Reliability Engineers to join their team on an initial six-month contract, working a couple of days a week onsite in Wokingham and the rest remotely. This role is Inside IR35 and requires a candidate with an active SC clearance. Key Responsibilities Lead and drive platform-first initiatives to … improve scalability, reliability, and performance. Design, build, and maintain resilient infrastructure supporting distributed systems. Implement monitoring and alerting systems to ensure high availability and performance. Collaborate with engineering teams to enhance system reliability and mitigate risks. Develop and maintain CI/CD pipelines for seamless deployment and release management. Continuously evaluate and recommend improvements to platform infrastructure and … to appointment which can take a minimum of 10 weeks. LA International is an HMG-approved ICT Recruitment and Project Solutions Consultancy, operating globally from the largest single site in the UK as an IT Consultancy or as an Employment Business & Agency depending upon the precise nature of the work, for security cleared jobs or non-clearance vacancies …
Employment Type: Contract
Rate: £550 - £580 per day

Software Engineer (SRE)

London, United Kingdom
LinuxRecruit
Are you a passionate Software Engineer looking for an exciting new challenge? Join this team and transition into maintaining and enhancing the reliability of one of the world's largest platforms. In this role, you will utilise your expertise in Golang coding to develop robust applications, ensuring the systems remain resilient, scalable, and efficient. If you thrive in … presence and commitment to innovation, you will have the opportunity to work on projects that reach millions of users, making a real difference in the tech world. As a Site Reliability Engineer, you will be responsible for designing, developing, and maintaining systems and applications using Golang. You will monitor and optimise system performance with tools such as … Grafana, Prometheus, New Relic, and Splunk. Your role will involve identifying and resolving reliability issues, automating processes, and ensuring the seamless operation of the platform. If you have a passion for technology and a drive to ensure excellence, we would love to hear from you …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

Wokingham, Berkshire, South East, United Kingdom
LA International Computer Consultants Ltd
Our client is looking for a number of hands-on Site Reliability Engineers to join their team on an initial six-month contract, working a couple of days a week onsite in Wokingham and the rest remotely. This role is Inside IR35 and requires a candidate with an active SC clearance. Key Responsibilities Detect and mitigate system issues to … to appointment which can take a minimum of 10 weeks. LA International is an HMG-approved ICT Recruitment and Project Solutions Consultancy, operating globally from the largest single site in the UK as an IT Consultancy or as an Employment Business & Agency depending upon the precise nature of the work, for security cleared jobs or non-clearance vacancies …
Employment Type: Contract
Rate: £390 - £400 per day

Senior Site Reliability Engineer in Milton Keynes - Xtremepush

Milton Keynes, Buckinghamshire, United Kingdom
Java Script Works
MySQL, Vue.js, and AWS. Participating in an on-call roster is required as part of this role. This is a hybrid role with 2 days in the office. Senior SRE Position We are seeking a Senior SRE with experience of working with scaled SaaS production infrastructure. The successful candidate will work as part of a team focused on site reliability, security, and scalability as we manage our rapid growth. Monitoring the above environments and reacting to alerts and issues that may arise in day-to-day operation of their product line. They will participate in an on-call rota for priority-1 level health, security, stability, and uptime of production, staging, and development environments. …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer II

London, United Kingdom
Axon Enterprise
You'll Do Location: London, England Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

Scotland, United Kingdom
Curve Dental, LLC
along the way! Job Summary We have built Curve Dental into an industry-leading provider of beautiful cloud software for the dental industry. Who We're Looking For Our Site Reliability Engineers (SREs) are passionate about automation and its power to streamline the deployment and operation of software. They collaborate closely with developers to support a wide range …
Employment Type: Permanent
Salary: GBP Annual

Staff Site Reliability Engineer, Infrastructure Security Denmark; France; Germany; Ireland; Lon ...

London, United Kingdom
MongoDB
no wonder that leading organizations, like Samsung and Toyota, trust MongoDB to build next-generation, AI-powered applications. We are looking for an experienced Staff Engineer for our SRE, InfraSec team, to guide the security of our cloud-based infrastructure. As a Staff SRE, you will be very hands-on technically while also mentoring a small team of SREs. … to ensure that our infrastructure adheres to the highest security standards. They build essential security infrastructure and implement controls that reinforce the platform's security posture. This is an SRE team, which means you can expect a highly hands-on approach, tackling the technical challenges of implementing large scale solutions. This team is deeply involved in the technical aspects of … monitoring and anomaly detection. Security Tooling: Evaluate, implement, and manage cloud-native security tools and platforms for endpoint security, identity management (IAM), and CSPM. Qualifications: Experience: 7+ years in SRE, infrastructure engineering or similar, with a strong focus on security, including 2+ years in a senior or staff engineering role. Security Mindset: Deep understanding of cloud environment security, from OS …
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer/SRE - Application Middleware

London, United Kingdom
Avature
Senior Software Engineer/SRE - Application Middleware Location London Business Area Engineering and CTO Ref # Description & Requirements Are you passionate about building high-performance systems that are fast, resilient, and operate at global scale? Join Bloomberg's Application Middleware SRE team, where you'll combine software engineering and systems expertise to keep the backbone of the Bloomberg Terminal … running smoothly for hundreds of thousands of users around the world. We're not your typical SRE team. We're embedded in a group that powers real-time connectivity, and we own systems where uptime isn't just important - it's essential to the global financial system. This is your opportunity to engineer resilience at scale, automate critical infrastructure … and shape reliability practices across one of the world's most powerful tech platforms. The Team We're the Site Reliability Engineering team within Bloomberg's Application Middleware group. Our mission: ensure that Bloomberg's core connectivity and messaging layers are resilient, scalable, and fully observable. We own systems that operate at high throughput and low latency …
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer/SRE - Application Middleware

London, United Kingdom
Bloomberg L.P
Senior Software Engineer/SRE - Application Middleware Location London Business Area Engineering and CTO Ref # Description & Requirements Are you passionate about building high-performance systems that are fast, resilient, and operate at global scale? Join Bloomberg's Application Middleware SRE team, where you'll combine software engineering and systems expertise to keep the backbone of the Bloomberg Terminal … running smoothly for hundreds of thousands of users around the world. We're not your typical SRE team. We're embedded in a group that powers real-time connectivity, and we own systems where uptime isn't just important - it's essential to the global financial system. This is your opportunity to engineer resilience at scale, automate critical infrastructure … and shape reliability practices across one of the world's most powerful tech platforms. The Team We're the Site Reliability Engineering team within Bloomberg's Application Middleware group. Our mission: ensure that Bloomberg's core connectivity and messaging layers are resilient, scalable, and fully observable. We own systems that operate at high throughput and low latency …
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer - Networking

United Kingdom
Hybrid / WFH Options
Lambda Inc
top AI computing platform. We equip engineers with the tools to deploy AI that is fast, secure, affordable, and built to scale. Whether they need powerhouse GPU hardware on-site or the flexibility of cloud-based solutions, we've got the horsepower to make it happen. Lambda's AI Cloud has been adopted by the world's leading companies … performance through the use of network engineering and other applicable technologies Help with deploying and maintaining network monitoring and management tools You Have 5+ years of experience being SWE, SRE or Network Reliability Engineering Been part of the implementation of production-scale networking projects Experience being on-call and incident response management Have experience building and maintaining Software Defined …
Employment Type: Permanent
Salary: GBP Annual

Azure Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Experian Group
We are seeking a skilled Azure Cloud DevOps Engineer to join our team. The ideal candidate will have a strong background in DevOps practices, cloud solutions, and network engineering in Microsoft Azure. This role involves maintaining and developing a cloud environment that hosts mission critical financial services applications used across Australia and New Zealand. This role is pivotal for … in Computer Science, Information Technology, or a related field. At least one of the below certifications: Microsoft Certified: Azure Administrator Associate Microsoft Certified: Azure Developer Associate Microsoft Certified: DevOps Engineer Expert Microsoft Certified: Azure Network Engineer Associate Cisco Certified Network Associate (CCNA) Additional Information What We Offer Hybrid work model 20 days of annual leave Comprehensive medical and … countries, FORTUNE Best Companies to work and Glassdoor Best Places to Work (globally 4.4 Stars) to name a few. Check out Experian Life on social or our Careers Site to understand why. Experian is proud to be an Equal Opportunity and Affirmative Action employer. Innovation is a critical part of Experian's DNA and practices, and our diverse workforce …
Employment Type: Permanent
Salary: GBP Annual

Site reliability engineer

London, United Kingdom
Quorso UK Limited
The role As a Site Reliability Engineer you will focus on improving stability and security aspects of the technical stack of Quorso by: Owning monitoring and logging integrations, as well as alerting capabilities by improving and automating currently manual processes; Identifying and logging discovered performance and security related issues; Working on remediation for the discovered issues related to backend and infrastructural layers, as well as … Stores technology simplifies retailers' data into daily Next Best Actions ("Missions") for every store, guaranteed to engage teams and drive sales. We're an Enterprise platform, targeting large multi-site retailers. We're growing fast with some of the largest retailers in the world already using Quorso to react faster and become more Agile in the face of a … our investors include CEOs and Chairpersons of a number of the 100 largest companies in the world. Requirements Experience of working in the role or as backend/DevOps engineer for at least 4 years on projects using Ruby, SQL and Kubernetes; Ability to quickly learn platform stack and staying up to date with ongoing development; Experience in proactive implementation …
Employment Type: Permanent
Salary: GBP Annual

Applied AI Engineer, Senior/Staff DevOps/SRE - EMEA

London, United Kingdom
Mistral AI
shaping the future of AI. Together, we can make a meaningful impact. About The Job Mistral AI is seeking an Applied AI Engineer focused on DevOps to facilitate the adoption of its products among customers and collaborate with them to address complex technical challenges. Applied AI Engineers, ML Infra at Mistral AI … in English • You hold a Bachelor's or Master's degree in Computer Science, Engineering, or a related field • You have 2+ years of experience in a DevOps or Site Reliability Engineering role • You're experienced with deploying and managing AI-based products in production environments • You are fluent in Python • You have experience with containerization technologies such … You hold strong communication skills with an ability to explain complex technical concepts in simple terms to technical and non-technical audiences Ideally you have: • Experience as a Customer Engineer, Forward Deployed Engineer, Sales Engineer, Solutions Architect, or Technical Product Manager • Familiarity with AI frameworks such as PyTorch or TensorFlow • Contributions to open-source projects, particularly in …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

Wokingham, England, United Kingdom
Hybrid / WFH Options
eTeam
We are a Global Recruitment specialist that provides support to clients across EMEA, APAC, US and Canada. We have an excellent job opportunity for you. Role Title: Principal SRE Location: Wokingham (Reading). Hybrid, 60% remote and 40% onsite Duration: Until 30/01/2026 Rate: £580 per day Inside IR35 through an Umbrella Company Contractor Must … Hold Active SC Clearance Role Description: Key Responsibilities: Lead and drive platform-first initiatives to improve scalability, reliability, and performance. Design, build, and maintain resilient infrastructure supporting distributed systems. Implement monitoring and alerting systems to ensure high availability and performance. Collaborate with engineering teams to enhance system reliability and mitigate risks. Develop and maintain CI/CD pipelines …

Site Reliability Engineer
10th Percentile: £52,500
25th Percentile: £63,630
Median: £70,000
75th Percentile: £85,000
90th Percentile: £99,500