Site Reliability Engineer. Salary: $140k-$200k + Equity. Secret Clearance or higher is required. My client, a VC-backed organization in the defense tech space, is looking to hire multiple SREs as they build out their DevOps team across the USA. My client has created a modern product which is streamlining processes and saving time in critical … rest of the skills and experience needed for this position are listed below: Secret Clearance or higher; experience working within the DOD cloud environment; 4+ years' experience as an SRE; experience in creating CI/CD pipelines; strong knowledge of Kubernetes; experience with either Ironbank, Cloud One, or Platform One; Risk Management Framework security experience; experience working with AWS. If you …
Has anyone actually ever given you a good description of what SRE is? Recently I've met dozens of companies implementing an SRE function. Half are just rebranding an ops team (because Ops ain't cool), some don't want to call the additional silo they have created 'DevOps' (because apparently that's the wrong thing to do) so they … re calling it SRE and the rest actually don't really know how to describe what they're doing. And if you can't describe it simply, you don't know what it is, chief ("because Google do it" isn't the right answer). That was until today, when I met a company who actually whiteboarded their vision … process rather than the build. We discussed Kubernetes, Prometheus and API Gateways. Most importantly, they spoke like they knew what the hell they were on about. Not just about SRE, but about the whole Engineering process. This is a company at the top of their game, who are about to introduce a brand new monetisation model to a web …
are We are a London tech startup on the lookout for bright, motivated and self-driven individuals to join the team. Who you are You are a DevOps/Site Reliability Engineer with experience managing complex infrastructure and deploying scalable, reliable systems. You are passionate about automation, cloud technologies, and continuous improvement. Must have: Proven track record …
globe. What you'll do: As a Site Reliability Engineer at Zefr, you'll apply your expertise in cloud infrastructure, CI/CD, Observability, and core SRE concepts to deliver high-quality, reliable, and scalable solutions. A significant aspect of this role involves working closely with Zefr's Engineering and Data Science teams ensuring the infrastructure required …
European cloud revolution. We supercharge our customers to innovate in hyperscaler cloud, enabling seamless migration, advanced security, and data-driven success. Currently, we are looking for a Senior Azure Site Reliability Engineer to join our team in the UK. Your daily responsibilities: Architect, implement, and improve existing monitoring and alerting systems. Proactively investigate and identify performance anomalies …
software, platforms, and infrastructure. The Role Join us as a Site Reliability Engineer and help us build the future of data sovereignty! We're seeking an SRE passionate about creating high-performance, scalable, and reliable services for our production infrastructure. You'll have a direct impact, improving existing systems and developing innovative solutions to complex challenges. Our … implement a comprehensive observability strategy for self-hosted deployments, including infrastructure and tooling for monitoring, alerting, and troubleshooting. This will involve designing and implementing robust metrics and logging systems. Engineer the ACRA platform for high availability and fault tolerance. This includes ensuring resilience against Cloud Availability Zone outages and the ability to gracefully handle node failures. Guarantee 99.9% uptime … capacity planning, and optimization of resource utilization. Collaborate closely with the product engineering team to influence the design and implementation of new products and features, ensuring they meet our reliability and scalability standards from the outset. Preferred Qualifications Bachelor's degree (or equivalent) in Computer Science or a related field; relevant practical experience will also be considered. Proficiency with …
Hamilton Barnes is currently representing a major vehicle manufacturer that is actively seeking a Site Reliability Engineer for an initial 6-month contract with the possibility of extension. This position has on-site commitments 2/3 days per week in Gaydon. If you are interested in learning more, we encourage you to apply today! Responsibilities …
Site Reliability Engineering Manager. Remote type: Remote. Locations: Remote - United Kingdom. Time type: Full time. Posted: Yesterday. Job requisition ID: JR- Job Description As an SRE Engineering Manager, you will be expected to not only lead your team in setting priorities and ensuring alignment with organizational goals but also to be deeply technical. We expect our … details, solve problems hands-on, and support your team's technical decisions is crucial. You'll be a mentor, guide, and a partner, helping engineers grow, and ensuring the reliability and efficiency of the systems they are working on. We believe in setting a high bar for engineering managers who can lead by example in both technical expertise and …
TechOps, Quality & Systems Engineering (TQSE) team within Technology & Digital for Disney Experiences, working closely with World Wide business, Global Information Security (GIS), and application teams across the company. The Site Reliability Engineer will report to the Manager, Technology (TQSE). About the Role & Team: At Disney, storytelling is at the heart of everything we do-and in …
flexible remote working locations within UK/Europe) Employment type: Permanent Working Hours: Full time (9-6 UK) Salary: Up to £110K + Shares + Benefits TransFICC is hiring a Site Reliability Engineer to provide high-performance services to our customers. We develop an integration service … product that enables our clients to have a flexible, hosted service without requiring their internal resources to respond to connectivity challenges across trading venues. You will be joining our SRE team and contributing to TransFICC's automation culture. We are a multi-disciplinary team covering everything from desktop and laptop support to data centre provisioning of servers and vendor network … automated, so having experience with a software automation tool like Ansible and coding ability is a must. We are looking for someone experienced as a sys admin or network engineer; however, you must have a reasonable understanding of both. Constructive, open-minded and self-motivated. A belief in lifelong learning, and an awareness of how much there still is …
About the opportunity We are seeking a Site Reliability Engineer to join the Platform Engineering domain in the AI Platform team. The mission of Platform Engineering is to provide trusted, performant, self-service platforms that empower product teams to build 'the bank the world loves to use.' The AI Platform team contributes to this mission by creating …
About the opportunity We are seeking a Senior Site Reliability Engineer to join the Platform Engineering Domain in the AI Platform Team. The mission of Platform Engineering is to provide trusted, performant, self-service platforms that empower product teams to build 'the bank the world loves to use.' The AI Platform team contributes to this mission by …
Job Title: Cloud Engineer/SRE - Golang & GitHub Location: Remote - UK, London Salary/Rate: Up to £690 a day Inside IR35 Start Date: August 2025 Job Type: 12 Month Contract Company Introduction: We are seeking a highly skilled Cloud Engineer/SRE with development experience in Go and GitHub to join our client in the Global Analytical … Risk sector. We are seeking a highly skilled and motivated Cloud Engineer/SRE to join our newly formed Enterprise GitHub Operations & Tooling team. This is a foundational role where you will be instrumental in designing, building, and managing the core services and tooling that underpin our extensive use of GitHub Enterprise. You will be responsible for developing code … deploying, managing) GitHub Actions (designing complex workflows, custom actions) GitHub Enterprise, Organization and Repository settings. Operations/Infrastructure Background: Proven experience in an operations, site reliability engineering (SRE), or infrastructure engineering role, with a strong appreciation for automation and stability. Modern SDLC Practices: Familiarity with: Dependency management. Security remediation processes and secure coding practices. Testing frameworks and methodologies.
We are seeking an exceptional technology leader to oversee our global site reliability engineering (SRE), DevOps, and Platform Engineering teams. This hands-on engineering leadership role requires someone who can both provide technical vision and build strong stakeholder relationships across the organization. The ideal candidate will bring a combination of deep technical expertise, strategic thinking, and people leadership … Leadership: Serve as a hands-on technical leader who can architect, design, and guide the implementation of highly resilient systems. Build a compelling vision and strategic roadmap for our SRE, DevOps, and Platform Engineering functions. Establish and evangelize engineering best practices across teams and the wider organization. Drive technical innovation while ensuring operational excellence. Provide architectural guidance to ensure systems … initiatives, capabilities, and constraints Required Skills & Experience: Extensive experience in engineering leadership roles. Strong hands-on technical background in cloud platforms, containerization, and modern DevOps practices. Demonstrated experience leading SRE, DevOps, or Platform Engineering teams. Deep understanding of system architecture, resilience patterns, and high-availability design. Experience developing strategic roadmaps and executing technical vision. Proven ability to build and maintain …
deployments as well as accurate health monitoring through all our clients, both new and old. The person in this role will join the Site Reliability Engineering team (SRE). The main role of the SRE team is to facilitate the scalability of Dayshape and allow us to meet the demands of an increasing client base. What you'll … do Lead initiatives to enhance Dayshape's ability to scale our cloud platform. Maintain and improve our cloud estate in Azure. Improve SRE and other teams' working lives through automation of manual tasks. Lead in making the deployment of Dayshape more scalable. Increase our knowledge sharing of SRE across the organisation. Improve the observability of Dayshape through reporting and tool …
Site Reliability Engineering/DevOps Engineer Are you enthusiastic about designing and managing cloud platforms? Do you find satisfaction in ensuring the reliability and performance of complex systems? About Team: The LexisNexis Intellectual Property (IP) division provides international patent content and a suite of online and analytic tools that meet the evolving needs of the intellectual … area or product line. It contributes directly to project plans, schedules, and methodologies for implementing cross-functional software assets and infrastructure. Responsibilities include cloud platform design across multiple systems, SRE activities, mentoring less-experienced team members, and collaborating with users, customers, and stakeholders to translate their requirements into effective solutions. Additionally, it focuses on fostering a culture of innovation and … and orchestration tools (e.g., Docker, Kubernetes/EKS). Proficiency in scripting languages (e.g., Python, Bash, TypeScript, PowerShell). Knowledge of networking concepts and security best practices. Familiarity with SRE activities and best practices. Familiarity with DevOps practices and tools. Experience with monitoring and logging tools (e.g., DataDog, Coralogix, AWS CloudWatch, Azure Monitor). Excellent problem-solving and stakeholder management …
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
to gemstone supplies. They have a presence in London, Hong Kong, Amsterdam, and Mumbai, and now in New York in 2001. About the role: As the SRE Manager, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and services through both direct technical contribution and team building and … tooling. Drive automation initiatives to streamline operational workflows and improve efficiency. Develop and maintain tools, scripts, and dashboards to monitor system health, performance, and reliability. Build a first-class SRE team. Through a combination of leading by example, coaching and mentoring, mould the team you would want to have around you. Provide leadership and guidance to the SRE team, fostering a … culture of collaboration, innovation, and continuous improvement. RESPONSIBILITIES: Proven experience in a senior or lead SRE role, with a strong track record of building and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with …
developer experience to go with it. The tools used on the team include Elixir, Phoenix, Kubernetes and Google Cloud Platform. Site Reliability Engineering at Duffel As an SRE at Duffel, you'll be part of a small team within engineering that is responsible for the reliability, performance, and resilience of our infrastructure and applications. You will be … silently drop spans. - An enthusiasm for both software development and systems engineering. - A high bar for code and configuration quality and readability. - A good understanding of current observability and reliability practices. - Experienced and comfortable in running incident response. - Big picture thinking - you can make trade-offs on technical work streams against business impact. - Fantastic communication skills. You're able … We manage a data pipeline using Pub/Sub, Airbyte, and dbt. Our Current Focus We're currently driving a big shift in how we think about and monitor reliability across the engineering organisation, with a focus on early detection of customer-impacting issues. We're extending and standardising our use of OpenTelemetry, and introducing Honeycomb as the single …
impact. We value continuous learning, personal growth, and providing our team with resources to succeed. Ready to shape the future? Let's talk. We're looking for a seasoned SRE with a front-end focus, expert in React applications, to join our SRE team. In this role, you'll ensure the reliability, performance, and operability of our React-based … invalidation, HTTP caching headers) to reduce latency and origin load. Collaborate with UX teams to balance feature richness with performance targets. Collaboration & Knowledge Sharing Serve as the React/SRE subject-matter expert: mentor engineers on best practices for building resilient front-ends. Produce and maintain runbooks, debugging guides, and incident-playbooks specific to client-side failures. Partner closely with … wider backend SRE, DevOps, and product teams to ensure end-to-end reliability. Enhanced leave - 38 days inclusive of 8 UK Public Holidays. Private Health Care including family cover. Life Assurance - 5x salary. Flexible working - work from home and/or in our London Office. Employee Assistance Program. Company Pension (Salary Sacrifice options available). Access to training and development.
Are you a passionate Software Engineer looking for an exciting new challenge? Join this team and transition into maintaining and enhancing the reliability of one of the world's largest platforms. In this role, you will utilise your expertise in Golang coding to develop robust applications, ensuring the systems remain resilient, scalable, and efficient. If you thrive in … presence and commitment to innovation, you will have the opportunity to work on projects that reach millions of users, making a real difference in the tech world. As a Site Reliability Engineer, you will be responsible for designing, developing, and maintaining systems and applications using Golang. You will monitor and optimise system performance with tools such as … Grafana, Prometheus, New Relic, and Splunk. Your role will involve identifying and resolving reliability issues, automating processes, and ensuring the seamless operation of the platform. If you have a passion for technology and a drive to ensure excellence, we would love to hear from you.
MySQL, Vue.js, and AWS. Participating in an on-call roster is required as part of this role. This is a hybrid role with 2 days in the office. Senior SRE Position We are seeking a Senior SRE with experience of working with scaled SaaS production infrastructure. The successful candidate will work as part of a team focused on site reliability, security, and scalability as we manage our rapid growth. Monitoring the above environments and reacting to alerts and issues that may arise in day-to-day operation of their product line. They will participate in an on-call rota for priority-1 level health, security, stability, and uptime of production, staging, and development environments.
You'll Do Location: London, England Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems.
along the way! Job Summary We have built Curve Dental into an industry-leading provider of beautiful cloud software for the dental industry. Who We're Looking For Our Site Reliability Engineers (SREs) are passionate about automation and its power to streamline the deployment and operation of software. They collaborate closely with developers to support a wide range …
no wonder that leading organizations, like Samsung and Toyota, trust MongoDB to build next-generation, AI-powered applications. We are looking for an experienced Staff Engineer for our SRE, InfraSec team, to guide the security of our cloud-based infrastructure. As a Staff SRE, you will be very hands-on technically while also mentoring a small team of SREs. … to ensure that our infrastructure adheres to the highest security standards. They build essential security infrastructure and implement controls that reinforce the platform's security posture. This is an SRE team, which means you can expect a highly hands-on approach, tackling the technical challenges of implementing large-scale solutions. This team is deeply involved in the technical aspects of … monitoring and anomaly detection. Security Tooling: Evaluate, implement, and manage cloud-native security tools and platforms for endpoint security, identity management (IAM), and CSPM. Qualifications: Experience: 7+ years in SRE, infrastructure engineering or similar, with a strong focus on security, including 2+ years in a senior or staff engineering role. Security Mindset: Deep understanding of cloud environment security, from OS …