London, England, United Kingdom Hybrid / WFH Options
Gorgias
Join to apply for the Senior Site Reliability Engineer role at Gorgias. Gorgias is the conversational AI platform for ecommerce that drives sales and resolves support inquiries. Trusted by over … product recommendations. Gorgias, where every customer interaction feels personal, support becomes sales, and conversations shape success. Relocate to either: Paris, Lisbon or Belgrade. Relocation and Visa provided. About The SRE Team: We are seeking a highly skilled and experienced Senior Site Reliability Engineer (SRE) to join our team. As an SRE at Gorgias, you will play a … crucial role in ensuring the reliability, scalability, and performance of our systems, enabling the seamless delivery of our products and services. The SRE team at Gorgias maintains the core infrastructure and services that make up the heart of our product. We have the privilege to work with high-throughput systems and TB-scale data stores serving billions of queries …
We are seeking a foundational member for the Cloud Infrastructure team at Writer. This role involves contributing to the development and implementation of our Site Reliability Engineering (SRE) program. The ideal candidate will ensure the reliability, scalability, performance, and security of Writer's critical systems, proactively guaranteeing that our high-ROI products reach customers seamlessly. Your responsibilities … ensure cost efficiency. Ensure the security and compliance of our systems, adhering to industry standards and regulations. Provide mentorship and technical guidance to junior engineers, fostering a culture of reliability and continuous improvement. Stay current with emerging technologies and industry trends to improve our site reliability practices. Is this you? Proven expertise in Site Reliability … Kubernetes) and orchestration tools. Knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) for maintaining system health and performance. Ability to lead and mentor junior engineers in reliability and system optimization best practices. Excellent communication skills for effective collaboration with cross-functional teams and stakeholders. Proactive in identifying and mitigating potential system failures and performance issues. Preferred …
London, England, United Kingdom Hybrid / WFH Options
Durlston Partners
and experience; talk with your recruiter to learn more. Base pay range: $250,000.00/yr - $300,000.00/yr. Senior Site Reliability Engineer | Remote (EU/UK) | High-Performance Trading. A leading trading firm operating at scale in the digital asset space is hiring a Senior Site Reliability Engineer to help scale, secure, and optimise its global trading infrastructure. This is a remote-first role open to engineers across the UK and EU. This isn’t just another DevOps job: they’re looking for a tinkerer. Someone who enjoys getting deep into the internals of systems, thrives on debugging tough problems, and constantly looks … Industries: Capital Markets, IT Services and IT Consulting, and Financial Services.
this company would be a great experience as their employees work in a supportive and autonomous environment. If you are looking for a challenging yet rewarding role as a Site Reliability Engineer, this is the opportunity for you. What You'll Be Doing: Designing, creating, and delivering technical infrastructure code or services to improve the performance of …
Join to apply for the Senior Cloud/SRE Engineer role at LexisNexis. Site Reliability Engineering/DevOps Engineer. Are you enthusiastic about designing and managing cloud platforms? Do you find satisfaction in … area or product line. It contributes directly to project plans, schedules, and methodologies for implementing cross-functional software assets and infrastructure. Responsibilities include cloud platform design across multiple systems, SRE activities, mentoring less-experienced team members, and collaborating with users, customers, and stakeholders to translate their requirements into effective solutions. Additionally, it focuses on fostering a culture of innovation and … and orchestration tools (e.g., Docker, Kubernetes/EKS). Proficiency in scripting languages (e.g., Python, Bash, TypeScript, PowerShell). Knowledge of networking concepts and security best practices. Familiarity with SRE activities and best practices. Familiarity with DevOps practices and tools. Experience with monitoring and logging tools (e.g., DataDog, Coralogix, AWS CloudWatch, Azure Monitor). Excellent problem-solving and stakeholder management …
we can make a meaningful impact. See more about our culture on https://mistral.ai/careers. About The Job: Mistral AI is seeking an Applied AI Engineer focused on DevOps to facilitate the adoption of its products among customers and collaborate with them to address complex technical challenges. Applied AI Engineers, ML Infra at Mistral AI … in English • You hold a Bachelor's or Master's degree in Computer Science, Engineering, or a related field • You have 2+ years of experience in a DevOps or Site Reliability Engineering role • You're experienced with deploying and managing AI-based products in production environments • You are fluent in Python • You have experience with containerization technologies such … You hold strong communication skills with an ability to explain complex technical concepts in simple terms to technical and non-technical audiences Ideally you have: • Experience as a Customer Engineer, Forward Deployed Engineer, Sales Engineer, Solutions Architect, or Technical Product Manager • Familiarity with AI frameworks such as PyTorch or TensorFlow • Contributions to open-source projects, particularly in …
Site Reliability Engineer/DevOps Engineer. Are you enthusiastic about designing and managing cloud platforms? Do you find satisfaction in ensuring the reliability and performance of complex systems? About Team: The LexisNexis Intellectual Property (IP) division (https://www.lexisnexisip.com) provides international patent content and a suite of online and analytic tools that meet the evolving needs of the … RDS, Azure VMs, Azure Functions). Maintaining and improving system documentation and operational procedures. Mentoring team members and contributing to a culture of learning and inclusion. Continuously improving infrastructure reliability and reducing manual work (toil). Participating in incident response and root cause analysis. Why Join Us? Join our team and contribute to a culture of innovation, collaboration, and …
City of London, London, United Kingdom Hybrid / WFH Options
Annapurna
Site Reliability Engineer. Location: London … Hybrid (3 days WFH). Salary Range: Up to £140,000. Annapurna is working on behalf of a pioneering technology company to recruit a Site Reliability Engineer (SRE). This is a unique opportunity to play a vital role in developing cutting-edge AI systems that power autonomous vehicle technology. What to Expect: The SRE will be instrumental … in ensuring the stability, resilience, and efficiency of complex autonomous systems. This is a role for someone who thrives on innovation, loves solving infrastructure and reliability challenges, and wants to play a significant role in shaping the future of AI-driven mobility. Key responsibilities include: Ensuring smooth and continuous operation of autonomous vehicle systems in real-world environments. Developing …
Manchester, England, United Kingdom Hybrid / WFH Options
Sectigo
Site Reliability Engineer, Sectigo, Manchester, England, United Kingdom. Job Description: We are looking for a Site Reliability Engineer to join our growing global team at Sectigo. The Site Reliability Engineer will design and implement solutions to reduce toil and ensure reliability of our critical services at Sectigo. This is a full-time and remote position, with the ideal candidate located within 1-hour of vehicle commute … distance from Manchester, U.K. Here are the core functions, responsibilities, and expectations for this role: Ensure the reliability of our critical products and services by meeting or exceeding SRE objectives. Instantiate and maintain production infrastructure using Infrastructure as Code and Configuration Management tools. Build and maintain proper monitoring of our services by utilizing centralized logging and time-series databases. …
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
to gemstone supplies. They have a presence in London, Hong Kong, Amsterdam, as well as in Mumbai, and now in New York in 2001. About the role: As the SRE Manager, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and services through both direct technical contribution and team building and … tooling. Drive automation initiatives to streamline operational workflows and improve efficiency. Develop and maintain tools, scripts, and dashboards to monitor system health, performance, and reliability. Build a first-class SRE team. Through a combination of leading by example, coaching and mentoring, mould the team you would want to have around you. Provide leadership and guidance to the SRE team, fostering a … culture of collaboration, innovation, and continuous improvement. RESPONSIBILITIES: Proven experience in a senior or lead SRE role, with a strong track record of building and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with …
London, England, United Kingdom Hybrid / WFH Options
Blockchain Ventures
we share the passion to code, create, and ultimately build an open, accessible and fair financial future, one piece of software at a time. We are looking for a Site Reliability Engineer to join our Core team to encourage infrastructure best practices across our organization that would allow us to securely scale a distributed financial platform that touches … of people a day. Our distributed financial platform tackles some of the most interesting problems in crypto for millions of our customers and continues to grow rapidly. The SRE team at Blockchain combines software and systems engineering to provide a platform that abstracts complexity for increased security, reliability, and rapid product delivery. The SRE organization at Blockchain is … a rapid, secure, and scalable manner. WHAT YOU WILL DO: You will play a critical role in evolving our infrastructure as we develop solutions to complex technical problems involving reliability, latency, bandwidth, and security. You will be an integral part of improving observability, monitoring, and alerting throughout the platform. You will help coordinate work across different areas of the …
The Central Lake County Joint Action Water Agency (CLCJAWA)
Your Impact: As a contributor in the APX SRE organization, you are passionate about delivering solutions to the real-time problems our mission-critical cloud-native services encounter. You are also obsessed with achieving the high quality and reliability our customers demand. You will work closely not only with the APX SRE organization, but your technical deliverables will reach … You'll Do Location: London, England. Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. …
passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability, ensuring … and SOPs. Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure. Other duties as needed. About You: 7+ years' experience in Site Reliability Engineer roles. 3+ years' experience with an object-oriented language (preferably Java, .NET or C++). Expert+ level Linux administration, scripting, and troubleshooting. Demonstrable knowledge of Observability …
London, England, United Kingdom Hybrid / WFH Options
Blockchain.com
we share the passion to code, create, and ultimately build an open, accessible and fair financial future, one piece of software at a time. We are looking for a Site Reliability Engineer to join our Core team to encourage infrastructure best practices across our organization that would allow us to securely scale a distributed financial platform that touches … of people a day. Our distributed financial platform tackles some of the most interesting problems in crypto for millions of our customers and continues to grow rapidly. The SRE team at Blockchain combines software and systems engineering to provide a platform that abstracts complexity for increased security, reliability and rapid product delivery. The SRE organization at Blockchain is … and scalable manner. WHAT YOU WILL DO: You will be able to play a critical role in evolving our infrastructure as we develop solutions to complex technical problems involving reliability, latency, bandwidth and most importantly security. You will be an integral part of improving observability, monitoring and alerting throughout the platform. You will help co-ordinate work across different …
Out in Science, Technology, Engineering, and Mathematics
enforcement systems. Bring your leadership, technical expertise, and high bar for quality to a team that's building the foundation for fast, reliable cloud services worldwide. As a Senior Site Reliability Engineer in the Axon IaC group, your responsibilities will include contributing to architectural decisions, tool selection and guiding best practices for our IaC provisioning pipelines. You … Axon to deliver new features efficiently. You are obsessed with achieving the high performance and reliability our customers demand and reducing toil. You will work closely with both SREs and SWEs, and your technical deliverables will join forces with partner teams in building our cloud infrastructure provisioning platform and CI/CD pipelines of the future. The ideal candidate … You'll Do Location: London, England. Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. …
and drive real change. Constantly grow as you work hard for a mission that matters at a company where you matter. Your Impact: As a contributor in the APX SRE organization, you are passionate about delivering solutions to the real-time problems our mission-critical cloud-native services encounter. You are also obsessed with achieving the high quality and reliability our customers demand. You will work closely not only with the APX SRE organization, but your technical deliverables will reach the entire engineering organization to enable product teams to continuously deliver features on the vanguard of innovation. What You'll Do Location: London, England. (3-4 days onsite) Build robust, easy-to-use foundational platforms and tools that enable … engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. Influence and educate the engineering organization to adopt new and improved architectural patterns. Provide robust documentation for …
Sheffield, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Site Reliability Engineer, Sheffield, South Yorkshire. Client: Durlston Partners. Location: Sheffield, South Yorkshire, United Kingdom. Job Category: Other. EU work permit required: Yes. Posted: 31.05.2025. Expiry Date: 15.07.2025. Job Description: Senior Site Reliability Engineer | Remote (EU/UK) | High-Performance Trading. A leading trading firm operating at scale in the digital asset space is hiring a Senior Site Reliability Engineer to help scale, secure, and optimise its global trading infrastructure. This is a remote-first role open to engineers across the UK and EU. This isn’t just another DevOps …
Join to apply for the SR Site Reliability Engineer role at Wakapi. We are seeking a highly skilled Senior Site Reliability Engineer to join our Platform Engineering team. The ideal candidate will have a strong understanding of DevOps and Service Level Management (SLM) metrics, with experience in event-driven infrastructure projects using tools … ensuring observability through metrics, tracing, log aggregation, and alerting. Help teams determine settings and thresholds for alerts and automations based on application performance requirements. Monitor, optimize, and ensure system reliability and performance using tools like New Relic and applying DORA metrics. Track uptime, response times, and resolution times to ensure compliance with SLAs, SLOs, and SLIs. Implement and promote …
Chesterfield, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Senior Site Reliability Engineer | Remote (EU/UK) | High-Performance Trading. A leading trading firm operating at scale in the digital asset space is hiring a Senior Site Reliability Engineer to help scale, secure, and optimise its global trading infrastructure. This is a remote-first …
Head of Production Engineering & Site Reliability Engineering (SRE). Join to apply for the Head of Production Engineering & Site Reliability Engineering (SRE) role at SS&C Technologies. As a leading financial services and healthcare technology company based on revenue, SS&C is headquartered in Windsor, Connecticut, and has 27,000+ employees in 35 countries. Some 20,000 financial services and healthcare organizations, from the world's largest companies to small and mid-market firms, rely on SS … investor and distributor services across asset managers, insurance companies, retirement providers, and wealth management platforms. Job Overview: As the Head of Production Engineering and Site Reliability Engineering (SRE) for the GIDS organisation, you will lead a team responsible for the scalability, resilience, performance, and reliability of cloud and hybrid infrastructure powering some of the most critical client …
Site Reliability Engineer with Python. Our client is looking to bring on a site reliability engineer to help deploy, manage, troubleshoot, and enhance our complex cloud-based set of internal tools and externally managed services for a variety of users across our wide-ranging organization. You will have at least 7 to 10 years hands … on expertise working as a Site Reliability Engineer. You will work closely with IT, product, and engineering to extend and maintain this set of tools and services and to help debug and resolve problems. In addition, the ideal candidate will proactively look for system weaknesses and find ways to resolve them before they can cause production issues via …
investor and distributor services across asset managers, insurance companies, retirement providers, and wealth management platforms. Job Overview: As the Head of Production Engineering and Site Reliability Engineering (SRE) for the GIDS organisation, you will lead a team responsible for the scalability, resilience, performance, and reliability of cloud and hybrid infrastructure powering some of the most critical client … Build a modern engineering organisation with a strong culture of innovation, ownership, and reliability. Key Responsibilities: Leadership & Strategy: Define and execute the vision and roadmap for Production Engineering and SRE within GIDS. Build and lead globally distributed, high-performance teams with a focus on talent development, SRE culture, and operational excellence. Collaborate cross-functionally with Engineering, Product, Compliance, and Infrastructure … burnout in around-the-clock operations, including tooling, automation, and shift rotation planning. Qualifications Required: 10+ years of experience in engineering, with 5+ years in a leadership role in SRE, DevOps, or Production Engineering. Proven track record managing reliable, scalable systems in a high-compliance environment (e.g., FinTech, HealthTech). Strong understanding of modern software development lifecycle, CI/CD …