approval Helping others, and asking for help Self-critical of all our operations, consistently striving to improve our service and processes Improve processes through rationalization and automation Increase service reliability through identification and elimination of single points of failure, of process shortcomings, develop innovative process improvements, reducing workload and risk What will make you Successful: Linux administration (RedHat
colleagues, all with a common goal to deliver an exceptional customer experience every day." We're looking for Site Reliability/Application Support Engineers (SRE/AS) responsible for Digital Payments application performance, availability, and reliability. The candidate is responsible for providing consultation and strategic recommendations by quickly assessing and remediating complex platform availability issues. Site Reliability Engineering (SRE) is a continuous engineering discipline that effectively combines software development and systems engineering to build and run scalable, distributed, fault-tolerant systems. This role will ensure that American Express internal and external services have reliability and uptime appropriate to users' needs. We also ensure continuous improvement, while keeping an ever-watchful eye … automated, on capacity and performance. This role will drive the SRE/AS mindset, which strives to use software engineering to build and run better production systems. You will write software to optimize day-to-day work through better automation, monitoring, alerting, testing, and deployment. You'll be expected to work with several Technology partners to identify areas of
strong background in DevOps design and transformation, cloud-native engineering, and modern DevOps tooling. The ideal candidate will also bring expertise in Site Reliability Engineering (SRE) principles and practices, with a focus on building scalable, reliable, and resilient systems. Key Responsibilities: • Architect and implement scalable, secure, and high-performance DevOps solutions. • Lead DevOps transformation initiatives across … enterprise environments. • Design and implement cloud-native solutions on Azure, AWS, or GCP. • Apply SRE principles to ensure system reliability, availability, and performance. • Build and maintain CI/CD pipelines and infrastructure as code (IaC). • Evaluate and integrate modern DevOps tools and practices. • Collaborate with cross-functional teams to align DevOps and SRE strategies with business goals. • Mentor … and lead DevOps teams, fostering a culture of innovation and continuous improvement. • Leverage AI and machine learning to optimize DevOps and SRE processes. • Ensure compliance, security, and operational excellence in all DevOps practices. Required Qualifications: • 15+ years of experience in IT, with a strong focus on DevOps and cloud architecture. • Proven experience in DevOps design and transformation across multiple projects.
the latest DevOps trends, tools, and standard processes to drive innovation. What You'll Bring: - Proven experience in **DevOps, Cloud Engineering, or Site Reliability Engineering (SRE)**. - Deep expertise in Azure and AWS cloud services. - Strong proficiency in Python for scripting and automation. - Hands-on experience with GitLab CI/CD and modern DevOps pipelines. - In … forward-thinking team. - Work with the latest technologies in cloud computing and DevOps. - Enjoy opportunities for growth, learning, and professional development. - Make a meaningful impact on our infrastructure and engineering processes. If you're passionate about DevOps, automation, and cloud technologies, we'd love to hear from you! LSEG is a leading global financial markets infrastructure and data provider.
London, England, United Kingdom Hybrid / WFH Options
Aker Systems
diversity because they help challenge us and find new groundbreaking technical solutions. Seniority level: Mid-Senior level. Employment type: Full-time. Job function: Engineering and Information Technology. Industries: Software Development.
Leeds, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
and grow out a small team. This permanent opportunity sits within a high-performing technology function, driving excellence across DevOps. You'll influence technical direction, modernise infrastructure, and mentor engineering talent in a fast-paced, agile environment. Lead DevOps Engineer Responsibilities Lead the design of scalable CI/CD pipelines Drive platform modernisation Mentor and lead a small team … of engineers Align DevOps capabilities with the wider business Champion DevEx, reliability, and security Embed operational excellence and incident response Promote observability and performance optimisation Lead DevOps Engineer Requirements Proven technical … and some leader/mentoring experience Cloud-native expertise (any cloud provider is fine: GCP, AWS or Azure) Knowledge of GitLab CI/CD, Terraform, Ansible Experience in Kubernetes, Docker, SRE and IaC principles Monitoring with Prometheus, Grafana etc Any scripting experience will be a bonus What's in it for me? Competitive salary up to £90k + bonus Hybrid working
Bradford, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
and grow out a small team. This permanent opportunity sits within a high-performing technology function, driving excellence across DevOps. You'll influence technical direction, modernise infrastructure, and mentor engineering talent in a fast-paced, agile environment. Lead the design of scalable CI/CD pipelines Drive platform modernisation Mentor and lead a small team of engineers Align DevOps … capabilities with the wider business Champion DevEx, reliability, and security Embed operational excellence and incident response Promote observability and performance optimisation Proven technical and … some leader/mentoring experience Cloud-native expertise (any cloud provider is fine: GCP, AWS or Azure) Knowledge of GitLab CI/CD, Terraform, Ansible Experience in Kubernetes, Docker, SRE and IaC principles Monitoring with Prometheus, Grafana etc Any scripting experience will be a bonus What's in it for me? Competitive salary up to £90k + bonus Hybrid working
London, England, United Kingdom Hybrid / WFH Options
Keyrock
Infrastructure as Code (IaC): Use Terraform, Ansible, or similar tools for automation. Collaboration & Knowledge Sharing: Work closely with development, security, and operations teams to promote DevOps culture. Disaster Recovery & Reliability Engineering: Design failover and backup strategies to ensure business continuity. Background and Experience Bachelor’s degree in Computer Science, Engineering, or related field. 5+ years in cloud infrastructure, SRE, or DevOps roles. Interest or exposure to trading or similar themes is desirable but not essential. Competencies and Personality Strong expertise in AWS (EC2, S3, Lambda, RDS, VPC, IAM).
Southampton, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
cause analysis, and implement observability best practices (metrics, logging, tracing). Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). Lead incident response, system reliability efforts, and infrastructure scalability initiatives. Manage messaging queues (e.g., Kafka, RabbitMQ) and optimize for low-latency event handling and throughput. Contribute to evolving our security posture, including secrets management … controls, and audit logging. Qualifications: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience. 5+ years of professional experience in a DevOps, SRE, or Cloud Infrastructure role. Strong proficiency with AWS, Linux, and containerized environments (Docker, Kubernetes). Deep understanding of CI/CD best practices and hands-on experience with tools like
Newcastle upon Tyne, England, United Kingdom Hybrid / WFH Options
Partnerize
ones along the way. As a Senior Linux/SysAdmin Engineer at Partnerize, you will: Deliver coaching sessions to the team/individuals Scope the work coming into the SRE team and delegate it to the team members appropriately. Provide primary operational support and engineering for multiple large, distributed software applications Build software and systems to manage platform infrastructure and … applications Measure and optimise system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve Improve reliability, quality, and time-to-market of our suite of software solutions Drive improvements to Partnerize platform security through strategic planning and collaboration with the security compliance team. Produce production-grade application security designs.
Posted by Caspian One (DevOps, SRE, and Data Engineering | Supporting FinTech, Healthcare, and Broadcast across Cloud, Data & GenAI). We're looking for an experienced Linux Engineer to help manage and optimise global server environments supporting … log aggregation tools to improve observability Collaborate with teams to identify bottlenecks and deploy scalable, automated solutions What We're Looking For: 6+ years of Linux system administration and engineering experience in performance-critical environments Proficiency in Python and Bash scripting, with hands-on Ansible experience Familiarity with observability tools like Prometheus, Grafana, and ELK Infrastructure-as-code experience …
architectures, playbooks, and standard operating procedures. Collaborate cross-functionally with engineering and research teams. Experience General Programming Debugging Skill (irrespective of Programming Languages). 3+ years in DevOps, SRE, or system infrastructure roles. Strong experience with Kubernetes and container orchestration. Familiarity with: Grafana, Prometheus, ArgoCD Knowledge of infrastructure-as-code tools: Terraform, Ansible, or Helm. Knowledge of virtualisation and
/ad hoc duties as required to meet the needs of the business. Experience/Competences Essential Deep and broad experience of AWS Cloud platform and services DevOps and SRE principles Very good working knowledge of incorporating testing into CI/CD pipelines Understanding of various deployment patterns such as blue-green and canary Platforms: Windows Server, Amazon Linux, RHEL
Senior Site Reliability Engineer Job role: SRE Engineer, Senior SRE Engineer, Senior Site Reliability Engineer Salary: £70,000 - £90,000 Location: UK Based, Fully Remote with office travel once a month Company: Market-leading SaaS team with LOTS of growth opportunities If you’re an SRE who thrives in a fast-moving environment, loves solving real … uptime, this one’s for you. We're supporting a tech-driven, purpose-led company in the energy sector that's scaling up its platform capabilities. As their next Site Reliability Engineer, you'll work closely with engineering and platform teams to ensure systems are performant, resilient, and scalable while shaping the observability and incident response strategy. … CloudFormation CI/CD pipelines + scripting (Python, Bash, PowerShell) Containerized applications (Docker + ECS) Observability tooling like New Relic, CloudWatch, Prometheus, Datadog Who we’re looking for: Proven SRE or platform engineering experience in a high-availability environment Passion for reliability, automation, and system performance Strong problem-solving mindset and solid communication skills Bonus: Software engineering
London, England, United Kingdom Hybrid / WFH Options
Your Next Hire
Senior Site Reliability Engineer Job role: SRE Engineer, Senior SRE Engineer, Senior Site Reliability Engineer Salary: £70,000 - £90,000 Location: UK Based, Fully Remote with office travel once a month Company: Market-leading SaaS team with LOTS of growth opportunities If you’re an SRE who thrives in a fast-moving environment, loves solving real … uptime, this one’s for you. We're supporting a tech-driven, purpose-led company in the energy sector that's scaling up its platform capabilities. As their next Site Reliability Engineer, you'll work closely with engineering and platform teams to ensure systems are performant, resilient, and scalable while shaping the observability and incident response strategy. … CloudFormation CI/CD pipelines + scripting (Python, Bash, PowerShell) Containerized applications (Docker + ECS) Observability tooling like New Relic, CloudWatch, Prometheus, Datadog Who we’re looking for: Proven SRE or platform engineering experience in a high-availability environment Passion for reliability, automation, and system performance Strong problem-solving mindset and solid communication skills Bonus: Software engineering
Job Description Job Title: Senior Site Reliability Engineer (SRE) Location: London, UK – Onsite (5 days/week) Employment Type: Permanent Salary: Up to £80,000 per annum (Gross) About the Role: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our London-based team. This role is ideal for someone passionate … about service reliability, scalability, and performance. As an SRE, you will collaborate with development and operations teams to automate infrastructure, enhance observability, and reduce manual processes (toil) to improve overall system health. Key Responsibilities: Design, build, and maintain scalable, resilient systems and services. Automate routine tasks and eliminate manual effort using scripting and infrastructure-as-code. Collaborate with development … CI/CD frameworks. Qualifications: Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience). 8+ years of relevant experience in SRE, DevOps, or Infrastructure Engineering roles.
Your Impact As a contributor in the APX SRE organization, you are passionate about delivering solutions to the real-time problems our mission-critical cloud native services encounter. You are also obsessed with achieving the high quality and reliability our customers demand. You will not only work closely with the APX SRE organization; your technical deliverables will also reach … the entire engineering organization to enable product teams to continuously deliver features on the vanguard of innovation. What You'll Do Location: London, England. Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant … maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. Influence and educate the engineering organization to adopt new and improved architectural patterns. Provide robust documentation for use by engineers to promote self-service. Take calculated risks, champion new ideas, and cultivate your craft. What You Bring Ability to
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
to gemstone supplies. They have a presence in London, Hong Kong, Amsterdam, and Mumbai, and now in New York in 2001. About the role: As the SRE Manager, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and services through both direct technical contribution along with team building and … tooling. Drive automation initiatives to streamline operational workflows and improve efficiency. Develop and maintain tools, scripts, and dashboards to monitor system health, performance, and reliability. Build a first-class SRE team. Through a combination of leading by example, coaching and mentoring, mould the team you would want to have around you. Provide leadership and guidance to the SRE team, fostering a … culture of collaboration, innovation, and continuous improvement. RESPONSIBILITIES: Proven experience in a senior or lead SRE role, with a strong track record of building and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with
Reigate, England, United Kingdom Hybrid / WFH Options
Willis Towers Watson
Summary: We are seeking a Site Reliability Engineer to join our SRE team based in Reigate. The ideal candidate will have excellent communication skills, experience working with multiple stakeholders, and a track record in Azure and Observability platforms. You will be joining Insurance Consulting and Technology (ICT) at an exciting time of transformation as we work on improving … multiple greenfield workstreams in the delivery family to deliver core foundational functionality that will be used by multiple SaaS product offerings across the business. You will be working with other Site Reliability and Response teams as well as with the core Applications Teams, whose responsibility is to deliver and manage business critical services that are used 24×7 by … open to flexible and hybrid working arrangements, with presence in the Reigate office up to two days per week. The Role: Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing services Maintain and configure observability platforms such as Datadog Proactive monitoring of production and other environments to ensure stability, availability, security and
London, England, United Kingdom Hybrid / WFH Options
Blockchain Ventures
we share the passion to code, create, and ultimately build an open, accessible and fair financial future, one piece of software at a time. We are looking for a Site Reliability Engineer to join our Core team to encourage infrastructure best practices across our organization that would allow us to securely scale a distributed financial platform that touches millions … of people a day. Our distributed financial platform tackles some of the most interesting problems in crypto for millions of our customers and continues to grow rapidly. The SRE team at Blockchain combines software and systems engineering to provide a platform that abstracts complexity for increased security, reliability, and rapid product delivery. The SRE organization at Blockchain … a member of the Core team, you will be tasked with developing an in-depth understanding of the infrastructure needs of our products. You will establish and maintain creative engineering solutions to improve our customers’ experience by building necessary tooling. Crucially, you will also guide and educate developer teams so that they can deliver new features in a rapid
We are seeking a seasoned Principal Engineer to lead the design, development, and evolution of our Observability Platform, ensuring it meets the needs of our rapidly scaling systems and engineering teams. This role will also focus on leveraging Machine Learning (ML) and Artificial Intelligence (AI) to deliver advanced insights that proactively improve system health and drive down Mean Time … scale. Integrate ML/AI-driven solutions to enhance anomaly detection, root cause analysis, and predictive insights. Lead the development and adoption of platform capabilities to ensure system health, reliability, and performance. Establish and evolve platform standards and best practices to align with the company's overall engineering goals. Strategic Initiatives Collaborate with engineering teams to define … and Collaboration Act as a mentor and technical leader for engineers, fostering a culture of learning, innovation, and excellence. Collaborate with stakeholders, including Site Reliability Engineering (SRE), infrastructure, and application teams, to gather requirements and deliver impactful solutions. Advocate for observability as a critical enabler of operational success across the organization. What will you bring to the
London, England, United Kingdom Hybrid / WFH Options
Blockchain.com
we share the passion to code, create, and ultimately build an open, accessible and fair financial future, one piece of software at a time. We are looking for a Site Reliability Engineer to join our Core team to encourage infrastructure best practices across our organization that would allow us to securely scale a distributed financial platform that touches millions … of people a day. Our distributed financial platform tackles some of the most interesting problems in crypto for millions of our customers and continues to grow rapidly. The SRE team at Blockchain combines software and systems engineering to provide a platform that abstracts complexity for increased security, reliability and rapid product delivery. The SRE organization at Blockchain … a member of the Core team you will be tasked with developing an in-depth understanding of the infrastructure needs of our products. You will establish and maintain creative engineering solutions to improve our customers’ experience by building necessary tooling. Crucially, you will also guide and educate developer teams so that they can deliver new features in a rapid, secure and
London, England, United Kingdom Hybrid / WFH Options
Durlston Partners
Base pay range: $250,000.00/yr - $300,000.00/yr. Senior Site Reliability Engineer | Remote (EU/UK) | High-Performance Trading. A leading trading firm operating at scale in the digital asset space is hiring a Senior Site Reliability … the stack. For more information, please get in touch at james@durlstonpartners.com. Seniority level: Mid-Senior level. Employment type: Full-time. Job function: Engineering and Information Technology. Industries: Capital Markets, IT Services and IT Consulting, and Financial Services.
Manchester, England, United Kingdom Hybrid / WFH Options
Couchbase
As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with … to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission. Role Overview At Couchbase, Site Reliability Engineers are hybrid software and systems engineers. They are the glue holding things together, whether that’s infrastructure/platform, tooling support for our cloud business or … managing Observability posture for Couchbase. This role is for the Observability team, which is responsible for maintaining Reliability, Availability and Serviceability for the entire Couchbase cloud offering. You will be working as a Software Engineer developing and maintaining the Couchbase monitoring stack, which includes the metrics pipeline, alerting, notifications and the like. You will
London, England, United Kingdom Hybrid / WFH Options
Auros
At Auros, we’re dedicated to advancing the cryptocurrency ecosystem through unparalleled liquidity and market-making services. We’re one of the largest participants in the market, trading across 10+ global locations … within this dynamic environment. You’ll be responsible for: Participate in on-call roster to support our trading operations. Maintain and improve our global infrastructure with high performance and reliability requirements. Improve and update the security infrastructure of a widely distributed company that operates in a high-risk environment. Engage and collaborate with other teams around system layout, rollout … performance or reliability. Active participation in various trading and infrastructure projects. Work closely with developers, traders and other staff to accomplish our firm’s goals. Who you are An SRE/DevOps professional with experience managing and optimising Linux systems in a high-performance 24 x 7 environment. Cloud management using IaC, with experience in AWS, Azure or Google Cloud.