Chaos Engineering Jobs in London

11 of 11 Chaos Engineering Jobs in London

Global IT Quality Engineer Senior Director & CoE Lead

London, United Kingdom
Boston Consulting Group
of our DNA. To meet the needs of BCG's global, mobile, fast growing and increasingly diverse business, we are looking for a Global IT Senior Director for Quality Engineering role to lead and expand our central QA Center of Excellence (CoE) into an end-to-end QA Team. To execute this transformation, we need people who can translate … and expertise development for Quality Assurance and Performance Engineering. Among your responsibilities, you will: Lead End-to-End Quality Assurance: Lead the development and expansion of a centralized Quality Engineering (QE) Centre of Excellence (CoE), ensuring that quality and performance standards are maintained across all platforms and products, including end-user environments. Implement best practices in quality metrics, reviews, and … end-to-end testing and manage structured QA cycles for security updates, patches, and system upgrades, ensuring comprehensive testing across third-party and custom-built applications. Establish Advanced Performance Engineering: Establish a robust performance engineering strategy, integrating advanced tools for application performance monitoring (APM), observability, and telemetry. Focus on early identification of performance bottlenecks and quality assurance measures …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, South East, England, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment Limited
a strong culture rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal candidate will be an … experienced Site Reliability Engineer with a deep background in AWS, Kubernetes (EKS), Terraform, and monitoring/eventing tools. You'll have a strong grasp of application-level troubleshooting, chaos engineering, and performance tuning. This is a fantastic opportunity to work in a modern DevOps environment where innovation is encouraged, personal development is supported, and technical impact is real. The … Role: *Manage and optimise AWS and Kubernetes (EKS) infrastructure *Implement resilience strategies and conduct chaos engineering experiments *Monitor and maintain Kafka clusters for performance and reliability *Respond to and resolve application-level production incidents The Person: *5+ years in SRE, DevOps, or infrastructure engineering *Strong experience with AWS, EKS/Kubernetes, and Terraform *Familiar with Kafka and …
Employment Type: Full-Time
Salary: £80,000 - £90,000 per annum, Inc benefits

Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment Limited
strong culture rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal candidate will be … an experienced Site Reliability Engineer with a deep background in AWS, Kubernetes (EKS), Terraform, and monitoring/eventing tools. You'll have a strong grasp of application-level troubleshooting, chaos engineering, and performance tuning. This is a fantastic opportunity to work in a modern DevOps environment where innovation is encouraged, personal development is supported, and technical impact is … real. The Role: Manage and optimise AWS and Kubernetes (EKS) infrastructure Implement resilience strategies and conduct chaos engineering experiments Monitor and maintain Kafka clusters for performance and reliability Respond to and resolve application-level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment
strong culture rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal candidate will be … an experienced Site Reliability Engineer with a deep background in AWS, Kubernetes (EKS), Terraform, and monitoring/eventing tools. You'll have a strong grasp of application-level troubleshooting, chaos engineering, and performance tuning. This is a fantastic opportunity to work in a modern DevOps environment where innovation is encouraged, personal development is supported, and technical impact is … real. The Role: *Manage and optimise AWS and Kubernetes (EKS) infrastructure *Implement resilience strategies and conduct chaos engineering experiments *Monitor and maintain Kafka clusters for performance and reliability *Respond to and resolve application-level production incidents The Person: *5+ years in SRE, DevOps, or infrastructure engineering *Strong experience with AWS, EKS/Kubernetes, and Terraform *Familiar with …
Employment Type: Permanent
Salary: £80,000 - £90,000 per annum, 38 Days Holiday, Healthcare, Pension

Sr. Cloud Operations Delivery Manager (CODM), Enterprise Support - UKI

London, United Kingdom
Amazon
ability to make high-judgment technical decisions in complex environments - Experience leading cross-functional teams with a mix of technical, business, and operational roles PREFERRED QUALIFICATIONS - Experience with resilience engineering, chaos engineering, and observability practices in AWS - Understanding of enterprise IT operational capabilities - examples include Change, Incident Management, infrastructure management or applications management - Knowledge of the AWS …
Employment Type: Permanent
Salary: GBP Annual

SRE & Service Lead

London, United Kingdom
Sanderson Recruitment
Role: SRE & Service Lead - Digital Core Platforms Location: 2 days a week in London Salary: £160,000 + 20% Value Account + Bonus Are you a forward-thinking Engineering Leader with a deep understanding of software engineering, cloud infrastructure, and SRE principles? Do you have a sharp eye for automation, observability, and leading technical teams through digital transformation … could be the perfect opportunity to elevate your career at the forefront of banking innovation. This is a unique opportunity to join a major UK bank and lead strategic engineering efforts across three key areas: Retail Mortgages Bank of APIs - delivering on PSD2 and other regulatory initiatives Real-Time Core Banking - part of a long-term, cutting-edge modernisation … programme You'll be responsible for coordinating engineering teams, guiding technical strategy, and embedding best practices across one of the largest engineering domains in the bank (over 1000 staff - 75% engineers). This is a hands-on leadership role for someone who's passionate about driving resilience, automation, performance, and security at scale. Ideal Candidate: Deep software engineering …
Employment Type: Permanent

SRE & Service Lead

London, South East, England, United Kingdom
Sanderson
Role: SRE & Service Lead - Digital Core Platforms Location: 2 days a week in London Salary: £160,000 + 20% Value Account + Bonus Are you a forward-thinking Engineering Leader with a deep understanding of software engineering, cloud infrastructure, and SRE principles? Do you have a sharp eye for automation, observability, and leading technical teams through digital transformation … could be the perfect opportunity to elevate your career at the forefront of banking innovation. This is a unique opportunity to join a major UK bank and lead strategic engineering efforts across three key areas: Retail Mortgages Bank of APIs - delivering on PSD2 and other regulatory initiatives Real-Time Core Banking - part of a long-term, cutting-edge modernisation … programme You'll be responsible for coordinating engineering teams, guiding technical strategy, and embedding best practices across one of the largest engineering domains in the bank (over 1000 staff - 75% engineers). This is a hands-on leadership role for someone who's passionate about driving resilience, automation, performance, and security at scale. Ideal Candidate: Deep software engineering …
Employment Type: Full-Time
Salary: £150,000 - £175,000 per annum, Negotiable, Inc benefits

Senior Platform Engineer

London, United Kingdom
ClearBank Ltd
About you You'll be joining Team Ocelot in the Infrastructure Cluster as an Engineer. Reporting to an Engineering Manager, you'll be a part of a fast-growing business that is challenging the market and doing things differently. You'll be playing a critical role in designing, implementing, and maintaining the foundations of our infrastructure platform. … a team than as individuals, contributing to a culture of self-service and continuous improvement. A major part of this role involves working closely with Technical Product Owners and Engineering Managers to shape the art of the possible as we build out our next generation internal platform. You will be Designing and implementing a self-service platform, developing a … robust and scalable self-service platform using methods like Chaos Engineering Defining and templating Kubernetes resources such as pods, services, deployments, and ingress controllers to ensure our platform is resilient, scalable and secure. Establishing GitOps practices for managing infrastructure and application lifecycles. Taking part in the exploration and adoption of new technologies and practices Maintaining an eye for …
Employment Type: Permanent
Salary: GBP Annual

Sr. Software Development Engineer in Test, Blink

London, United Kingdom
Amazon
who will shape the future of our AI-powered automation platform, with a particular focus on modernizing our application testing and deployment pipelines. The ideal candidate will combine deep engineering expertise with strategic thinking to create intelligent, scalable solutions that transform how we approach automation, dramatically reducing the time and complexity of application validation and delivery. This role requires … LLM-based approaches to test script generation, automated debugging, and intelligent test maintenance across our distributed systems Pioneer innovative quality practices that leverage AI for automated performance analysis, intelligent chaos engineering scenarios, and predictive system reliability testing Design self-healing test systems that use machine learning to adapt to application changes, automatically maintain test suites, and provide AI … focused on building solutions that scale across teams, accelerate our testing cycles, and ultimately enable us to deliver higher quality products faster than ever before. About the team Our Engineering Environment At Blink, you'll work within a fully integrated engineering ecosystem where you can test across multiple layers - from algorithms and ASICs to hardware, firmware, AWS infrastructure …
Employment Type: Permanent
Salary: GBP Annual

Technology Resilience Manager

London, United Kingdom
Innovation Group
in technology operations, who is looking to broaden their skillset. After developing your specialist skills you are now looking for opportunities to grow and learn more about wider resilience, chaos engineering and cloud services - we will support, provide guidance and mentor you. Nevertheless, we are open to other experiences as we are creating a new diverse and dynamic …
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Global Processing Services
shape our SRE strategy, establish best practices, and set the standard for service reliability and performance. What You'll Do Define strategies for Application Performance Monitoring, Unit Cost, and Chaos Engineering. Continuously optimize production environments to enhance reliability and efficiency. Implement and apply MTTR, SLO, and SLI principles to ensure high service standards. Respond to incidents, analyze root causes … layers that drive our platform's success. What You Need Proven experience implementing SRE principles at scale, including deep knowledge of SLI/SLO/SLA differences. A product engineering background with strong coding skills in Python, C#, or similar. Experience with incident management frameworks and evolving them for efficiency. Expertise in cloud platforms (AWS preferred) and container orchestration …
Employment Type: Permanent
Salary: GBP Annual

Chaos Engineering, London
25th Percentile: £103,750
Median: £107,500
75th Percentile: £141,250
90th Percentile: £159,250