High Availability Jobs in London

1 to 25 of 331 High Availability Jobs in London

Hybrid Cloud Solutions in London: Unlocking Efficiency with Windows Server 2025 for SMBs

London, United Kingdom
Hybrid / WFH Options
Server Consultancy Ltd
recovery strategies are fundamental for businesses implementing hybrid cloud solutions in London. Windows Server 2025 elevates these capabilities through advanced cloud-based backup and high-availability configurations, offering a comprehensive framework that supports the resilience of business operations. Here's how Windows Server 2025 enhances these critical areas … redundancy, where data can be replicated in multiple locations across the cloud. This is especially crucial for businesses in London that must maintain data availability and business operations despite local disruptions. Enhanced Business Continuity Plans: High-Availability Configurations: Windows Server 2025 supports high-availability setups
Employment Type: Permanent
Salary: GBP Annual

Systems Development Engineer, Managed Operations

London, United Kingdom
Amazon
part of the AWS Managed Operations team, you will play a pivotal role in building and leading operations and development teams dedicated to delivering high-availability AWS services, including EC2, S3, Dynamo, Lambda, and Bedrock, exclusively for EU customers. For more information on ESC please check out our … A typical day in this role involves collaborating with technology leaders, contributing to the enhancement of day-to-day operations, and ensuring improvements in availability, reliability, latency, performance, and efficiency of the ESC. You will be required to occasionally participate in "on-call" rotations to resolve incidents occurring out … of-hours. The overarching goal is to deliver scalable services and ensure a high-availability experience for EU customers. If you are an experienced professional ready for a challenging and impactful opportunity, we invite you to join our efforts in building a best-in-class development engineering and
Employment Type: Permanent
Salary: GBP Annual

DevOps Engineer, Managed Operations

London, United Kingdom
Amazon
part of the AWS Managed Operations team, you will play a pivotal role in building and leading operations and development teams dedicated to delivering high-availability AWS services, including EC2, S3, Dynamo, Lambda, and Bedrock, exclusively for EU customers. For more information on ESC please check out our … A typical day in this role involves collaborating with technology leaders, contributing to the enhancement of day-to-day operations, and ensuring improvements in availability, reliability, latency, performance, and efficiency of the ESC. You will be required to occasionally participate in "on-call" rotations to resolve incidents occurring out … of-hours. The overarching goal is to deliver scalable services and ensure a high-availability experience for EU customers. If you are an experienced professional ready for a challenging and impactful opportunity, we invite you to join our efforts in building a best-in-class development engineering and
Employment Type: Permanent
Salary: GBP Annual

Remote Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Paymentology
Engineer (SRE) to join our dedicated team. As an SRE at Paymentology, you'll be the superhero responsible for maintaining, improving, and ensuring the high availability, scalability, and performance of our platform. Tasks Platform Reliability and Scalability: Build software that enhances Paymentology services' scalability and reliability. Ensure platform … will allow you to clearly convey your ideas and recommendations. As a key member of our technical team, you will be expected to maintain high availability and be ready to address critical incidents, ensuring the continuous performance of our systems. This includes being part of an on-call … and development. Ready to Join Us? If you're a gadget guru who thrives on optimizing infrastructure, automating all the things, and delivering sky-high availability and performance, we want to hear from you! Apply now and be part of a company that values your skills and fosters
Employment Type: Permanent
Salary: GBP Annual

Senior Director - Operations and Reliability Engineering

Canary Wharf, Greater London, UK
Boston Consulting Group
traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core, BCG X, and Consulting Team (CT) worldwide. The leader will drive strategic planning, execution, and … native services across AWS, Azure, and GCP. * Scale Infrastructure as Code (IaC), automated provisioning, and cloud workload optimization. * Drive edge computing, containerized workloads, and high-performance computing strategies. * Implement AI-driven monitoring, self-healing automation, and full-stack observability. IT Service Management & Operational Excellence: * Mandate and assure the adoption … SRE-based operational metrics, including SLOs, SLIs, and error budgets. * Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. * Ensure high availability, performance, and security compliance for all enterprise services. * Develop a follow-the-sun operational support model, ensuring 24x7 resilience and uptime across
Employment Type: Full-time

Senior Director - Operations and Reliability Engineering

City of London, Greater London, UK
Boston Consulting Group
traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core, BCG X, and Consulting Team (CT) worldwide. The leader will drive strategic planning, execution, and … native services across AWS, Azure, and GCP. * Scale Infrastructure as Code (IaC), automated provisioning, and cloud workload optimization. * Drive edge computing, containerized workloads, and high-performance computing strategies. * Implement AI-driven monitoring, self-healing automation, and full-stack observability. IT Service Management & Operational Excellence: * Mandate and assure the adoption … SRE-based operational metrics, including SLOs, SLIs, and error budgets. * Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. * Ensure high availability, performance, and security compliance for all enterprise services. * Develop a follow-the-sun operational support model, ensuring 24x7 resilience and uptime across
Employment Type: Full-time

Site Reliability Engineer

London Area, United Kingdom
IGT Solutions
Purpose Responsible for the proactive support of products so that there is high product performance that is continuously improved. Responsible for identifying and resolving the root causes of operational incidents, implementing solutions to improve stability and prevent recurrence. Manages the creation and maintenance of the event catalog to trigger … enhance efficiency and collaboration between development and operations within service operations. Key Responsibilities Site Reliability Engineer Define, build, and maintain support systems to ensure high availability and performance. Handle complex cases for the Operations team. Build events to add to the event catalog for the relevant product or … external stakeholders for feedback for continual service improvement for in-scope products & drive plan till successful closure Accountable for the in-scope product to ensure high availability performance. Problem Management Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions Coordinate with incident

Site Reliability Engineer

London, South East England, United Kingdom
IGT Solutions
Purpose Responsible for the proactive support of products so that there is high product performance that is continuously improved. Responsible for identifying and resolving the root causes of operational incidents, implementing solutions to improve stability and prevent recurrence. Manages the creation and maintenance of the event catalog to trigger … enhance efficiency and collaboration between development and operations within service operations. Key Responsibilities Site Reliability Engineer Define, build, and maintain support systems to ensure high availability and performance. Handle complex cases for the Operations team. Build events to add to the event catalog for the relevant product or … external stakeholders for feedback for continual service improvement for in-scope products & drive plan till successful closure Accountable for the in-scope product to ensure high availability performance. Problem Management Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions Coordinate with incident

SRE Team Lead - FinTech - London

London, England, United Kingdom
Oliver Bernard
be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge variety of tools and … of tools such as Kubernetes, Terraform, Ansible etc - Good knowledge of DevOps best practices, Cloud and tools (AWS is ideal) - Knowledge of working in high availability/high volume infrastructure environments - Excellent communications skills

SRE Team Lead - FinTech - London

London, South East England, United Kingdom
Oliver Bernard
be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge variety of tools and … of tools such as Kubernetes, Terraform, Ansible etc - Good knowledge of DevOps best practices, Cloud and tools (AWS is ideal) - Knowledge of working in high availability/high volume infrastructure environments - Excellent communications skills

SRE Lead - FinTech - £120K+

London Area, United Kingdom
Oliver Bernard
be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge variety of tools and … experience of containers, monitoring, automation, Cloud and Linux Good knowledge of DevOps best practices, tools and CI/CD Good understanding of working in high availability/high volume infrastructure environments Excellent communications skills The ability to lead others – A vision of “what good looks like

SRE Lead - FinTech - £120K+

London, South East England, United Kingdom
Oliver Bernard
be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge variety of tools and … experience of containers, monitoring, automation, Cloud and Linux Good knowledge of DevOps best practices, tools and CI/CD Good understanding of working in high availability/high volume infrastructure environments Excellent communications skills The ability to lead others – A vision of “what good looks like

Lead SRE - FinTech - £125K

London Area, United Kingdom
Oliver Bernard
be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge variety of tools and … experience of containers, monitoring, automation, Cloud and Linux Good knowledge of DevOps best practices, tools and CI/CD Good understanding of working in high availability/high volume infrastructure environments Excellent communications skills The ability to lead others – A vision of “what good looks like

Lead SRE - FinTech - £125K

London, South East England, United Kingdom
Oliver Bernard
be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge variety of tools and … experience of containers, monitoring, automation, Cloud and Linux Good knowledge of DevOps best practices, tools and CI/CD Good understanding of working in high availability/high volume infrastructure environments Excellent communications skills The ability to lead others – A vision of “what good looks like

Azure SQL DBA

London, United Kingdom
Chambers & Partners
successful candidate will be responsible for designing, implementing, managing, and optimizing Azure SQL database environments. This role requires expertise in cloud-based database administration, high availability solutions, performance tuning, and security best practices. You will collaborate with development, infrastructure, and security teams to ensure efficient database operations that … and Responsibilities Design, implement, and maintain Azure SQL databases to support business applications. Monitor and optimize database performance, security, and availability. Configure and manage high availability and disaster recovery (HA/DR) solutions such as Always On Availability Groups. Perform database tuning, indexing, and query optimization. Implement … ARM templates. Experience supporting CI/CD pipelines for databases (e.g., Azure DevOps, GitHub Actions). Knowledge of data replication, mirroring, and Always On Availability Groups. Ability to troubleshoot and resolve database-related performance and connectivity issues. Microsoft certification in Azure Database Administration (DP-300) or equivalent. Experience
Employment Type: Permanent
Salary: GBP Annual

Systems Engineer, APAC Controls Deployment Team

London, United Kingdom
Amazon
load balancing, and failover clustering. Skilled in network administration of larger layer two networks with layer three switch and routing concepts. Other relevant skills: High-availability Configurations: Skilled in designing and maintaining high-availability systems through clustering services, load balancer setups, and redundancy planning. Experienced in
Employment Type: Permanent
Salary: GBP Annual

DevOps Engineer - Azure, C#, Banking

London Area, United Kingdom
The JM Longbridge Group
a permanent role based in the City on a hybrid basis, offering a salary of £80K - £100K depending on experience. You will join a high-performing engineering team supporting the design, deployment, and management of secure, scalable, and high-availability environments. Responsibilities: Own and evolve DevOps practices … like Bicep, Helm, PowerShell, Bash. Understanding of networking fundamentals, infrastructure security, and identity management. Proven experience in deploying enterprise-grade cloud-native solutions with high availability and performance. Please apply for immediate interview

DevOps Engineer - Azure, C#, Banking

London, South East England, United Kingdom
The JM Longbridge Group
a permanent role based in the City on a hybrid basis, offering a salary of £80K - £100K depending on experience. You will join a high-performing engineering team supporting the design, deployment, and management of secure, scalable, and high-availability environments. Responsibilities: Own and evolve DevOps practices … like Bicep, Helm, PowerShell, Bash. Understanding of networking fundamentals, infrastructure security, and identity management. Proven experience in deploying enterprise-grade cloud-native solutions with high availability and performance. Please apply for immediate interview

AWS Software Engineer

London, United Kingdom
Cloud Bridge
maintain scalable applications using a variety of AWS services. You'll collaborate closely with cross-functional teams to ensure that applications are developed with high availability, performance, and security in mind. Key Responsibilities: Design and deploy cloud-based solutions using AWS services (e.g., Lambda, EC2, S3, RDS, DynamoDB … . Collaborate with teams to build secure, scalable architectures. Integrate AWS services using SDKs, APIs, and CLI, ensuring high availability and performance. Automate CI/CD pipelines with AWS CodePipeline, CodeDeploy, and CodeBuild. Monitor and optimize application performance with CloudWatch, X-Ray, and CloudTrail. Implement security measures, including
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer (SRE)

City Of London, England, United Kingdom
Hybrid / WFH Options
Fruition Group
shaping the infrastructure and operational strategy for one of the most innovative businesses in their market. Working with cutting-edge technology, this role offers high-impact challenges, meaningful collaboration, and excellent career progression. Senior SRE Responsibilities Manage and optimise cloud infrastructure to ensure scalability, high availability, and … as PowerShell or Python. Champion infrastructure best practices and mentor junior team members. Senior SRE Requirements Extensive experience in SRE or DevOps roles within high-availability, cloud-native environments. Strong expertise with AWS (including EKS, MSK, RDS, VPC design, encryption, and IAM). Experience with Kubernetes and Argo

Senior Site Reliability Engineer (SRE)

London (City of London), South East England, United Kingdom
Hybrid / WFH Options
Fruition Group
shaping the infrastructure and operational strategy for one of the most innovative businesses in their market. Working with cutting-edge technology, this role offers high-impact challenges, meaningful collaboration, and excellent career progression. Senior SRE Responsibilities Manage and optimise cloud infrastructure to ensure scalability, high availability, and … as PowerShell or Python. Champion infrastructure best practices and mentor junior team members. Senior SRE Requirements Extensive experience in SRE or DevOps roles within high-availability, cloud-native environments. Strong expertise with AWS (including EKS, MSK, RDS, VPC design, encryption, and IAM). Experience with Kubernetes and Argo

Java Technical Lead - Travel Sector

London Area, United Kingdom
Smart Sourcer
travel company in a ‘hands on’ technical lead role with some design & architecture responsibility. At the core of this business is a global, distributed, high availability, microservices based platform for online booking and the associated millions of transactions and this role leads the team that develops and maintains … a huge opportunity to shape the technical direction of the platform. You’ll require the following skills: 2+ years’ design & architecture experience of scalable, high-availability platforms for enterprise applications 5+ years’ software development experience with Java, Spring Boot, Hibernate, SQL and Linux Extensive experience of Microservices and

Java Technical Lead - Travel Sector

London, South East England, United Kingdom
Smart Sourcer
travel company in a ‘hands on’ technical lead role with some design & architecture responsibility. At the core of this business is a global, distributed, high availability, microservices based platform for online booking and the associated millions of transactions and this role leads the team that develops and maintains … a huge opportunity to shape the technical direction of the platform. You’ll require the following skills: 2+ years’ design & architecture experience of scalable, high-availability platforms for enterprise applications 5+ years’ software development experience with Java, Spring Boot, Hibernate, SQL and Linux Extensive experience of Microservices and

Tech Lead - .Net (All genders)

London, United Kingdom
Entain / GVC Holdings
a cross-functional team, collaborating closely with the Product Owner (PO) to plan and achieve desired outcomes while reporting results. This role involves ensuring high-quality technical contributions, addressing risks, and fostering a sustainable and efficient delivery process. Key Responsibilities: Delivery Management: Oversee the delivery of complex projects, ensuring … knowledge through workshops and advisory sessions and participate in cross-team technical initiatives. Qualifications Technical Skills: .NET Expertise: Proficient in C# for low latency, high availability, high transaction systems. Web Services: Strong background in RESTful APIs and web services. Databases: Experience with SQL (MSSQL and PostgreSQL) Cloud
Employment Type: Permanent
Salary: GBP Annual

AWS Network Architect

London Area, United Kingdom
Cognizant
and Kubernetes. Monitor cloud infrastructure performance and costs using AWS CloudWatch, CloudTrail, and Cost Explorer. Troubleshoot complex system issues and implement solutions to ensure high availability and disaster recovery. Collaborate with development teams to integrate cloud services and maintain production environments. Provide technical guidance, mentorship, and documentation for … such as Docker and Kubernetes. Experience in AWS CloudWatch, CloudTrail, and Cost Explorer. Experience in troubleshooting complex system issues and implementing solutions to ensure high availability and disaster recovery. At Cognizant you will experience an exciting mix of innovation by design, creativity, collaboration, and efficiency within a framework
High Availability (London)
10th Percentile: £55,000
25th Percentile: £61,250
Median: £72,940
75th Percentile: £97,500
90th Percentile: £120,000