High Availability Jobs in London

1 to 25 of 248 High Availability Jobs in London

Infrastructure & Technology Infrastructure Specialist - System Administrator Professional Multi ...

London, United Kingdom
Avature
be crucial in ensuring the seamless operation of our applications, DevOps, middleware, security, and infrastructure components. Key Responsibilities: Provide 24/7 technical support for cloud-based solutions, ensuring high availability and performance across various applications and infrastructure components. Design, build, and maintain infrastructure and configuration as code using tools like Ansible and Terraform. Administer Dev, Test, and More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Stratospherec Ltd
tests to identify and remediate bottlenecks Develop and maintain platform solutions, automate infrastructure provisioning, configuration, and management tasks using Infrastructure as Code. Monitor, review and tune databases to ensure high availability and performance Collaborate with product engineering teams to design/build fit-for-purpose and observable software Required Skills and Experience: Proven experience in a SRE/… e.g., Certified Kubernetes Administrator) are a plus Experience in database management/performance tuning, particularly MSSQL. Employee benefits: Opportunity to be a part of a 30+ year well-established, high-performance SaaS company. Excellent Company Pension scheme and Life Insurance, Excellent holiday allowance. A supportive team environment with emphasis on learning and development opportunities Working with a team of … caring, high-performing, and passionate people who have fun supporting our vision, innovation, and continuous improvement. This Senior Site Reliability Engineer role is working for a market leading global software company and this job is part of a large program of change and improvement in their Cloud SaaS products over the coming years. If you are looking for an More ❯
Employment Type: Permanent
Salary: £85000 - £90000/annum Excellent Benefits package

Head of AI (London)

London, UK
Scrumconnect Consulting
can rapidly prototype models, optimize for performance, and mentor junior engineers, all while helping define product strategy. In this role, you will: Lead AI strategy and execution in a high-ambiguity environment. Build, train, and deploy state-of-the-art models (e.g., deep learning, NLP, computer vision, reinforcement learning, or relevant domain-specific architectures). Design infrastructure for data … models) using PyTorch, TensorFlow, JAX, or equivalent. Productionize models in cloud/on-prem environments (AWS/GCP/Azure) with containerization (Docker/Kubernetes) and ensure low-latency, high-availability inference. Strategic Leadership Develop a multi-quarter AI roadmap aligned with product milestones and fundraising milestones. Identify and evaluate opportunities for AI-driven competitive advantages (e.g., proprietary More ❯
Employment Type: Full-time

Senior Site Reliability Engineer (SRE) / Unix

London, United Kingdom
Morgan Hunt UK Limited
Morgan Hunt are seeking an experienced Site Reliability Engineer (SRE)/Unix Infrastructure Engineer to support the deployment, migration, and optimisation of critical infrastructure services. The role involves ensuring high availability, disaster recovery readiness, and automation-driven improvements across RHEL, Oracle DB, Kubernetes, and AWS environments. Key Responsibilities Infrastructure & Deployment Support migration and deployment of services to More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer SRE / Unix

London, South East, England, United Kingdom
Morgan Hunt Recruitment
Morgan Hunt are seeking an experienced Site Reliability Engineer (SRE)/Unix Infrastructure Engineer to support the deployment, migration, and optimisation of critical infrastructure services. The role involves ensuring high availability, disaster recovery readiness, and automation-driven improvements across RHEL, Oracle DB, Kubernetes, and AWS environments. Key Responsibilities Infrastructure & Deployment Support migration and deployment of services to More ❯
Employment Type: Contractor
Rate: £550 per day

Principal Solutions Architect

London, United Kingdom
Hybrid / WFH Options
Parser Limited
platform. Key Responsibilities Architectural Design in cloud based environments: Develop and implement robust IT architecture strategies for cloud and hybrid environments, leveraging AWS best practices. Design scalable, secure, and high-availability solutions tailored to business needs. Architect and optimize data platforms to enable efficient data collection, storage, and processing. Implement and manage cloud-native services, including compute, storage More ❯
Employment Type: Permanent
Salary: GBP Annual

SAP Sovereign Cloud Expert DevOps Engineer

London, United Kingdom
SAP SE
closely with development, security, and operations teams, applying DevOps methodologies to streamline processes and enhance system reliability. Performance Optimization: Expertise in tuning cloud applications for cost efficiency, scalability, and high availability, leveraging Azure Autoscaling, Load Balancers, and Traffic Manager. At least 5 years of hands-on experience in Azure Hyperscale/DevOps. Over 10 years of experience in More ❯
Employment Type: Permanent
Salary: GBP Annual

Cloud Architect

City of London, London, United Kingdom
Hybrid / WFH Options
Talent Hero Ltd
AWS, Azure, or GCP. If you're skilled at turning business needs into technical cloud solutions, we encourage you to apply. Remote cloud architecture roles are both in demand and high-impact, and our service is 100% free for all UK applicants. Applying through Talent Hero gives you direct access to US companies needing cloud leaders ready to make an … Apply once; we do the legwork. Your profile is matched to multiple US clients hiring for your skill set. Fast hiring process Responsibilities Design and implement scalable, secure, and high-availability cloud infrastructure Collaborate with engineering, DevOps, and security teams to define architecture best practices Automate infrastructure deployment using IaC tools and CI/CD pipelines Monitor performance More ❯
Employment Type: Contract, Work From Home
Rate: £75,000

Cloud Architect - GCP

City of London, London, United Kingdom
Paymentology
Kafka and Kubernetes Platform Management: Design, deploy, and maintain scalable Kafka and Kubernetes clusters to support development and production environments Implement best practices for Kafka and Kubernetes operations, ensuring high availability, performance, and security Monitor, troubleshoot, and optimize Kafka and Kubernetes infrastructure to meet development team needs Implementation: Implement cloud infrastructure components, including compute, storage, networking, and security … for performance, scalability, and cost-efficiency Implement DevOps practices for streamlined deployment and operations Troubleshooting and Support: Provide technical support for cloud infrastructure and services Troubleshoot and resolve performance, availability, and security issues Support production environments and participate in a 24x7 on-call rotation when required Requirements: Experience 7+ years of experience in designing, implementing, and managing cloud-based More ❯
Employment Type: Permanent

Principal Platform Engineer

London, United Kingdom
Institutional Shareholder Services Inc
solutions and implementations Experience implementing developer self-service/developer experience portals Strong experience of application modernisation and cloud migration programs Strong Linux and Windows server experience in a high-availability 24/7 operation Experience with the development and deployment of large-scale, complex technology platforms Deep understanding of GCP products across database, serverless, containerization and API … Advanced level expertise in Terraform Extensive experience in designing and implementing DevOps practices Experience with two or more CI/CD solutions Experience coaching and mentoring high-performing teams Pragmatic experience using agile to deliver incremental value Experience working in a global or multinational team setting Strong documentation, communication and collaboration skills Proven ability to drive innovation and continuous More ❯
Employment Type: Permanent
Salary: GBP Annual

SRE Engineer

London, South East, England, United Kingdom
Robert Walters
wide range of AWS native databases including RDS, Aurora, Neptune, as well as CockroachDB. Your daily responsibilities will involve designing robust software solutions that enhance system performance while ensuring high availability for critical applications. You will work hand-in-hand with product engineering teams to improve observability tools and telemetry systems, driving forward automation initiatives that reduce manual More ❯
Employment Type: Contractor
Rate: £400 - £500 per day

Site reliability engineer

London, United Kingdom
writer.com
and implementation of our Site Reliability Engineering (SRE) program. The ideal candidate will ensure the reliability, scalability, performance, and security of Writer's critical systems, proactively guaranteeing that our high-ROI products reach customers seamlessly. Your responsibilities: Lead the design, implementation, and maintenance of Writer, Inc.'s cloud infrastructure to ensure high availability and performance. Design and … reliability practices. Is this you? Proven expertise in Site Reliability Engineering with at least 7 years of hands-on experience. Deep understanding of system architecture and infrastructure design for high availability and performance. Bachelor's degree in Computer Science, Engineering, or a related field. Strong proficiency in programming languages such as Python, Java, or Go for automation and More ❯
Employment Type: Permanent
Salary: GBP Annual

Solutions Architect [UAE Based] (London)

Surbiton, Greater London, UK
ZipRecruiter
robust microservices-based solution, enabling agility, scalability, and independent service deployments. Define and own the solution architecture for the product, ensuring scalability, configurability, and cloud-agnostic capabilities. Develop HLD (High-Level Design) and LLD (Low-Level Design) documents for the product. Create and maintain the deployment architecture, ensuring efficient and resilient deployment strategies. Design the integration architecture, including APIs … first design. Cloud Platforms: Deep knowledge of key cloud players AWS, Azure, and GCP, ensuring cloud-agnostic design principles. Scalability & Performance Optimization: Expertise in designing scalable, distributed, and high-availability systems. DevOps & CI/CD: Knowledge of Kubernetes, Docker, Terraform, Ansible, and other infrastructure automation tools. Security & Compliance: Understanding of cloud security, management, and regulatory compliance (GDPR … a related field. 6+ years of experience in software architecture and design. Proven experience as a Solution Architect in SaaS-based or cloud-agnostic products. Strong background in high-scale distributed systems, API design, and cloud platforms. Experience in leading architecture for a multi-tenant SaaS or large enterprise application. Certifications: AWS Certified Solutions Architect, Google Professional More ❯
Employment Type: Full-time

DV Cleared - Data Engineer - ELK & NiFi

London, UK
Defence
within secure environments Perform troubleshooting, debugging, and performance tuning of data pipelines and the Elastic Stack Build dashboards and visualisations in Kibana to enable data-driven decision-making Ensure high availability and disaster recovery for data systems, implementing appropriate backup and replication strategies Document data architecture, workflows, and security protocols to ensure smooth operational handover and audit readiness. … If you are a skilled Data Systems Engineer with the required security clearance and expertise, we would love to hear from you. Apply now to join our client's high-performing team in Worcester. More ❯

DV Cleared - Data Engineer - ELK & NiFi

West London, UK
Defence
within secure environments Perform troubleshooting, debugging, and performance tuning of data pipelines and the Elastic Stack Build dashboards and visualisations in Kibana to enable data-driven decision-making Ensure high availability and disaster recovery for data systems, implementing appropriate backup and replication strategies Document data architecture, workflows, and security protocols to ensure smooth operational handover and audit readiness. … If you are a skilled Data Systems Engineer with the required security clearance and expertise, we would love to hear from you. Apply now to join our client's high-performing team in Worcester. More ❯

DV Cleared - Data Engineer - ELK & NiFi

South West London, UK
Defence
within secure environments Perform troubleshooting, debugging, and performance tuning of data pipelines and the Elastic Stack Build dashboards and visualisations in Kibana to enable data-driven decision-making Ensure high availability and disaster recovery for data systems, implementing appropriate backup and replication strategies Document data architecture, workflows, and security protocols to ensure smooth operational handover and audit readiness. … If you are a skilled Data Systems Engineer with the required security clearance and expertise, we would love to hear from you. Apply now to join our client's high-performing team in Worcester. More ❯

Global Applications & Eclipse Product Manager

London, United Kingdom
Willis Towers Watson
party services. Oversee API development with product owner and ensure best practices in service-oriented architecture. Team Leadership & Collaboration: Work closely with engineering, DevOps, and support teams to deliver high-quality solutions. Facilitate agile ceremonies, including backlog grooming, sprint planning, and retrospectives. Act as the primary liaison between technical teams and business stakeholders. Operational Excellence & Continuous Improvement: Ensure high availability and reliability of the platform and applications, implementing monitoring and automation as needed. Identify areas for improvement and drive initiatives for performance optimization. Maintain compliance with security, data protection, and industry standards. Vendor relationship management: Manage the relationship with vendor(s) and hold them contractually accountable for all services provided. Qualifications Required Qualifications: Education & Experience: Bachelor's More ❯
Employment Type: Permanent
Salary: GBP Annual

Head of Infrastructure Engineering (London)

Wandsworth, Greater London, UK
Spendesk
recovery projects. People: Management and growth of engineers - through 1:1s, performance reviews and objectives, it is important for all that we are able to deliver work of a high standard, in a sustainable manner, and engineers are able to learn, develop and grow their skills and career. Collaborate closely with other engineering teams, product managers, and business leaders … to align infrastructure capabilities with business needs and growth. What we're looking for Technical Experience An expert in modern infrastructure technology with experience in high-availability cloud platforms for SAAS companies. Experience with our specific tech stack is preferred. Understanding of regulatory frameworks like GDPR, ISO27k etc. An advocate for AI technologies and constantly stays up to More ❯
Employment Type: Full-time

Lead Fullstack Engineer

London, United Kingdom
Track24 Limited
containerisation (Docker, Kubernetes) and cloud platforms (AWS, GCP or Azure) Skilled in cross-functional collaboration and stakeholder communication Strong analytical skills with a proactive, problem-solving mindset Experience in high-availability systems, cybersecurity frameworks (ISO, SOC), or Elixir development Background in fast-paced, start-up or scale-up environments Interest in stepping into or growing towards an Engineering More ❯
Employment Type: Permanent
Salary: GBP Annual

Guidewire Cloud Technical Architect

London, United Kingdom
WeAreTechWomen
and workloads Lead infrastructure strategy for cloud migrations of insurance core systems, including on-prem to cloud transitions. Optimize cloud infrastructure using native services for performance, cost-efficiency, and high availability. Define best practices for cloud operations, monitoring, disaster recovery, and compliance. Insurance Application Cloud Enablement: Provide cloud infrastructure implementation support for core insurance platforms, including Guidewire applications (PolicyCenter More ❯
Employment Type: Permanent
Salary: GBP Annual

Guidewire Cloud Technical Architect (London)

Wandsworth, Greater London, UK
WeAreTechWomen
and workloads Lead infrastructure strategy for cloud migrations of insurance core systems, including on-prem to cloud transitions. Optimize cloud infrastructure using native services for performance, cost-efficiency, and high availability. Define best practices for cloud operations, monitoring, disaster recovery, and compliance. Insurance Application Cloud Enablement: Provide cloud infrastructure implementation support for core insurance platforms, including Guidewire applications (PolicyCenter More ❯
Employment Type: Full-time

Senior Director - Operations and Reliability Engineering

London, United Kingdom
Boston Consulting Group
Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core, BCG X, and Consulting Team (CT) worldwide. The leader will drive strategic planning, execution, and optimization of global IT infrastructure … reliability, compute platforms, and cloud-native services across AWS, Azure, and GCP. Scale Infrastructure as Code (IaC), automated provisioning, and cloud workload optimization. Drive edge computing, containerized workloads, and high-performance computing strategies. Implement AI-driven monitoring, self-healing automation, and full-stack observability. IT Service Management & Operational Excellence: Mandate and assure the adoption of IT Service Management (ITSM … and effective service delivery. Establish SRE-based operational metrics, including SLOs, SLIs, and error budgets. Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. Ensure high availability, performance, and security compliance for all enterprise services. Develop a follow-the-sun operational support model, ensuring 24x7 resilience and uptime across all of BCG. Optimize incident More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Database Engineer (MySQL) (London)

Wandsworth, Greater London, UK
Partnerize
mentoring junior DBAs and providing technical leadership on database design, optimisation, and migration strategies. Essential Knowledge, Skills and Experience MySQL: Proficiency in MySQL replication (master-slave, master-master) and high availability configurations. Experience in query performance optimisation, including slow query analysis, indexing strategies, and troubleshooting. Strong understanding of schema optimisation (e.g., normalisation, denormalisation, partitioning) to enhance database performance. … in managing MySQL upgrades and schema migrations in production environments, ensuring minimal downtime and data integrity. In-depth knowledge of replication techniques across the various database technologies to ensure high availability, data consistency, and fault tolerance. Experience in setting up and maintaining multi-master replication, geo-replication, GTID and disaster recovery strategies. Proficient in resolving replication lag, failover … database schema changes and migrations to ensure controlled and tested deployments. Apache Druid/Column based databases: Familiarity with setting up and managing replication across Druid clusters, including data availability and data sharding strategies. Experience with query optimisation in Druid, especially for long-running queries in OLAP workloads. Understanding of schema design and optimisation for Druid's columnar data More ❯
Employment Type: Full-time

Solace Messaging Administrator

London, Clerkenwell, United Kingdom
Eligo Recruitment Ltd
Solace Messaging Administrator London 3x a week Full-Time Permanent Salary on application You will be responsible for managing and supporting our enterprise messaging infrastructure, ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, network optimization, and system observability using industry-standard monitoring tools. Required Skills … Azure, GCP) and cloud-native deployments. Why Join Us? Be part of a mission-critical team enabling real-time data flows. Work with cutting-edge technologies and contribute to high-impact projects. Eligo Recruitment is acting as an Employment Business in relation to this vacancy. Eligo is proud to be an equal opportunity employer dedicated to fostering diversity and More ❯
Employment Type: Permanent

Solace Messaging Administrator

London, South East, England, United Kingdom
Eligo Recruitment
Solace Messaging Administrator London 3x a week Full-Time Permanent Salary on application You will be responsible for managing and supporting our enterprise messaging infrastructure, ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, network optimization, and system observability using industry-standard monitoring tools. Required Skills … Azure, GCP) and cloud-native deployments. Why Join Us? Be part of a mission-critical team enabling real-time data flows. Work with cutting-edge technologies and contribute to high-impact projects. Eligo Recruitment is acting as an Employment Business in relation to this vacancy. Eligo is proud to be an equal opportunity employer dedicated to fostering diversity and More ❯
Employment Type: Full-Time
Salary: Competitive salary

High Availability, London
10th Percentile: £58,250
25th Percentile: £62,500
Median: £72,500
75th Percentile: £97,500
90th Percentile: £115,625