Disaster Recovery Manager
Disaster Recovery & Business Continuity Engineer
Salary: £70,000 – £75,000 + Bonus + Excellent Benefits
Location: London (Hybrid Working Model)
(Visa sponsorship not provided)
Role Overview
This is a specialist resilience engineering role focused on strengthening an organisation’s ability to recover from major technology disruption, including cyber incidents, infrastructure failure, cloud outages, and data integrity events.
You will play a key role in moving Business Continuity and Disaster Recovery (BC/DR) capability beyond documentation into fully engineered, tested, and continuously improved recovery systems.
Working closely with Technology and Security teams, you will design, implement, and validate recovery strategies that ensure critical systems and services can be restored within agreed recovery objectives, with minimal operational impact.
Key Responsibilities
Business Continuity & Disaster Recovery Ownership
- Own and maintain the organisation’s BC/DR strategy across infrastructure and critical systems
- Develop and continuously refine recovery plans aligned to business and operational priorities
- Transition recovery capability from static documentation to engineered, testable solutions
Resilient Infrastructure Design
- Design and support resilient architectures across cloud and on-premises environments
- Ensure systems are built with recovery, redundancy, and fault tolerance as standard
- Implement backup, replication, and failover mechanisms to support rapid restoration of services
Risk & Recovery Management
- Define and manage Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO)
- Validate backup integrity, restore processes, and recovery sequencing
- Identify and remediate gaps in operational resilience and recovery readiness
Testing & Simulation
- Plan and execute disaster recovery tests, failover exercises, and scenario-based simulations
- Run large-scale outage and cyber recovery scenarios with technical teams
- Capture outcomes and drive continuous improvement in recovery capability
Incident Support & Leadership
- Act as a technical subject matter expert (SME) during major incidents and recovery events
- Provide structured support during high-pressure operational disruption scenarios
- Contribute to post-incident analysis and ensure corrective actions are implemented
Collaboration & Governance
- Work closely with Security teams to align recovery planning with cyber incident response
- Engage with technical and non-technical stakeholders to ensure resilience requirements are met
- Embed recovery considerations into architecture, change, and project processes
Documentation & Capability Development
- Develop clear recovery playbooks for key failure scenarios (cyber, infrastructure, data loss)
- Maintain structured and usable documentation for recovery workflows and escalation paths
- Improve internal capability and reduce dependency on external recovery support
Essential Experience
- 5+ years’ experience in infrastructure, cloud, or platform engineering
- Proven experience designing and implementing Disaster Recovery and Business Continuity solutions
- Strong understanding of cyber-related failure scenarios (ransomware, identity compromise, service disruption)
- Experience supporting incident response and system recovery activities
- Deep knowledge of backup, replication, high availability, and failover architectures
- Experience defining and working with RTO and RPO targets in enterprise environments
- Strong analytical skills with the ability to assess and reduce technical risk
- Ability to remain calm and structured during major incidents
- Strong communication and documentation skills
Desirable Experience
- Cloud, cybersecurity, or resilience-related certifications
- Experience in complex, multi-system enterprise environments
- Exposure to structured disaster recovery testing or cyber simulation exercises
- Background in high-availability or mission-critical infrastructure environments