Senior SRE Lead

Company: Albany Beck

Location: London (Hybrid)

About Albany Beck

Albany Beck is a management consultancy focused on providing specialist talent and transformative solutions to Financial Services clients. We combine subject matter expertise with innovative delivery models that help clients scale efficiently, while offering long-term career opportunities to our people. At Albany Beck, you’ll be joining an organisation that is passionate about capability building, technical excellence, and delivering meaningful change within complex enterprise environments.

Role Overview

Albany Beck is seeking a Senior SRE Lead / Observability SME to lead the establishment of a new enterprise Site Reliability Engineering (SRE) capability, with a primary focus on designing and implementing a modern observability suite and operational resilience framework.

This is a foundational build role, responsible for defining how reliability engineering and observability are structured, measured, and embedded across a complex global technology estate. The successful candidate will play a key role in shifting the organisation from reactive operational support to a metrics-driven, engineering-led reliability model.

You will work across infrastructure, platform, and application teams to define standards, implement tooling, and establish operational practices that improve service stability, incident response maturity, and end-to-end visibility across systems.

This role is best suited to someone who has helped design or scale SRE and observability capabilities in large, distributed, and regulated environments.

Key Responsibilities

Lead the design, build, and rollout of an enterprise-wide observability capability
Define observability standards, including metrics, logging, tracing, and alerting frameworks
Establish a Site Reliability Engineering (SRE) operating model and engineering practices
Develop and embed operational resilience and service reliability measurement frameworks
Design requirements-based architecture for observability and reliability tooling
Improve incident, problem, and outage management maturity across technology teams
Partner with infrastructure, platform, and application support teams to embed SRE principles
Drive the transition from reactive operational support to proactive, metrics-driven engineering
Define and implement service level indicators (SLIs) and service level objectives (SLOs)
Support tooling selection, integration, and optimisation across observability platforms
Contribute to improving overall operational resilience within a global distributed environment

Key Skills & Experience

Proven experience as a Senior SRE Lead, Principal Engineer, or Observability SME in enterprise environments
Strong background in designing and implementing observability platforms (metrics, logs, tracing, monitoring)
Experience building or scaling SRE capabilities within large, complex organisations
Strong understanding of operational resilience frameworks and reliability engineering principles
Experience working in private cloud or hybrid enterprise infrastructure environments
Strong knowledge of incident management, problem management, and operational maturity models
Ability to define and implement SLIs, SLOs, and error budgets
Experience working across distributed global technology estates
Strong stakeholder management skills with the ability to influence engineering and infrastructure teams
Experience in transitioning organisations from reactive support models to proactive engineering-led operations
Strong architectural mindset with experience in requirements-based design for observability solutions

Environment

Enterprise SRE capability buildout (greenfield / early maturity stage)
Observability suite implementation across multiple platforms and teams
Private cloud environment with global distributed infrastructure footprint
High complexity, multi-team engineering landscape
Focus on operational resilience and service reliability uplift
