Senior SRE Engineer

Senior SRE Engineer | Azure, Observability & Reliability Engineering | Platform Transformation in Financial Services

  • Location: London (Hybrid, typically 3 days onsite)
  • Permanent, Full-time
  • Salary: £80k–£90k + bonus + benefits
  • Visa sponsorship: Not available

The Role

You’ll join as the first dedicated SRE hire, with responsibility for establishing SRE practices across a live Azure-based platform and a new strategic platform being brought into service.

The role is focused on reliability, observability, incident management, resilience, and automation. You’ll help define how services are measured and operated, introducing practical improvements around SLIs, SLOs, error budgets, monitoring, and service ownership.

This is a hands-on role for someone who has done this before and can bring structure, prioritise well, and build an SRE capability in a pragmatic way.

Non-Negotiables

  • Site Reliability Engineering in production environments
  • Azure cloud environments in enterprise-scale businesses
  • SLO / SLI / error budget design and implementation
  • Observability tooling (Prometheus, Grafana, OpenTelemetry or similar)
  • Incident leadership for Sev1 / Sev2 incidents
  • Disaster recovery, resilience testing, RTO / RPO
  • Terraform infrastructure as code
  • CI/CD pipelines and engineering enablement
  • Strong scripting with PowerShell, Bash or Python
  • Experience improving reliability in hybrid estates (cloud + IaaS)
  • Ability to introduce new ways of working and build an SRE practice from scratch

A strong Azure background is important, but the priority is proven SRE capability and the ability to apply it effectively.

What You’ll Work With

  • Azure platform engineering
  • Azure Container Apps / cloud-native services
  • Terraform infrastructure as code
  • Prometheus monitoring
  • Grafana dashboards
  • OpenTelemetry tracing
  • Azure DevOps pipelines
  • GitHub Actions CI/CD
  • Windows Server and Linux estates
  • Service Bus, Event Hubs and Kafka
  • Incident management, runbooks, failover and resilience testing

Nice to Haves

  • Financial services or regulated environment experience
  • FCA / PRA operational resilience exposure
  • Payments or FX platform experience
  • Chaos engineering
  • FinOps or cloud cost awareness
  • Kubernetes exposure

Kubernetes knowledge is useful, but not essential.

Why Join / Projects

  • Establish the SRE capability from the ground up
  • Define and implement SLIs, SLOs and error budgets
  • Improve observability across platforms and services
  • Lead incident response and post-incident improvements
  • Drive resilience, failover and automation initiatives
  • Support the move toward a modern, reliability-first platform

You’ll play a key role in shaping how reliability is engineered across both the current platform and a new strategic platform being brought into production.

Employee Benefits

  • Pension
  • Private healthcare
  • Training and certification support

Job Details

Company
Prism Digital
Location
City of London, Greater London, UK
Hybrid / Remote Options