Senior SRE

Role: Senior Site Reliability Engineer

Salary: £70,000 – £75,000 + bonus + benefits

Location: London (Hybrid – 1 day per week in office)

We are working with a mission-led technology organisation that is continuing to scale a fully cloud-native platform as part of a major initiative. As they move away from traditional data centres, they are investing heavily in building a highly reliable, scalable and observable cloud platform.

As a Senior SRE, you will play a key role in ensuring the reliability and performance of systems that support millions of customers. This is a hands-on engineering role where you will work closely with platform, cloud and product teams to embed reliability into everything they build.

You will be solving complex engineering problems across distributed systems, helping improve observability, automation and incident response as the platform continues to scale.

Key Responsibilities

  • Designing, improving and automating monitoring and observability systems
  • Defining and managing SLOs, SLIs and error budgets
  • Supporting incident response, root cause analysis and post-mortems
  • Working with engineering teams to design resilient, fault-tolerant systems
  • Driving automation across infrastructure, deployments and operations
  • Contributing to capacity planning, performance tuning and cost optimisation
  • Participating in design reviews to improve reliability and scalability

Tech Environment

  • GCP and AWS
  • Kubernetes and containerised workloads
  • Terraform and Infrastructure as Code
  • Prometheus, Grafana, Datadog and modern observability tooling
  • CI/CD pipelines and automation tooling
  • Python, Go or similar programming languages
  • Distributed systems at scale

About You

  • Strong background in SRE, DevOps or Platform Engineering
  • Experience running and supporting production systems at scale
  • Strong understanding of observability, monitoring and reliability principles
  • Hands-on experience with cloud infrastructure and Kubernetes
  • Experience with Infrastructure as Code (Terraform or similar)
  • Comfortable debugging complex systems across infrastructure and application layers
  • Passionate about automation and improving engineering efficiency

This is a great opportunity to join a mission-led team building a platform with real-world impact and complex engineering challenges.

Job Details

Company: Pulse Recruit
Location: City of London, London, United Kingdom
Hybrid / Remote Options