Site Reliability Engineer

Key info:

Permanent · London · £55,000–£70,000 base
Location: City of London
Hybrid pattern: 3 days onsite a week

The role:

A global financial markets client is looking for an SRE to join its Production Services team. You'll support mission-critical trading, clearing, and market data platforms - environments where reliability genuinely isn't optional.

The role blends application support, platform engineering, and SRE practice. It suits someone who prefers automation and observability to reactive firefighting.

Responsibilities:

Managing OpenShift and Kubernetes clusters across physical, virtual, and containerised environments
Operating observability stacks (Grafana, Prometheus, Splunk) and driving proactive monitoring
Automating operational tasks using Python, Bash, or PowerShell
Supporting CI/CD pipelines (Bamboo, Bitbucket) and infrastructure-as-code (IaC) delivery via Ansible Tower
Responding to production incidents and contributing to problem and change management
Participating in disaster recovery (DR) exercises and the on-call rotation

Key requirements:

Hands-on Kubernetes and/or OpenShift experience in production
Scripting skills in Python, Bash, or PowerShell
Familiarity with observability tooling and SRE principles
SQL and database knowledge (MySQL, Oracle, or similar)
Experience supporting .NET, Java, or microservices applications

It would be great if you had:

ITIL v3/v4 certification or ServiceNow experience
Knowledge of Swift payment flows or financial clearing processes

£55,000–£70,000 base depending on experience, plus bonus & benefits. Hybrid working — 3 days per week in the City of London office.

If this looks relevant, feel free to apply or message me directly!
