Site Reliability Engineer
Key info:
- Permanent · London · £55,000–£70,000 base
- Location: City of London
- Hybrid pattern: 3 days onsite a week
The role:
A global financial markets client is looking for an SRE to join its Production Services team. You'll support mission-critical trading, clearing, and market data platforms, where reliability isn't optional.
The role blends application support, platform engineering, and SRE practice. It suits someone who prefers automation and observability to reactive firefighting.
Responsibilities:
- Managing OpenShift and Kubernetes clusters across physical, virtual, and containerised environments
- Operating observability stacks (Grafana, Prometheus, Splunk) and driving proactive monitoring
- Automating operational tasks using Python, Bash, or PowerShell
- Supporting CI/CD pipelines (Bamboo, Bitbucket) and IaC delivery via Ansible Tower
- Responding to production incidents and contributing to problem and change management
- Participating in DR exercises and the on-call rotation
Key Requirements:
- Hands-on Kubernetes and/or OpenShift experience in production
- Scripting skills in Python, Bash, or PowerShell
- Familiarity with observability tooling and SRE principles
- SQL and database knowledge (MySQL, Oracle, or similar)
- Experience supporting .NET, Java, or microservices applications
It would be great if you had:
- ITIL v3/v4 certification or ServiceNow experience
- Knowledge of SWIFT payment flows or financial clearing processes
£55,000–£70,000 base depending on experience, plus bonus and benefits. Hybrid working: 3 days per week in the City of London office.
If this looks relevant, feel free to apply or message me directly!