Azure Site Reliability Engineer
Azure Site Reliability Engineer
£75,000 - £95,000 + benefits
London (Hybrid)
We’re looking for an Azure SRE to support and operate high-availability platforms processing tens of thousands of transactions per second.
You’ll work closely with engineering teams to maintain reliable, scalable infrastructure and improve platform performance using SRE practices.
Responsibilities
- Operate and support Azure-based infrastructure
- Manage infrastructure using Terraform
- Maintain and support Kubernetes clusters
- Support distributed platforms including Kafka (or similar messaging tools)
- Define and track SLIs, SLOs and error budgets
- Improve monitoring, alerting and incident response
- Reduce operational toil through automation
- Help improve MTTR and MTTD
Requirements
- Strong experience with Microsoft Azure
- Experience with Terraform / Infrastructure as Code
- Hands-on experience with Kubernetes
- Experience with Kafka or other Messaging tools.
- Good understanding of SRE concepts including SLIs, SLOs, error budgets and toil reduction
- Experience working with high-throughput distributed systems
Azure Site Reliability Engineer
£75,000 - £95,000 + benefits
London (Hybrid)