Site Reliability Engineer
Central London (Hybrid – 3 days per week in the office)
£65,000 – £75,000 per annum + Excellent Benefits
We’re working with an innovative software company that’s scaling its platform to support rapid customer growth and product expansion. They’re looking for a Site Reliability Engineer (SRE) to join their platform team and help build the foundations for high performance, resilience, and automation.
You’ll play a key role in ensuring the reliability, scalability, and security of cloud-based systems that power their core software products. Expect a collaborative engineering culture, modern cloud-native stack, and plenty of freedom to influence tooling, architecture, and reliability practices.
If you’re passionate about automation, observability, and designing systems that just don’t fail, this is the perfect environment for you.
Tech Stack
- Cloud: AWS (EC2, RDS, S3, IAM, Lambda, CloudWatch)
- Containerisation & Orchestration: Docker, Kubernetes (EKS)
- Infrastructure as Code: Terraform
- Configuration Management: Ansible
- Monitoring & Observability: Prometheus, Grafana, ELK Stack
- CI/CD: GitHub Actions
- Scripting & Automation: Python, Bash, or Go
What You’ll Be Doing
- Designing and maintaining reliable, scalable, and secure infrastructure for production systems.
- Automating operational tasks and improving system efficiency.
- Implementing observability tooling to monitor system health, performance, and capacity.
- Working closely with development teams to integrate reliability and performance into the software lifecycle.
- Managing and evolving CI/CD pipelines to ensure smooth deployments and rollbacks.
- Contributing to incident response, post-mortems, and reliability improvements.
- Championing SRE principles such as error budgets, SLIs/SLOs, and automation-first thinking.
What We’re Looking For
- Strong experience running cloud infrastructure (AWS preferred) in production.
- Proven background in Kubernetes operations (EKS, Helm, or similar).
- Solid knowledge of monitoring, alerting, and logging (Grafana, Prometheus, ELK).
- Hands-on experience with Terraform and CI/CD tooling.
- Strong scripting or development background (Python, Go, or similar).
- Excellent troubleshooting skills and a proactive, problem-solving mindset.
- Passion for reliability, performance, and developer experience.
Why Join
- Join a fast-moving software company with a modern cloud-native engineering culture.
- Influence how reliability and performance are engineered at scale.
- Work with talented developers and DevOps engineers in a collaborative environment.
AWS | Site Reliability | SRE | Cloud | Kubernetes | Terraform | CI/CD | Observability | Python | Go | Automation
Click “APPLY NOW” to be considered for this position!
Follow Revybe IT Recruitment to stay up to date with the latest Cloud, Platform & SRE opportunities.
- Company: Revybe IT Recruitment Ltd
- Location: City of London, London, England, United Kingdom
- Employment Type: Full-Time
- Salary: £65,000 – £75,000 per annum