Site Reliability Engineer
Site Reliability Engineer (Hybrid – London) | RegTech Innovator | AWS, Terraform, Kubernetes
Location: London (Hybrid – 2-3 days in office)
Are you passionate about scalable infrastructure and modern DevOps practices? Want to make a tangible impact in a fast-growing RegTech company that’s transforming how businesses navigate regulatory compliance?
Join us as a Site Reliability Engineer (SRE) and help build and operate the infrastructure that powers cutting-edge compliance solutions used by global financial institutions.
What You'll Do
- Maintain and improve our AWS-based infrastructure using Infrastructure-as-Code (Terraform)
- Support and scale Kubernetes clusters hosting critical microservices
- Design and enhance observability, alerting, and incident response processes
- Collaborate closely with engineers to ensure systems are reliable, secure, and performant
- Lead root cause analysis for production incidents and help prevent recurrence
- Build tooling to automate repetitive tasks and improve deployment pipelines (CI/CD)
- Participate in the on-call rotation and provide hands-on production support
Tech Stack
- Cloud: AWS (EKS, ECS, RDS, IAM, Lambda, etc.)
- IaC: Terraform, Terragrunt
- Containerisation: Docker, Kubernetes (EKS)
- CI/CD: GitHub Actions, Argo CD, Helm
- Monitoring: Prometheus, Grafana, CloudWatch, OpenTelemetry
- Languages: Python, Bash, Go (bonus)
What We're Looking For
- Strong experience in SRE, DevOps, or Production Engineering roles
- Proven hands-on skills with AWS, Terraform, and Kubernetes
- Experience with production support, incident management, and root cause analysis (RCA) practices
- Comfortable working in a fast-paced startup or scale-up environment
- Strong problem-solving mindset and a passion for automation
- Company: Explore Group
- Location: Slough, Berkshire, UK (Hybrid / WFH Options)
- Employment Type: Full-time