Site Reliability Engineer

Site Reliability Engineer (Hybrid – London) | RegTech Innovator | AWS, Terraform, Kubernetes

Location: London (Hybrid – 2-3 days in office)

Are you passionate about scalable infrastructure and modern DevOps practices? Want to make a tangible impact in a fast-growing RegTech company that’s transforming how businesses navigate regulatory compliance?

Join us as a Site Reliability Engineer (SRE) and help build and operate the infrastructure that powers cutting-edge compliance solutions used by global financial institutions.

What You'll Do

  • Maintain and improve our AWS-based infrastructure using Infrastructure-as-Code (Terraform)
  • Support and scale Kubernetes clusters hosting critical microservices
  • Design and enhance observability, alerting, and incident response processes
  • Collaborate closely with engineers to ensure systems are reliable, secure, and performant
  • Lead root cause analysis for production incidents and help prevent recurrence
  • Build tooling to automate repetitive tasks and improve deployment pipelines (CI/CD)
  • Participate in the on-call rotation and provide hands-on production support

Tech Stack

  • Cloud: AWS (EKS, ECS, RDS, IAM, Lambda, etc.)
  • IaC: Terraform, Terragrunt
  • Containerisation: Docker, Kubernetes (EKS)
  • CI/CD: GitHub Actions, Argo CD, Helm
  • Monitoring: Prometheus, Grafana, CloudWatch, OpenTelemetry
  • Languages: Python, Bash, Go (bonus)

What We're Looking For

  • Strong experience in SRE, DevOps, or Production Engineering roles
  • Proven hands-on skills with AWS, Terraform, and Kubernetes
  • Experience with production support, incident management, and RCA practices
  • Comfortable working in a fast-paced startup or scale-up environment
  • Strong problem-solving mindset and a passion for automation

Company: Explore Group
Location: Slough, Berkshire, UK (Hybrid / WFH Options)
Employment Type: Full-time