Senior Site Reliability Engineer
Principal Platform SRE
AI Infrastructure Startup
Gloucester/London - Remote First
£90-120k + Equity
Do you want to work for a pioneering tech company that's redefining how AI infrastructure is built and scaled?
Do you want to work in a business-critical technical role, using the latest technologies across DevOps, HPC and AI?
My client is a rapidly scaling HPC Cloud & Compute firm that is setting new standards in AI infrastructure. They are backed by leading investors and have just received Series B funding, which they will be putting towards a huge tech scale-out.
They are now looking for a Principal SRE to join their team and play a pivotal role in the ongoing scaling and optimisation of their platform.
Required Skills:
- Kubernetes (CNI, Cilium, Bare Metal)
- Automation (Ansible, Terraform)
- Linux (Kernel-level troubleshooting)
- Cloud (Public, On-Prem)
- SRE (SLAs, SLOs, 24/7 Support models)
Benefits:
- Remote First Working (on site a couple of times per month in London or Gloucester)
- Equity
- 30 days annual leave (+ public holidays)
- £400 work-from-home allowance
- Private medical insurance
- 12 Learning & Development days per year (+ dedicated budget)
- Flexi-Time
If you're ready to lead large-scale automation initiatives and shape the future of AI infrastructure, apply today!