Lead DevOps Engineer
Europe / UK (Remote)
We’re building a next-generation AI platform that powers large-scale generative media and enterprise-grade creative tools, and we need a DevOps leader to own the infrastructure behind it.
This is not a support function.
This is core platform ownership at the intersection of GenAI, GPU compute, and production-scale systems.
What you’ll own:
- Architect and scale hybrid cloud + on-prem GPU infrastructure
- Lead platform engineering across Kubernetes + Slurm environments
- Build and optimise CI/CD for model training & serving pipelines
- Drive reliability, observability, and incident management
- Own vendor strategy (GCP, Datadog, etc.) and technical roadmap
- Optimise GPU workloads, latency, and cost (FinOps focus)
What you bring:
- 8+ years in DevOps / SRE / Platform Engineering
- Deep expertise in Kubernetes, Terraform, and CI/CD pipelines
- Strong experience running production ML or data-heavy systems
- Solid grounding in cloud (AWS/GCP/Azure) + security best practices
- Experience leading teams or acting as a technical anchor
Bonus if you’ve worked with:
- GPU clusters / model serving / ML pipelines
- HPC, VFX, or high-performance workloads
- Tools like Prometheus, Grafana, Argo, Airflow, Kafka
You’ll be joining a team building production-grade AI systems used in real-world creative and enterprise environments, where performance, scale, and reliability actually matter.
If you’re interested, hit reply and let’s discuss next steps.