Lead DevOps Engineer

Europe / UK (Remote)

We’re building a next-generation AI platform powering large-scale generative media and enterprise-grade creative tools, and we need a DevOps leader to own the infrastructure behind it.

This is not a support function.

This is core platform ownership at the intersection of GenAI, GPU compute, and production-scale systems.

What you’ll own:

  • Architect and scale hybrid cloud + on-prem GPU infrastructure
  • Lead platform engineering across Kubernetes + Slurm environments
  • Build and optimise CI/CD for model training & serving pipelines
  • Drive reliability, observability, and incident management
  • Own vendor strategy (GCP, Datadog, etc.) and technical roadmap
  • Optimise GPU workloads, latency, and cost (FinOps focus)

What you bring:

  • 8+ years in DevOps / SRE / Platform Engineering
  • Deep expertise in Kubernetes, Terraform, CI/CD pipelines
  • Strong experience running production ML or data-heavy systems
  • Solid grounding in cloud (AWS/GCP/Azure) + security best practices
  • Experience leading teams or acting as a technical anchor

Bonus if you’ve worked with:

  • GPU clusters / model serving / ML pipelines
  • HPC, VFX, or high-performance workloads
  • Tools like Prometheus, Grafana, Argo, Airflow, Kafka

You’ll be joining a team building production-grade AI systems used in real-world creative and enterprise environments - where performance, scale, and reliability actually matter.

If you’re interested, hit reply and let’s discuss next steps.

Job Details

Company: DeepRec.ai
Location: United Kingdom