Staff Software Engineer
Senior Software Engineer - AI Infrastructure
We’re working with a hyper-growth company.
They are building GPU infrastructure for the best AI labs and the biggest enterprises.
Their platform lets researchers focus on their models while drawing on the scale and reliability of the world’s best AI cloud platform.
The engineering team is small, ambitious, and deeply technical, building the orchestration systems that keep thousands of GPUs running at peak performance across global data centres.
This role sits at the heart of it, designing and scaling the systems that make AI at exascale possible.
What You’ll Focus On
You’ll help shape the orchestration layer for one of the most advanced AI compute environments in the world. Your work will involve:
- Designing core platform services for cluster provisioning, workload orchestration, and resource management APIs.
- Building integrations with schedulers (Kubernetes, Slurm) and container runtimes for reliable, high-performance GPU workloads.
- Developing automation for deployment, imaging, and multi-tenant resource allocation.
- Optimising scheduler performance and resource utilisation across diverse workloads.
- Building lifecycle management and automated remediation systems for large-scale clusters.
- Creating Infrastructure-as-Code modules to support rapid, repeatable deployments across varied environments.
About You
You’re a pragmatic systems builder who thrives in complexity, enjoys autonomy, and understands what it means to own production at scale. You’ll likely bring:
- 5+ years’ experience building distributed systems in Go within cloud-native environments.
- Deep hands-on experience with Kubernetes and container orchestration.
- A strong grasp of Infrastructure-as-Code (Terraform) and configuration management tools (Ansible, Puppet, or similar).
- Experience deploying and operating large-scale GPU clusters or HPC systems.
- Working knowledge of ML infrastructure and familiarity with GPU drivers, CUDA, and container runtimes.
- A low-ego, collaborative approach and a clear, proactive communication style.
In short: This is a role for engineers who like big systems, hard problems, and meaningful ownership. You’ll be joining a team operating at the intersection of software, hardware, and AI.
- Company: Motive Group
- Location: London, UK
- Posted