Senior AI Infrastructure Engineer (OpenStack & Kubernetes)

Location: Remote (UK or EU Preferred)
Sector: High-Performance GPU Cloud Computing

The Opportunity

I am representing a fast-growing, international scale-up that is building next-generation GPU cloud infrastructure. This company is the powerhouse behind a high-performance platform designed specifically for the most demanding AI, Machine Learning, and HPC workloads.

As they scale their global footprint to meet massive demand, they are seeking a Senior AI Infrastructure Engineer who enjoys deep technical autonomy. This is a role for a specialist who wants to move fast, solve complex problems, and have direct ownership over the stability and scalability of business-critical systems.

What You’ll Be Doing

Owning Infrastructure: Designing, deploying, and operating OpenStack and Kubernetes clusters optimized for multi-tenant GPU workloads.
Driving Automation: Building and maintaining infrastructure-as-code and GitOps practices to ensure seamless scalability.
Optimizing Performance: Enabling reliable workload scheduling through Kubernetes-native tooling, container runtime optimization, and NVIDIA integrations.
Ensuring Resilience: Maintaining high availability and observability through proactive monitoring, logging, and incident response.
Strengthening Security: Implementing strong controls, including RBAC and network policies, to ensure tenant isolation.
Cross-Team Collaboration: Working closely with DevOps, AI, and Product teams to align infrastructure capabilities with customer needs.

The Ideal Profile

OpenStack Expert: Significant hands-on experience operating OpenStack in a production environment.
K8s Specialist: Strong experience running production-grade Kubernetes, ideally in bare-metal or private cloud setups.
Systems Generalist: A solid grounding in Linux, networking, and storage with a practical approach to troubleshooting.
Modern Workflows: Experience with infrastructure automation, CI/CD, and Git-based workflows.
Scale-up Mindset: The ability to thrive in a fast-moving environment with a strong sense of accountability.

Nice to Have

Exposure to GPU-based infrastructure, large-scale compute platforms, or HPC.
Familiarity with advanced networking technologies.
Contributions to open-source or cloud-native communities.

What’s on Offer?

Impact: The opportunity to make a visible, meaningful impact on a platform used by teams running compute-heavy applications.
Flexibility: Flexible working arrangements, including remote or hybrid options.
Growth: Clear career progression and the chance to help shape the company's culture and future.
Culture: A collaborative, transparent, and international culture built on trust.
Benefits: Competitive salary, annual discretionary bonus, 25 days holiday (plus public holidays), and wellbeing benefits.
