Senior AI Infrastructure Engineer (OpenStack & Kubernetes)

Location: Remote (UK or EU preferred)
Sector: High-Performance GPU Cloud Computing

The Opportunity

I am representing a fast-growing, international scale-up that is building next-generation GPU cloud infrastructure. Its high-performance platform is designed specifically for the most demanding AI, machine learning, and HPC workloads.

As they scale their global footprint to meet massive demand, they are seeking a Senior AI Infrastructure Engineer who enjoys deep technical autonomy. This is a role for a specialist who wants to move fast, solve complex problems, and take direct ownership of the stability and scalability of business-critical systems.

What You’ll Be Doing

  • Owning Infrastructure: Designing, deploying, and operating OpenStack and Kubernetes clusters optimized for multi-tenant GPU workloads.
  • Driving Automation: Building and maintaining infrastructure-as-code and GitOps practices to ensure seamless scalability.
  • Optimizing Performance: Enabling reliable workload scheduling through Kubernetes-native tooling, container runtime optimization, and NVIDIA integrations.
  • Ensuring Resilience: Maintaining high availability and observability through proactive monitoring, logging, and incident response.
  • Strengthening Security: Implementing strong controls, including RBAC and network policies, to ensure tenant isolation.
  • Collaborating Across Teams: Working closely with DevOps, AI, and Product teams to align infrastructure capabilities with customer needs.

The Ideal Profile

  • OpenStack Expert: Significant hands-on experience operating OpenStack in a production environment.
  • K8s Specialist: Strong experience running production-grade Kubernetes, ideally in bare-metal or private cloud setups.
  • Systems Generalist: A solid grounding in Linux, networking, and storage with a practical approach to troubleshooting.
  • Modern Workflows: Experience with infrastructure automation, CI/CD, and Git-based workflows.
  • Scale-up Mindset: The ability to thrive in a fast-moving environment with a strong sense of accountability.

Nice to Have

  • Exposure to GPU-based infrastructure, large-scale compute platforms, or HPC.
  • Familiarity with advanced networking technologies.
  • Contributions to open-source or cloud-native communities.

What’s on Offer?

  • Impact: The opportunity to make a visible, meaningful impact on a platform used by teams running compute-heavy applications.
  • Flexibility: Flexible working arrangements, including remote or hybrid options.
  • Growth: Clear career progression and the chance to help shape the company's culture and future.
  • Culture: A collaborative, transparent, and international culture built on trust.
  • Benefits: Competitive salary, annual discretionary bonus, 25 days holiday (plus public holidays), and wellbeing benefits.

Job Details

Company
Hamilton Barnes 🌳
Location
United Kingdom
Posted