Software Engineer

🚀 GPU Infrastructure / Performance Engineer

📍 London (Onsite) | 🌍 Visa Sponsorship + Relocation

Join a frontier AI company backed by NVIDIA, building large-scale open-weight foundation models alongside researchers and engineers from DeepMind, OpenAI, Meta, Anthropic, and Google Brain.

⚡ What You’ll Do

  • Optimise GPU performance and training efficiency across 1,000+ GPU clusters
  • Improve utilisation, throughput, and reliability across distributed training infrastructure
  • Build tooling for orchestration, monitoring, scheduling, and observability
  • Work closely with research teams to accelerate large-scale model training

🔧 What They’re Looking For

  • Deep GPU infrastructure / distributed systems experience
  • Strong knowledge of CUDA, NCCL, PyTorch, DeepSpeed, JAX, Megatron-LM, vLLM, etc.
  • Experience operating large-scale GPU clusters (1,000+ GPUs)
  • Kubernetes, Slurm, or similar orchestration expertise

BONUS: Experience working on NVIDIA Blackwell chips (B200, B300, GB200, GB300)

💰 Package

  • Salary open to candidate expectations
  • Meaningful startup equity
  • Full visa sponsorship + relocation support

Job Details

Company
Acceler8 Talent
Location
United Kingdom
Posted