Software Engineer

🚀 GPU Infrastructure / Performance Engineer

📍 London (Onsite) | 🌍 Visa Sponsorship + Relocation

Join a frontier AI company, backed by NVIDIA, that is building large-scale open-weight foundation models. You will work alongside researchers and engineers from DeepMind, OpenAI, Meta, Anthropic, and Google Brain.

⚡ What You’ll Do

Optimise GPU performance and training efficiency across clusters of 1,000+ GPUs
Improve utilisation, throughput, and reliability across distributed training infrastructure
Build tooling for orchestration, monitoring, scheduling, and observability
Work closely with research teams to accelerate large-scale model training

🔧 What They’re Looking For

Deep GPU infrastructure / distributed systems experience
Strong knowledge of the GPU training and inference stack, such as CUDA, NCCL, PyTorch, DeepSpeed, JAX, Megatron-LM, and vLLM
Experience operating large-scale GPU clusters (1,000+ GPUs)
Kubernetes, Slurm, or similar orchestration expertise

BONUS: Experience working with NVIDIA Blackwell hardware (B200, B300, GB200, GB300)

💰 Package

Salary open to candidate expectations
Meaningful startup equity
Full visa sponsorship + relocation support
