AI Platform Engineer

Openings: 3

Location: London (Hybrid)

Employment Type: Full-time

The Opportunity

We are building a next-generation computational platform that powers large-scale machine learning, data science, and scientific discovery. Our teams work at the intersection of cloud infrastructure, high-performance computing, and data engineering, enabling researchers and ML practitioners to move faster—from experimentation to real-world impact.

This role sits at the heart of the platform: designing, scaling, and operating systems that support GPU-accelerated workloads, batch pipelines, and data-intensive applications.

Who This Role Is For (Choose Your Strength)

We're open to different profiles and will shape the role around your strengths:

AI Platform / ML Infrastructure Engineers

  • Kubernetes-based compute platforms
  • GPU scheduling, batch & distributed workloads
  • Supporting ML training, inference, and experimentation at scale

HPC / GPU Engineers

  • Job schedulers, MPI, multi-node workloads
  • Hybrid cloud and on-prem compute
  • Performance, reliability, and cost optimisation

Strong Data Engineers

  • Large-scale data pipelines and data platforms
  • Data reliability, orchestration, and observability
  • Close collaboration with ML and research teams

What You'll Work On

  • Designing and evolving Kubernetes-based compute platforms across hybrid and multi-cloud environments
  • Building and operating GPU-enabled infrastructure for ML and scientific workloads
  • Developing and maintaining core platform services, APIs, and internal tooling
  • Improving CI/CD pipelines and Infrastructure-as-Code workflows
  • Implementing monitoring, alerting, and reliability engineering practices
  • Applying best practices for security, data protection, backup, and disaster recovery
  • Partnering closely with ML engineers, data scientists, and researchers to unblock compute and data challenges

What We're Looking For

  • Strong experience in one or more of:
      • Platform / infrastructure engineering
      • ML infrastructure or MLOps
      • HPC or GPU compute
      • Data engineering at scale
  • Solid experience with Linux and cloud environments
  • Hands-on work with Kubernetes or distributed systems
  • Experience with Python (or similar) for automation or services
  • Familiarity with CI/CD, Git-based workflows, and automation
  • Strong problem-solving skills and a collaborative mindset

Bonus

  • Terraform or other IaC tools
  • Slurm, Kueue, Ray, Spark, or similar systems
  • GPU tooling (CUDA, NVIDIA operators, schedulers)
  • Experience supporting ML training or data science teams

Job Details

Company: Hlx Life Sciences

Location: Stockport, Greater Manchester, UK (Hybrid / Remote Options)

Employment Type: Full-time