Lead HPC & AI Infrastructure Engineer
Dorset, England, United Kingdom
Hybrid / WFH Options
Hays Specialist Recruitment Limited
… infrastructure solutions across compute, storage, and networking
Producing detailed technical documentation: hardware specs, data centre layouts, cabling, power and cooling
Installing and tuning Linux-based operating systems and configuring SLURM job schedulers
Optimising high-speed networking technologies (InfiniBand, RoCE)
Automating deployments and maintenance using Ansible, Terraform, Bash, and Python
Troubleshooting complex distributed systems and mentoring junior engineers

This is … building systems that scale, this role is for you.

What you'll need to succeed
Proven experience designing and scaling large HPC clusters (hundreds to thousands of nodes)
Strong SLURM configuration skills: partitions, priorities, resource management
Advanced Linux administration and performance tuning
Expertise in high-performance networking (InfiniBand, RoCE, RDMA)
Experience with distributed file systems (Lustre, Ceph, WEKA, VAST …
Employment Type: Full-Time
Salary: £130,000 per annum
Posted: