and troubleshooting interconnectivity issues. Linux Systems: Advanced Linux administration skills, including performance tuning and OS-level troubleshooting. Storage Systems: Experience with parallel/distributed file systems (e.g., Lustre, Ceph, WEKA, VAST). Automation & Scripting: Proficiency in Bash, Python, and tools like Ansible and Terraform for deployment and maintenance. Monitoring & Resilience: Experience implementing monitoring solutions and ensuring high availability and security …
Dorset, England, United Kingdom Hybrid / WFH Options
Hays Specialist Recruitment Limited
Strong Slurm configuration skills (partitions, priorities, resource management). Advanced Linux administration and performance tuning. Expertise in high-performance networking (InfiniBand, RoCE, RDMA). Experience with distributed file systems (Lustre, Ceph, WEKA, VAST). Proficiency in automation and scripting (Ansible, Terraform, Bash, Python). A solid understanding of monitoring, resilience, and security compliance. Excellent documentation skills and a passion for mentoring and knowledge sharing.
or leading software development team(s). Deep expertise in HPC and scale-out enterprise storage solutions. Knowledge of distributed file systems used for large-scale cluster computing (Lustre, GPFS, WEKA, S3, Ceph, etc.). Strong leadership, communication, and stakeholder management skills. Deep technical understanding of commodity storage technologies. Knowledge of storage industry trends. Deep technical understanding of server architecture, design, and …
or leading software development team(s). Deep expertise in HPC and scale-out enterprise storage solutions. Knowledge of distributed file systems used for large-scale cluster computing (Lustre, GPFS, WEKA, S3, Ceph, etc.). Strong leadership, communication, and stakeholder management skills. Track record of working successfully in a collaborative/cross-team environment. Benefits: Market-leading salary + bonuses + generous …