skills Your technical skills: Strong Linux system expertise, with good experience of distribution management (Red Hat, …); HA clusters; strong knowledge of HPC systems and underlying components; parallel filesystems (Lustre, GPFS, …); high-speed networks (InfiniBand, Omni-Path, Slingshot, …); DevOps: Ansible, Git, Puppet, Bash or Python scripting, …; parallel computing and development software stacks; Big Data databases: Elastic/OpenSearch; monitoring tools and dashboards
London, South East, England, United Kingdom Hybrid / WFH Options
Solutions Through Knowledge
BF16, INT8, etc.); GPU utilization profiling and tuning; inference workload modeling and scaling; AI model deployment and performance optimization. Storage: design and operation of parallel file systems (e.g. Lustre, GPFS); integration and optimization of NVMe storage tiers; modeling storage throughput and demand for AI/HPC workloads. We have multiple upcoming roles in the High-Performance Computing industry, so if …
London, South East England, United Kingdom Hybrid / WFH Options
Sky
systems, with the ability to work both independently and collaboratively within a global team environment. What you'll do: Design, implement, and manage storage solutions using IBM Storage Scale (GPFS), Storage Protect (TSM), Spectrum Archive, Dell/EMC PowerScale (Isilon), Unity, PowerVault, Brocade and Cisco SAN switches, IBM V7000 and tape libraries, and NetBackup. Ensure optimal performance, reliability, and scalability … across the team. Troubleshoot and resolve complex storage-related issues, minimizing downtime and ensuring data integrity. What you'll bring: Extensive experience managing, supporting and troubleshooting: IBM Storage Scale (GPFS), Storage Protect (TSM), Spectrum Archive; Dell PowerScale (Isilon), Unity, PowerVault storage systems; Brocade and Cisco SAN switches; IBM V7000/V5000 storage systems; Veritas NetBackup. Proficiency in Linux, Windows, and …