Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
IBM Spectrum LSF administration and any other cloud-based HPC deployments. Deep technical knowledge in handling large distributed Linux systems. Experience working with network storage solutions such as NetApp, Lustre, Weka, LakeFS, etc. Experience in networking technologies and services such as load balancing, DNS, packet tracing and debugging, etc. Deep understanding of LDAP implementations such as Oracle Unified Directory (OUD …
communication skills Your technical skills: Strong Linux system expertise, with good experience with distribution management (Red Hat, ) HA clusters Strong knowledge of HPC systems and underlying components Parallel filesystems (Lustre, GPFS, ) High-speed network (InfiniBand, Omni-Path, Slingshot ) DevOps: Ansible, Git, Puppet, Bash or Python scripting, Parallel computing and development software stacks Big Data databases: Elastic/OpenSearch Monitoring tools and …
least one programming language, preferably in Go. Expertise in patch and OS management at scale Experienced in Linux performance benchmarking, tuning, and troubleshooting Familiarity with distributed storage solutions like Lustre and Ceph Knowledgeable in networking technologies and protocols, including Ethernet and ideally InfiniBand Proactive and solution-oriented mindset Excellent problem-solving skills Initiative-driven and able to take ownership What …
Skills You'll Need: A desire for operational work as primary job function 2+ years of professional experience with Linux systems High performance computing (HPC), including parallel filesystems (e.g., Lustre, GPFS), batch systems (e.g., Slurm, Grid Engine), and high-performance network interconnects experience is a plus, but not required High proficiency with at least one programming/scripting language (e.g. …
FP16, BF16, INT8, etc.) GPU utilization profiling and tuning Inference workload modelling and scaling AI model deployment and performance optimization Storage Design and operation of parallel file systems (e.g. Lustre, GPFS) Integration and optimization of NVMe storage tiers Modeling storage throughput and demand for AI/HPC workloads …
core hours of 10am - 4pm, with no on-call demands, you'll dive deep into the world of cluster computing, Linux kernels, and cutting-edge storage tools like GPFS, Lustre, and Isilon. If you bring strong professional HPC experience, then I want to hear from you! Academic qualifications are secondary to your technical prowess. Let's connect and explore how …
Central London, London, United Kingdom Hybrid / WFH Options
STK Recruitment
FP16, BF16, INT8, etc.) GPU utilization profiling and tuning Inference workload modeling and scaling AI model deployment and performance optimization Storage Design and operation of parallel file systems (e.g. Lustre, GPFS) Integration and optimization of NVMe storage tiers Modeling storage throughput and demand for AI/HPC workloads We have multiple upcoming roles in the High-Performance Computing industry, so …
London, South East, England, United Kingdom Hybrid / WFH Options
Solutions Through Knowledge
FP16, BF16, INT8, etc.) GPU utilization profiling and tuning Inference workload modeling and scaling AI model deployment and performance optimization Storage Design and operation of parallel file systems (e.g. Lustre, GPFS) Integration and optimization of NVMe storage tiers Modeling storage throughput and demand for AI/HPC workloads We have multiple upcoming roles in the High-Performance Computing industry, so …
s largest and most critical customers Expected to work directly with customer administrative staff to solve issues Must be willing to quickly engage with customers to resolve problems Resolve Lustre file system issues on large, scalable customer systems and ensure customer satisfaction. Create test plans and procedures for customer upgrades and troubleshooting. Work with engineering for enhancing product quality using customer … problem solving. Proven skills and a solid team player. Good verbal and written communication skills (English, second language beneficial). Essential Technical Requirements 7+ years of experience working with Lustre or similar Parallel Filesystems; administration/implementation/support. Strong knowledge of Linux architecture and fundamentals. Good understanding of the technical fundamentals of the system infrastructure including Storage systems Linux …