Slurm Workload Manager Jobs in London

7 of 7 Slurm Workload Manager Jobs in London

HPC Engineer

London, United Kingdom
LinuxRecruit
the future of healthcare today. This company is on the hunt for HPC Engineers to power their 25 Petabyte system. Sound good? Well, there's more! Imagine working with Slurm clusters and GPFS storage, all while being an integral part of groundbreaking translational research. You will work in a dynamic team of five, where your hands-on expertise will support …
Employment Type: Permanent
Salary: GBP Annual
Posted:

HPC Research Engineer

London, United Kingdom
LinuxRecruit
research engineer, you will play a pivotal role in managing and optimising a large-scale infrastructure. Your expertise in Linux systems, along with experience in High-Performance Computing (HPC), Slurm workload management, and advanced storage solutions, will be essential to ensuring smooth and efficient operations. You'll be working alongside some of the brightest minds in research, directly …
Employment Type: Permanent
Salary: GBP Annual
Posted:

Member of Technical Staff, Post-Training

London, United Kingdom
Cohere
if you have: Extremely strong software engineering skills. Proficiency in Python and related ML frameworks such as JAX, PyTorch and XLA/MLIR. Experience with distributed training infrastructures (Kubernetes, Slurm) and associated frameworks (Ray). Experience using large-scale distributed training strategies. Hands-on experience training large models at scale. Hands-on experience with the post-training phase …
Employment Type: Permanent
Salary: GBP Annual
Posted:

Senior Specialist Solutions Architect, AI Accelerator Specialist, Amazon Global Sales, AWS

London, United Kingdom
Amazon
based on Nvidia and AWS infrastructure - Acceleration Technologies - Optimization of Nvidia and Trainium-based GPU cluster architectures for ML/AI applications using CUDA/Neuron/EKS/Slurm - Performance Tuning - Maximizing efficiency and throughput across compute-intensive tasks, with knowledge of Nvidia NVLink, AWS Neuron and AWS EFA technologies - Cost Optimization - Strategic resource allocation on a range …
Employment Type: Permanent
Salary: GBP Annual
Posted:

Software Development Engineer II, AWS SageMaker Training

London, United Kingdom
Amazon
support for customers who require specialized security solutions for their cloud services. At AWS AI, we want to make it easy for our customers to train their deep learning workloads in the cloud. With Amazon SageMaker Training, we are building customer-facing services to empower data scientists and software engineers in their deep learning endeavors. As our customers rapidly … next-generation AI compute platform that's optimized for LLMs and distributed training. … are essential to success in this role. You have solid experience in multi-threaded asynchronous C++ or Go development. You have prior experience in one of: resource orchestrators like Slurm/Kubernetes, high performance computing, building scalable systems, experience in large language model training. This is a great team to come to have a huge impact on AWS and …
Employment Type: Permanent
Salary: GBP Annual
Posted:

HPC Linux Operations Engineer Chicago & London

London, United Kingdom
Jump Trading, LLC
compute, storage, and interconnects. Technologies involved include RDMA fabrics, parallel filesystems, HPC batch schedulers, FUSE filesystems, internal Jump software, multi-vendor hardware, cybersecurity requirements, a challenging and unpredictable client workload, and high user expectations. Solve problem reports and questions posed by members of Jump's research community, escalating as needed and managing the entire problem lifecycle. Respond to alerts … desire for operational work as primary job function. 2+ years of professional experience with Linux systems. High performance computing (HPC), including parallel filesystems (e.g., Lustre, GPFS), batch systems (e.g., Slurm, Grid Engine), and high-performance network interconnects experience is a plus, but not required. High proficiency with at least one programming/scripting language (e.g., Go, Python, C) and …
Employment Type: Permanent
Salary: GBP Annual
Posted:

Lead HPC Engineer

London, United Kingdom
LinuxRecruit
a Lead HPC Engineer, you'll be at the forefront of designing, optimising, and managing advanced computational infrastructure. You'll have a solid grasp of all things HPC, Linux, Slurm, and storage systems (bonus points if you're familiar with GPFS). Your expertise will ensure the systems are reliable, scalable, and high-performing, ready to support researchers in … about emerging technologies will be key to keeping our infrastructure at the forefront of innovation. We're looking for someone with deep expertise in HPC environments, including: Linux systems, workload management, parallel storage, and high-speed networking. You'll also bring strong leadership skills, inspiring and managing teams, while rolling up your sleeves to tackle technical challenges. Clear communication …
Employment Type: Permanent
Salary: GBP Annual
Posted: