C-level executives. This requires deep familiarity across the stack: compute infrastructure (Amazon EC2, Amazon EKS), ML frameworks (PyTorch, JAX), orchestration layers (Kubernetes and Slurm), parallel computing (NCCL, MPI), and MLOps, through to Amazon SageMaker HyperPod and Amazon Bedrock, as well as target use cases in the cloud. This is an …
Stevenage, Hertfordshire, United Kingdom Hybrid / WFH Options
WISE Campaign
CCP-EM, Doppio, and CryoCloud. Experience with modern software development tools and practices, including object-oriented programming, Git/GitHub, DevOps tools, AI tools, Slurm, and Google Cloud. Web development, including browser-based visualisation and plug-ins. Proven expertise in leading the analysis and interpretation of scientific data. Closing …
London (City of London), South East England, United Kingdom
Ncounter Technology Recruitment
Willingness to engage in technical discussion and commit to producing high-quality code. Enthusiasm to learn and grow in your role. Any understanding of Slurm and HPC is a bonus. Developing in Python within an SRE team spanning the business, with project and product work, there is a huge …
to omics datasets. Ability to read machine learning research articles and implement the algorithms described. Experience of working with high-performance computing clusters (Bash, Slurm, etc.). Good understanding of MLOps for experiment tracking, model and data versioning, hyperparameter tuning and results visualisation. Experience in database technologies: SQL, NoSQL. What …
software engineering skills. Proficiency in Python and related ML frameworks such as JAX, PyTorch and XLA/MLIR. Experience with distributed training infrastructures (Kubernetes, Slurm) and associated frameworks (Ray). Experience using large-scale distributed training strategies. Hands-on experience of training large models at scale and having contributed …
be all the more interesting if you also have: Experience in an AI/ML environment. Experience of high-performance computing (HPC) systems and workload managers (Slurm). Worked with modern AI-oriented solutions (Fluidstack, CoreWeave, Vast). Benefits: Competitive cash salary and equity. Food: Daily lunch vouchers. …
performance of models on accelerated computing (GPU, TPU, AI ASICs) clusters with high-speed networking. Experience scaling model training and inference using technologies like Slurm, ParallelCluster, Amazon SageMaker. Experience in developing and deploying large-scale machine learning or deep learning models and/or systems into production, including batch …
with an open mind and a positive attitude. We value effectiveness, competence, and a growth mindset. Overview: We are seeking a skilled Technical Account Manager (TAM) to serve as a trusted advisor and strategic partner to our diverse customer base. In this role, you will be responsible for building … Technology, Engineering, or a related field (or equivalent experience). 3+ years of experience in a customer-facing technical role, such as Technical Account Manager, Solutions Architect, or Cloud Support Engineer. Strong understanding of cloud architecture, DevOps practices, and tools such as Docker, Kubernetes, Slurm, CI/CD …
Engineer. Location: Remote – Sheffield. Salary: Up to £45,000 plus an excellent benefits package. HPC, Linux Systems Admin, Supercomputing, Scripting, UNIX, Linux, Nvidia, GPU, Slurm, Torque, GPFS, Lustre. Chapman Tate Associates seeks a Linux Technical Consultant to join this established technology house that delivers a range of AI, ML … drive continuous improvement. Technical and soft skills needed will include: Proficiency in Linux system administration and shell scripting. Experience with HPC technologies such as Slurm, Torque, OpenMPI, GPFS, Lustre, etc. Strong networking skills, including TCP/IP, DNS, DHCP, and firewall management. Familiarity with monitoring and performance tuning tools. …
Cambridge, England, United Kingdom Hybrid / WFH Options
Gazelle Global
you'll play a key role in ensuring the performance, availability, and scalability of HPC systems used by engineering teams across the organization. From managing workload schedulers to enhancing security and performance tuning, you will be at the heart of our mission to deliver world-class compute infrastructure. Key Responsibilities … and performance tuning. Work closely with engineering teams to ensure optimal usage of HPC systems for compute-intensive workloads. Manage and support HPC workload management tools, such as IBM Spectrum LSF. Automate common administrative and maintenance tasks using Shell, Bash, or Python scripting. Ensure the HPC … plus. In-depth knowledge of HPC infrastructure, including cluster management and optimization. Hands-on experience with HPC job schedulers such as IBM Spectrum LSF, Slurm, or similar. Strong scripting skills in Shell, Bash, Python, or Perl. Experience with cloud platforms (AWS, GCP, Azure) is a plus. Familiarity with …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Gazelle Global
you'll play a key role in ensuring the performance, availability, and scalability of HPC systems used by engineering teams across the organization. From managing workload schedulers to enhancing security and performance tuning, you will be at the heart of our mission to deliver world-class compute infrastructure. Key Responsibilities … and performance tuning. Work closely with engineering teams to ensure optimal usage of HPC systems for compute-intensive workloads. Manage and support HPC workload management tools, such as IBM Spectrum LSF. Automate common administrative and maintenance tasks using Shell, Bash, or Python scripting. Ensure the HPC … plus. In-depth knowledge of HPC infrastructure, including cluster management and optimization. Hands-on experience with HPC job schedulers such as IBM Spectrum LSF, Slurm, or similar. Strong scripting skills in Shell, Bash, Python, or Perl. Experience with cloud platforms (AWS, GCP, Azure) is a plus. Familiarity with …
Cambridge, South West England, United Kingdom Hybrid / WFH Options
Gazelle Global
you'll play a key role in ensuring the performance, availability, and scalability of HPC systems used by engineering teams across the organization. From managing workload schedulers to enhancing security and performance tuning, you will be at the heart of our mission to deliver world-class compute infrastructure. Key Responsibilities … and performance tuning. Work closely with engineering teams to ensure optimal usage of HPC systems for compute-intensive workloads. Manage and support HPC workload management tools, such as IBM Spectrum LSF. Automate common administrative and maintenance tasks using Shell, Bash, or Python scripting. Ensure the HPC … plus. In-depth knowledge of HPC infrastructure, including cluster management and optimization. Hands-on experience with HPC job schedulers such as IBM Spectrum LSF, Slurm, or similar. Strong scripting skills in Shell, Bash, Python, or Perl. Experience with cloud platforms (AWS, GCP, Azure) is a plus. Familiarity with …
Qube Research & Technologies (QRT) is a global quantitative and systematic investment manager, operating in all liquid asset classes across the world. We are a technology- and data-driven group implementing a scientific approach to investing. Combining data, research, technology and trading expertise has shaped QRT's collaborative mindset which … robust distributed systems at scale. Flexible in adopting new technologies and a proactive approach to continuous skill development. (Preferred) Knowledge of scheduling systems (e.g. Slurm) and workload management in large HPC environments. (Preferred) Experience optimizing GPU-accelerated workloads (CUDA, NCCL). (Bonus) Experience working within or supporting quantitative research …
London, South East England, United Kingdom Hybrid / WFH Options
The Engage Partnership Recruitment
hear from you.) 💡 The Stack & Environment: A diverse, modern environment spanning: Linux, Windows, macOS, Microsoft 365, Azure AD, Intune, Teams, NICE DCV, Nvidia CUDA, Slurm, Jira Service Desk, Terraform, Azure Resource Manager. 💡 What We’re Looking For: 2+ years of experience administering HPC infrastructure. Hands-on experience with … InfiniBand, Slurm, and GPU compute platforms (e.g. CUDA). Proficiency in systems administration and troubleshooting. Strong documentation habits and a customer-focused mindset. Experience with VDI solutions and monitoring tools. 💡 Bonus Points: Familiarity with Jira Service Desk and Terraform scripting. Exposure to SSL management, infrastructure-as-code, or cloud database …
engineering and building a high-performance culture within a team. The best of both worlds. You know how to engineer HPC clusters and are confident with Slurm for scheduling and GPFS for storage. Linux is the bread and butter and you have exposure to the cloud. If you have experience with …