London, England, United Kingdom Hybrid / WFH Options
PhysicsX Ltd
for computer vision, geometry processing, or scientific computing; software engineering concepts and best practices (e.g., versioning, testing, CI/CD, API design, MLOps); containerization and orchestration (Docker, Kubernetes, Slurm); writing pipelines and experiment environments, including running experiments in pipelines in a systematic way. What we offer: Be part of something larger: Make an impact and meaningfully shape an More ❯
London, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
days per week. This is a 6-month temporary contract, to start ASAP. Day rate: Competitive market rate. The right candidate should have a strong understanding of HPC (Slurm), including installation and configuration. Key Requirements: Strong understanding of Infrastructure (Azure, On-premises and other cloud techs) Knowledge of Cloud Platforms: Understanding of cloud platforms and Azure, in particular … R: The candidate should have a good understanding of R HPC Skills: The candidate should have a strong understanding of HPC (Slurm), including installation and configuration Experience with Python: The candidate should have experience with Python installation and configuration on Linux systems Associates should have a deep understanding of the Biostatistics and Life Science domain (especially Clinical) Basic More ❯
London, England, United Kingdom Hybrid / WFH Options
Project Recruit
market rate. Key Requirements: Strong understanding of Infrastructure (Azure, On-premises, and other cloud technologies) Knowledge of Cloud Platforms, especially Azure Proficiency in R programming HPC Skills: experience with Slurm, including installation and configuration Experience with Python installation and configuration on Linux systems Deep understanding of Biostatistics and Life Science domain, especially Clinical Basic understanding of SAS Ability to More ❯
software development tooling, such as Gitlab, Artifactory, or Docker. Experience with infrastructure automation and configuration management, such as Ansible and Terraform. Experience with HPC and orchestration technologies, such as Slurm or Kubernetes. Experience with Databases and Observability systems, such as Elasticsearch, Datadog, Prometheus, PostgreSQL. More ❯
4+ years of experience in DevOps, SRE, or platform engineering roles. Experience with software development (Python, Git) Experience with system administration (Bash, Linux, Containerization) Deep knowledge of HPC (e.g. Slurm) or orchestration technologies (e.g. Kubernetes) Excellent written and verbal communication skills. Ability to work well in a fast-paced environment. Nice to have: Experience with other orchestration technologies (Prefect More ❯
social responsibility. SUSE Stack: System is based on SUSE’s transactional Leap Micro, using Salt Stack configuration through Uyuni. High-Performance Computing (HPC): Experience with HPC and tools like SLURM would be advantageous; either as a user or admin. Distributed Workloads: Background in managing distributed systems. Single Pane Solutions: Familiarity with Rancher, Azure Arc, etc. Workflow Orchestration Tools: Knowledge More ❯
learn, PyTorch) Experience building distributed systems with message buses (Kafka, ZeroMQ) and asynchronous I/O Experience with cloud or on-prem orchestration and scheduling frameworks (Kubernetes, HTCondor, SLURM) Benefits Tower's headquarters are in the historic Equitable Building, right in the heart of NYC's Financial District and our impact is global, with over a dozen offices More ❯
London, England, United Kingdom Hybrid / WFH Options
European Bioinformatics Institute | EMBL-EBI
managerial experience Previous experience working as part of a broad collaborative project Familiarity with cloud technologies (Docker, Kubernetes) Experience with high-performance computing environments and job schedulers such as SLURM Web development skills (JavaScript, CSS, Angular, Bootstrap) Other Helpful Information Contract length: 3 years (grant based) Salary: Monthly salary starting at £4,111 after tax but excluding pension and More ❯
GCP Public Cloud Infrastructure Architect (HPC, GKE) role at Derisk360. We’re Hiring: GCP Public Cloud Infrastructure Architect (HPC, GKE … What You Bring: 10+ years in cloud infrastructure design or DevOps roles. Proven expertise in Google Cloud infrastructure, GKE, and HPC architecture. Strong background in batch scheduling, job queuing (Slurm), and distributed storage systems. Proficient in Kubernetes internals, pod autoscaling, node management. Skilled in Infrastructure as Code (Terraform, Deployment Manager). Hands-on experience with Docker, Helm, Istio … Trivy or Aqua. Fluent in English, with excellent communication and problem-solving skills. Certification: Google Professional Cloud Architect (mandatory). Nice to Have: Experience with GPU/TPU workloads, Slurm, Intel MPI/OpenMPI. Exposure to hybrid or multi-cloud setups using Anthos or GCVE. Familiarity with GitOps (ArgoCD, Flux), workload identity, and K8s RBAC. Experience in life More ❯
London, England, United Kingdom Hybrid / WFH Options
Derisk360
What You Bring 10+ years in cloud infrastructure design or DevOps roles. Proven expertise in Google Cloud infrastructure, GKE, and HPC architecture. Strong background in batch scheduling, job queuing (Slurm), and distributed storage systems. Proficient in Kubernetes internals, pod autoscaling, node management. Skilled in Infrastructure as Code (Terraform, Deployment Manager). Hands-on experience with Docker, Helm, Istio … Trivy or Aqua. Fluent in English, with excellent communication and problem-solving skills. Certification: Google Professional Cloud Architect (mandatory). Nice To Have Experience with GPU/TPU workloads, Slurm, Intel MPI/OpenMPI. Exposure to hybrid or multi-cloud setups using Anthos or GCVE. Familiarity with GitOps (ArgoCD, Flux), workload identity, and K8s RBAC. Experience in life More ❯
London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Hounslow, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
currently seeking an HPC Solution Architect based in Hertfordshire or London for an initial 6-month contract. Note: *** INSIDE IR35 *** The candidate should have a strong understanding of HPC (Slurm), including installation and configuration. Main Responsibilities: Contribute to the development and understanding of various architectural levels. Key Skills: Linux, Azure Cloud, HPC, Python, Posit Component (RStudio), GitHub, SAS, Design More ❯
drive business growth. What You Will Do Enhance our CPU, GPU, HPC, and cloud infrastructure Implement upgrades, patching, and system enhancements Provide expertise with technologies such as Linux, CUDA, SLURM, Python etc. Innovate to maintain the highest standards for our technology stack Drive IT solutions that align with our business objectives Research and evaluate new technology solutions Collaborate with More ❯
AWS and GCP). Disaster recovery management leveraging cloud-specific capabilities. Cloud Storage concepts (Block storage/Blob storage). Job scheduling tools such as Airflow, Prefect Scheduler and Slurm (or other HPC scheduler). Designing and maintaining CI/CD pipelines to ensure fast delivery and integration of the platform services. Contact If this sounds like you, or you'd More ❯
London, England, United Kingdom Hybrid / WFH Options
Mistral AI
Your application will be all the more interesting if you also have: • Experience in an AI/ML environment • Experience of high-performance computing (HPC) systems and workload managers (Slurm) • Worked with AI-oriented solutions (Fluidstack, Coreweave, Vast...) Hiring Process • Intro Call (30min) • Tech Culture Interview (30min) • Technical Rounds - System Design Interview (45min) - Deep Dive More ❯
Jupyter, pandas, numpy, sklearn, with ML experience Bachelor's or Master's degree in Computer Science, Mathematics, Statistics, or related STEM field from a top-tier university Good understanding of using Slurm or similar parallel computing tools Benefits & Incentives: Significant salary + bonus + benefits Dynamic, fast-paced environment; excellent career growth opportunities Collaborative culture and an energetic, dynamic engineering atmosphere More ❯
programming skills in high-level languages such as Python, Julia. Proficient in modern data science tools stacks (Jupyter, pandas, numpy, sklearn) with machine learning experience. Good understanding of using Slurm or similar parallel computing tools. Bachelor's or Master's degree in Computer Science, Mathematics, Statistics, or related STEM field from a top-ranked University. Proficient in quantitative analysis More ❯
Software Development Engineer, AWS Parallel Computing Service, Slurm team The Parallel Computing Service (PCS) team at AWS is seeking a Software Development Engineer to join the core Slurm team. The role involves building and shipping services that focus on advancing PCS capabilities to run and scale high-performance computing (HPC) workloads using the open-source Slurm scheduler. … experiences are critical to customer success. If you are passionate about High Performance Computing, want to join a collaborative and fast-paced environment, and contribute to the future of Slurm and computation on AWS, we encourage you to apply and be part of our talented team at PCS. Key job responsibilities The ideal candidate has thrived and succeeded in … in a broad range of design approaches and know when it is appropriate to use them (and when it is not). Your solutions are pragmatic. Collaborate with the Slurm maintainers and open-source community to drive improvements and ensure alignment with industry best practices. Provide mentorship and knowledge sharing within the team to facilitate a collaborative and learning More ❯
London, England, United Kingdom Hybrid / WFH Options
Genomics England
both on-premises and AWS. About the Tech Stack Our HPC clusters are built in our on-premises data centres and in AWS. We currently use IBM LSF for our workload management. Hardware-wise we have a large footprint of FPGA Servers (DRAGEN) both on-premises and in AWS, as well as standard HPC Compute nodes both on-premises … certifications, we are primarily interested in your real-world experience. Essential Skills and Experience: Extensive knowledge and understanding of HPC Technologies – Including but not limited to IBM LSF, Nextflow, Slurm, AWS Batch. Experience working within an On-Premises estate and working to build/design platforms on-premises considering physical networking, Bare Metal Servers, and Hardware Lifecycles. Strong Experience More ❯
London, England, United Kingdom Hybrid / WFH Options
TieTalent
both on-premises and AWS. About the Tech Stack Our HPC clusters are built in our on-premises data centres and in AWS. We currently use IBM LSF for our workload management. Hardware-wise we have a large footprint of FPGA Servers (DRAGEN) both on-premises and in AWS, as well as standard HPC Compute nodes both on-premises … certifications, we are primarily interested in your real-world experience. Essential Skills And Experience Extensive knowledge and understanding of HPC Technologies – Including but not limited to IBM LSF, Nextflow, Slurm, AWS Batch. Experience working within an On-Premises estate and working to build/design platforms on-premises considering physical networking, Bare Metal Servers, and Hardware Lifecycles. Strong Experience More ❯
London, England, United Kingdom Hybrid / WFH Options
Genomics England
both on-premises and AWS. About The Tech Stack Our HPC clusters are built in our on-premises data centres and in AWS. We currently use IBM LSF for our workload management. Hardware-wise we have a large footprint of FPGA Servers (DRAGEN) both on-premises and in AWS, as well as standard HPC Compute nodes both on-premises … certifications, we are primarily interested in your real-world experience. Essential Skills and Experience: Extensive knowledge and understanding of HPC Technologies – Including but not limited to IBM LSF, Nextflow, Slurm, AWS Batch. Experience working within an On-Premises estate and working to build/design platforms on-premises considering physical networking, Bare Metal Servers, and Hardware Lifecycles. Strong Experience More ❯
Experience with Gitlab, Bitbucket, and CI tools like GitHub or Bamboo. Willingness to engage in technical discussions and produce high-quality code. Enthusiasm to learn and grow. Knowledge of Slurm and HPC is a bonus. The role involves developing in Python within an SRE team, impacting a greenfield set of services that will enhance a leading European trading platform. More ❯
If You Have Extremely strong software engineering skills. Proficiency in Python and related ML frameworks such as JAX, PyTorch and XLA/MLIR. Experience with distributed training infrastructures (Kubernetes, Slurm) and associated frameworks (Ray). Experience using large-scale distributed training strategies. Hands-on experience training large models at scale. Hands-on experience with the post-training phase … More ❯
the future of healthcare today. This company is on the hunt for HPC Engineers to power their 25 Petabyte system. Sound good? Well, there's more! Imagine working with Slurm clusters and GPFS storage, all while being an integral part of groundbreaking translational research. You will work in a dynamic team of five, where your hands-on expertise will support More ❯