London, England, United Kingdom Hybrid / WFH Options
PhysicsX Ltd
for computer vision, geometry processing, or scientific computing; software engineering concepts and best practices (e.g., versioning, testing, CI/CD, API design, MLOps); containerization and orchestration (Docker, Kubernetes, Slurm); writing pipelines and experiment environments, including running experiments in pipelines in a systematic way. What we offer Be part of something larger: Make an impact and meaningfully shape an More ❯
ensure resolution. Essential Qualifications A bachelor’s degree or master’s degree in Computer Science or related field. 5+ years of experience administering HPC clusters and systems. Experience with SLURM and Grid Engine scheduling software. 5+ years of professional experience in Solution Architecture or Cloud Infrastructure Deployment and support. 7+ years professional experience developing or administering compute solutions for … working with cross-functional IT (Public Cloud skills being a plus) and sciences skillsets. Experience with Python, R, or other related data science programming. Experience with POSIT products (Package Manager, Connect, Workbench) either in an end-user or administrator capacity. Experience working with databases and/or supporting. Experience managing large amounts of data effectively. Experience working with AI …/ML technologies. Experience with containerizing compute workload via Docker or Singularity. Experience with Nvidia DGX systems. Additional information Great talent should benefit from a great work environment. If you join our team, you’ll have access to: A competitive salary and bonus package based on experience. Comprehensive health and wellness benefits, including Medical, Dental, and Vision Insurance. Company More ❯
Farnborough, Hampshire, United Kingdom Hybrid / WFH Options
Lenovo
Analyse and characterise scientific codes and build performance extrapolation to future generation of HPC/AI hardware. Interact with customers and the Lenovo sales team to offer insight into workload performance characteristics that drive system configurations. Complete competitive comparison studies of different technologies to showcase Intel technology advantages. Develop seller enablement collateral for Lenovo Sellers and Business Partners, and … or accelerated applications using more than one of OpenMP, MPI, CUDA, ROCm, OpenCL, SYCL paradigms. Experience of production HPC environment: large-scale filesystems (ideally Storage Scale), batch scheduling (ideally SLURM) as well as common HPC SW and management tools. Experience with analysis and profiling tools for HPC/AI codes: Intel oneAPI suite (VTune), AMD (uProf), NVIDIA toolkit. HPC More ❯
for both front-end and back-end components to ensure best practices are followed across the development process. Manage high-performance computing (HPC) setups, such as AWS ParallelCluster or Slurm, to support large-scale data processing tasks. Promote the use of serverless principles and microservice patterns within the development team. Required Qualifications: Experience with Cloud Native and Cloud AI More ❯
London, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
days per week. This is a 6-month temporary contract, to start ASAP. Day rate: Competitive market rate. The right candidate should have a strong understanding of HPC (Slurm), including installation and configuration. Key Requirements: Strong understanding of Infrastructure (Azure, On-premises and other cloud techs) Knowledge of Cloud Platforms: Understanding of cloud platforms and Azure, in particular … R: The candidate should have a good understanding of R HPC Skills: The candidate should have a strong understanding of HPC (Slurm), including installation and configuration Experience with Python: The candidate should have experience with Python installation and configuration on Linux systems Associates should have a deep understanding of Biostatistics and Life Science domain (especially Clinical) knowledge Basic More ❯
London, England, United Kingdom Hybrid / WFH Options
Project Recruit
market rate. Key Requirements: Strong understanding of Infrastructure (Azure, On-premises, and other cloud technologies) Knowledge of Cloud Platforms, especially Azure Proficiency in R programming HPC Skills: experience with Slurm, including installation and configuration Experience with Python installation and configuration on Linux systems Deep understanding of Biostatistics and Life Science domain, especially Clinical Basic understanding of SAS Ability to More ❯
software development tooling, such as Gitlab, Artifactory, or Docker. Experience with infrastructure automation and configuration management, such as Ansible and Terraform. Experience with HPC and orchestration technologies, such as Slurm or Kubernetes. Experience with Databases and Observability systems, such as Elasticsearch, Datadog, Prometheus, PostgreSQL. More ❯
4+ years of experience in DevOps, SRE, or platform engineering roles. Experience with software development (Python, Git) Experience with system administration (Bash, Linux, Containerization) Deep knowledge of HPC (e.g. Slurm) or orchestration technologies (e.g. Kubernetes) Excellent written and verbal communication skills. Ability to work well in a fast-paced environment. Nice to have: Experience with other orchestration technologies (Prefect More ❯
social responsibility. SUSE Stack: System is based on SUSE’s transactional Leap Micro, using Salt Stack configuration through Uyuni. High-Performance Computing (HPC): Experience with HPC and tools like SLURM would be advantageous; either as a user or admin. Distributed Workloads: Background in managing distributed systems. Single Pane Solutions: Familiarity with Rancher, Azure Arc, etc. Workflow Orchestration Tools: Knowledge More ❯
learn, PyTorch) Experience building distributed systems with message buses (Kafka, ZeroMQ) and asynchronous I/O Experience with cloud or on-prem orchestration and scheduling frameworks (Kubernetes, HTCondor, SLURM) Benefits Tower's headquarters are in the historic Equitable Building, right in the heart of NYC's Financial District and our impact is global, with over a dozen offices More ❯
London, England, United Kingdom Hybrid / WFH Options
European Bioinformatics Institute | EMBL-EBI
managerial experience Previous experience working as part of a broad collaborative project Familiarity with cloud technologies (Docker, Kubernetes) Experience with high-performance computing environments and job schedulers such as SLURM Web development skills (JavaScript, CSS, Angular, Bootstrap) Other Helpful Information Contract length: 3 years (grant based) Salary: Monthly salary starting at £4,111 after tax but excluding pension and More ❯
GCP Public Cloud Infrastructure Architect (HPC, GKE) role at Derisk360. We’re Hiring: GCP Public Cloud Infrastructure Architect (HPC, GKE … What You Bring: 10+ years in cloud infrastructure design or DevOps roles. Proven expertise in Google Cloud infrastructure, GKE, and HPC architecture. Strong background in batch scheduling, job queuing (Slurm), and distributed storage systems. Proficient in Kubernetes internals, pod autoscaling, node management. Skilled in Infrastructure as Code (Terraform, Deployment Manager). Hands-on experience with Docker, Helm, Istio … Trivy or Aqua. Fluent in English, with excellent communication and problem-solving skills. Certification: Google Professional Cloud Architect (mandatory). Nice to Have: Experience with GPU/TPU workloads, Slurm, Intel MPI/OpenMPI. Exposure to hybrid or multi-cloud setups using Anthos or GCVE. Familiarity with GitOps (ArgoCD, Flux), workload identity, and K8s RBAC. Experience in life More ❯
London, England, United Kingdom Hybrid / WFH Options
Derisk360
What You Bring 10+ years in cloud infrastructure design or DevOps roles. Proven expertise in Google Cloud infrastructure, GKE, and HPC architecture. Strong background in batch scheduling, job queuing (Slurm), and distributed storage systems. Proficient in Kubernetes internals, pod autoscaling, node management. Skilled in Infrastructure as Code (Terraform, Deployment Manager). Hands-on experience with Docker, Helm, Istio … Trivy or Aqua. Fluent in English, with excellent communication and problem-solving skills. Certification: Google Professional Cloud Architect (mandatory). Nice To Have Experience with GPU/TPU workloads, Slurm, Intel MPI/OpenMPI. Exposure to hybrid or multi-cloud setups using Anthos or GCVE. Familiarity with GitOps (ArgoCD, Flux), workload identity, and K8s RBAC. Experience in life More ❯
Bath, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Woking, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Aberdeen, Scotland, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Liverpool, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Brighton, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Glasgow, Scotland, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Cheltenham, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Reading, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
High Wycombe, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Bournemouth, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Portsmouth, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯