|
5 of 5 Remote Slurm Workload Manager Jobs in the East of England
Hemel Hempstead, England, United Kingdom Hybrid / WFH Options JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Watford, England, United Kingdom Hybrid / WFH Options JR United Kingdom
on-call rotations to support high-priority incidents and escalations. About You Skills & Experience Proven experience supporting HPC and/or AI workloads in production environments. Strong expertise with Slurm workload manager, including tuning and troubleshooting. Proficiency with system-level debugging, including kernel modules and network interfaces. Experience with GPU compute platforms (NVIDIA and/or AMD … settings. Comfort operating in fast-paced, ambiguous, high-growth environments. Nice to have Experience with OpenStack and troubleshooting infrastructure in cloud environments. Kubernetes expertise, particularly in HPC or AI workload contexts. Familiarity with distributed file systems and advanced storage configurations. Understanding of GPU virtualization and multi-tenant HPC architecture. Exposure to machine learning frameworks and AI optimization workflows. Scripting More ❯
Cambridge, England, United Kingdom Hybrid / WFH Options Arm
to ensure high availability and observability. Tune platform performance under high-throughput workloads and lead capacity planning. Automate and execute stress/load tests using both synthetic and real workload profiles. Mentor junior engineers and contribute to architectural direction and code quality. Collaborate cross-functionally with infrastructure, tools, and FinOps teams to ensure the platform remains efficient and cost … and production support practices. Proven experience monitoring production systems, designing actionable alerts, and improving reliability through observability (metrics, logs, tracing). “Nice To Have” Skills And Experience Experience with workload orchestration or job scheduling systems (e.g., AWS Batch, Slurm, LSF). Exposure to compute-intensive domains such as EDA, HPC, or large-scale simulation. Knowledge of FinOps practices More ❯
Cambridge, England, United Kingdom Hybrid / WFH Options ONE NUCLEUS
managerial experience - Previous experience working as part of a broad collaborative project - Familiarity with cloud technologies (Docker, Kubernetes) - Experience with high-performance computing environments and job schedulers such as SLURM Apply now! Benefits and Contract Information - Financial incentives: depending on circumstances, monthly family/marriage allowance of £272 monthly child allowance of £328 per child. Non resident allowance up More ❯
Cambridge, England, United Kingdom Hybrid / WFH Options Arm Limited
Experience working in Linux-based environments, particularly in performance-sensitive contexts General experience working in compute or storage-heavy environments Exposure to basic job scheduling systems (e.g., LSF, Jenkins, SLURM) Familiarity with monitoring tools like Prometheus, Grafana, or Linux-based telemetry Familiarity with profiling tools Ability to troubleshoot issues related to CPU, memory, I/O, or network performance More ❯
|
|