role. Remote, £550, Inside IR35, 6-month contract. Key skills needed:
- Designing/implementing Unix/Linux systems and services, open-source solutions, and performance tuning.
- HPC technologies: Lustre, Slurm.
- Configuration systems such as Ansible and Terraform.
- Unix/Linux scripting.
- Networking: TCP/IP, DHCP, VLANs, spanning tree protocol, link aggregation for performance (MTU settings) and reliability requirements.
Oxford District, South East England, United Kingdom
Ellison Institute of Technology
Computing Facility, the HPC Engineer will design, deploy, and optimise systems that enable large-scale data processing, AI-driven analytics, and simulation workloads across. For example, deploying Kubernetes and Slurm to enable real-time data analysis from instruments, MLOps, or scientific workflow managers. We will be hiring at either the regular or senior level, depending on the applicant's … computational research workloads. Evaluate and integrate advanced technologies including GPU/TPU acceleration, high-speed interconnects, and parallel file systems. Manage HPC environments, including Linux-based clusters, schedulers (e.g., Slurm), and high-performance storage systems (e.g., Lustre, BeeGFS, GPFS). Implement robust monitoring, fault-tolerance, and capacity management for high availability and reliability. Develop automation scripts and tools (Python … or cloud computing) in scientific or research settings. Proficiency in Linux system administration, networking, and parallel computing (MPI, OpenMP, CUDA, or ROCm). Experience using HPC job schedulers (Slurm preferred) and parallel file systems (Lustre, BeeGFS, GPFS). At the senior level: Extensive experience designing, deploying, and managing HPC clusters (or cloud computing) in scientific or research settings.
Stevenage, Hertfordshire, South East, United Kingdom
Anson Mccade
scripting, particularly Bash, Python, and at least one other language. Clustering: Experience with clustered environments and cluster orchestration tools. Storage: Experience with clustered, parallel file systems (e.g., Lustre). Workload Management: Experience managing batch scheduling systems (PBS Pro, Slurm, SGE/UGE, etc.). HPC Knowledge: Knowledge of HPC management systems (e.g., Bright). Networking/Storage Admin