role Remote £550 Inside IR35 6 Months contract
Key Skills needed
- Design/implementing Unix/Linux system and services open-source solutions and performance tuning.
- HPC technologies: Lustre, Slurm
- Configuration systems such as Ansible and Terraform
- Unix/Linux scripting.
- Networking: TCP/IP, DHCP, VLANs, spanning tree protocol, link aggregation for performance (MTU settings) and reliability requirements.
of shared compute environments. Past experience with diagnosing and resolving network/storage/CPU/RAM bottlenecks across complex workloads. Experience deploying and managing a grid compute system (Slurm/LSF/SGE). Proficiency with containerisation frameworks (Docker/Singularity).
Oxford District, South East England, United Kingdom
Ellison Institute of Technology
Computing Facility, the HPC Engineer will design, deploy, and optimise systems that enable large-scale data processing, AI-driven analytics, and simulation workloads across. For example, deploying Kubernetes and Slurm to enable real-time data analysis from instruments, MLOps, or scientific workflow managers. We will be hiring either at the regular or senior level, depending on the applicant's … computational research workloads. Evaluate and integrate advanced technologies including GPU/TPU acceleration, high-speed interconnects, and parallel file systems. Manage HPC environments, including Linux-based clusters, schedulers (e.g., Slurm), and high-performance storage systems (e.g., Lustre, BeeGFS, GPFS). Implement robust monitoring, fault-tolerance, and capacity management for high availability and reliability. Develop automation scripts and tools (Python … or cloud computing) in scientific or research settings. Proficiency in Linux system administration, networking, and parallel computing (MPI, OpenMP, CUDA, or ROCm). Experience with using HPC job schedulers (Slurm preferred) and parallel file systems (Lustre, BeeGFS, GPFS). At the senior level: Extensive experience designing, deploying, and managing HPC clusters (or cloud computing) in scientific or research settings.
customers, including the network infrastructure, security, server, storage, end user compute and device management. Role Overview: The UNIX Systems Specialist reports to the Unix Systems Group lead, Infrastructure Systems Manager (UNIX), and is responsible for design, management and support in the Linux System Administration team, managing the day-to-day running of the UKAEA Linux-based IT Systems, HPC …/BPSS level minimum).
Desirable
o Experience of managing Linux systems at scale.
o Experience managing IT projects.
o Experience setting up and supporting batch queueing systems (e.g. Slurm)
o Experience setting up and supporting NVIDIA GPU systems
o Ability to write well documented code in a high-level language or script (Python/Perl)
o Experience in …
South West London, London, United Kingdom Hybrid/Remote Options
Client Server
a hands-on role at a global systematic trading firm with $25 billion under management, earning significant bonuses. As a Senior Platform Engineer you'll develop and support scalable workload scheduling solutions for HPC environments using tools such as YellowDog within a large-scale computing environment with both on-premise and cloud (AWS) based services. You'll collaborate with … with flexibility to work from home 1-2 days a week.
About you:
- You have experience of engineering and supporting at least one HPC scheduler, such as YellowDog, Ray, Slurm or IBM Symphony
- You have a deep knowledge of Linux
- You have a good understanding of both loosely coupled and tightly coupled HPC workloads and experience of working on …
a related field
2. Proven industry experience in building, deploying, and maintaining Linux servers (Red Hat/Rocky Linux)
3. A working knowledge and practical experience with batch queuing systems (Slurm) and cloud computing, particularly AWS
Key Words: Linux Systems Administrator/Scientific Computing/Red Hat/Rocky Linux/Slurm/AWS/Oracle DBA/IT Security