Role: Remote. £550/day, Inside IR35, 6-month contract.
Key skills needed:
- Designing and implementing Unix/Linux systems and services, open-source solutions, and performance tuning
- HPC technologies: Lustre, Slurm
- Configuration-management systems such as Ansible and Terraform
- Unix/Linux scripting
- Networking: TCP/IP, DHCP, VLANs, Spanning Tree Protocol, and link aggregation for performance (MTU settings) and reliability requirements
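The Slurm and scripting skills this listing asks for often come together in small automation helpers. As a hedged sketch (the job name, task count, walltime, and benchmark command below are illustrative assumptions, not from the listing), a Python function might render a minimal Slurm batch script:

```python
def build_sbatch(job_name: str, ntasks: int, walltime: str, command: str) -> str:
    """Render a minimal Slurm batch script as a string.

    Only a few common #SBATCH directives are shown; a real site would
    add partition, account, memory, and GPU directives per local policy.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={walltime}",
        command,
    ]
    return "\n".join(lines) + "\n"

# Hypothetical usage: an 8-task, one-hour benchmark job.
script = build_sbatch("lustre-bench", 8, "01:00:00", "srun ior -a POSIX")
print(script)
```

In practice the returned text would be written to a file and submitted with `sbatch`; generating it from a function keeps job parameters reviewable in version control.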
Oxford District, South East England, United Kingdom
Ellison Institute of Technology
Computing Facility, the HPC Engineer will design, deploy, and optimise systems that enable large-scale data processing, AI-driven analytics, and simulation workloads. For example: deploying Kubernetes and Slurm to enable real-time data analysis from instruments, MLOps, or scientific workflow managers. We will be hiring at either the regular or senior level, depending on the applicant's … computational research workloads. Evaluate and integrate advanced technologies including GPU/TPU acceleration, high-speed interconnects, and parallel file systems. Manage HPC environments, including Linux-based clusters, schedulers (e.g. Slurm), and high-performance storage systems (e.g. Lustre, BeeGFS, GPFS). Implement robust monitoring, fault tolerance, and capacity management for high availability and reliability. Develop automation scripts and tools (Python … or cloud computing) in scientific or research settings. Proficiency in Linux system administration, networking, and parallel computing (MPI, OpenMP, CUDA, or ROCm). Experience using HPC job schedulers (Slurm preferred) and parallel file systems (Lustre, BeeGFS, GPFS). At the senior level: extensive experience designing, deploying, and managing HPC clusters (or cloud computing) in scientific or research settings.
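The monitoring and automation duties described above typically start with reading scheduler state. A minimal sketch, assuming output in the shape produced by `sinfo -h -o "%n %t"` (the sample node names and states below are invented for illustration; on a real cluster the text would come from running sinfo via subprocess):

```python
from collections import Counter

def summarise_node_states(sinfo_output: str) -> Counter:
    """Count nodes per Slurm state from two-column "node state" text."""
    states = Counter()
    for line in sinfo_output.strip().splitlines():
        _node, state = line.split()
        states[state] += 1
    return states

# Inlined sample standing in for live `sinfo` output.
sample = """\
node001 idle
node002 alloc
node003 alloc
node004 drain
"""
print(dict(summarise_node_states(sample)))
```

A summary like this is the kind of signal a capacity-management or alerting script would act on, e.g. flagging a rising count of drained nodes.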
startup environment. Your application will be all the more interesting if you also have: experience in an AI/ML environment; experience with high-performance computing (HPC) systems and workload managers (Slurm); experience with modern AI-oriented solutions (Fluidstack, CoreWeave, Vast). Location & Remote: this role is primarily based at one of our European offices (Paris, France and London …
customers, including the network infrastructure, security, server, storage, end-user compute and device management. Role Overview: The UNIX Systems Specialist reports to the Unix Systems Group lead (Infrastructure Systems Manager, UNIX) and is responsible for design, management and support in the Linux System Administration team, managing the day-to-day running of the UKAEA Linux-based IT systems, HPC …/BPSS level minimum). Desirable:
- Experience of managing Linux systems at scale
- Experience managing IT projects
- Experience setting up and supporting batch queueing systems (e.g. Slurm)
- Experience setting up and supporting NVIDIA GPU systems
- Ability to write well-documented code in a high-level language or script (Python/Perl)
- Experience in …
South West London, London, United Kingdom Hybrid/Remote Options
Client Server
a hands-on role at a global systematic trading firm with $25 billion under management, earning significant bonuses. As a Senior Platform Engineer you'll develop and support scalable workload-scheduling solutions for HPC environments using tools such as YellowDog within a large-scale computing environment with both on-premise and cloud (AWS) based services. You'll collaborate with … with flexibility to work from home 1-2 days a week. About you: You have experience engineering and supporting at least one HPC scheduler, such as YellowDog, Ray, Slurm or IBM Symphony. You have deep knowledge of Linux. You have a good understanding of both loosely coupled and tightly coupled HPC workloads and experience of working on …
will play a pivotal role in shaping the next generation of our compute ecosystem. Key Responsibilities: Architect and evolve DRW's distributed compute platforms to support current and emerging workload types across trading, research, and analytics. Develop abstractions, APIs, and developer libraries that simplify job orchestration, data movement, and distributed execution for end users. Collaborate with system and infrastructure … and translate them into scalable, user-friendly solutions. Drive modernization initiatives around containerization, orchestration, and hybrid compute strategies to enable future workloads. Establish best practices and governance for distributed workload design, reproducibility, and reliability. Mentor engineers and guide teams in distributed systems architecture, performance tuning, and platform design. Champion user-experience improvements, making the platform intuitive for all technical … performance computing systems at scale. Strong programming skills in Python, C++, or similar languages, with a focus on building APIs or libraries that abstract infrastructure complexity. Deep knowledge of workload schedulers and orchestration platforms (e.g. Slurm, LSF, HTCondor). Experience driving cross-functional projects that span infrastructure and software development teams. Proven ability to enhance developer experience and …
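The "APIs or libraries that abstract infrastructure complexity" point above usually means hiding the scheduler behind a common interface. A minimal sketch, with all class and method names being illustrative assumptions (a real Slurm or HTCondor backend would shell out to `sbatch`/`condor_submit` or call a REST API where the toy backend stores the command):

```python
from abc import ABC, abstractmethod
import itertools

class Scheduler(ABC):
    """Common submission interface so user code never names a backend."""

    @abstractmethod
    def submit(self, command: str) -> int:
        """Submit a command; return a backend job id."""

class LocalScheduler(Scheduler):
    """Toy in-process backend, useful for tests and local development."""

    _ids = itertools.count(1)  # shared monotonic job-id source

    def __init__(self) -> None:
        self.jobs: dict[int, str] = {}

    def submit(self, command: str) -> int:
        job_id = next(self._ids)
        self.jobs[job_id] = command
        return job_id

# Hypothetical usage: user code depends only on Scheduler.submit().
sched = LocalScheduler()
jid = sched.submit("python train.py")
print(jid, sched.jobs[jid])
```

Swapping `LocalScheduler` for a Slurm-backed implementation then changes no user code, which is the reproducibility and portability benefit the listing alludes to.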
a related field
2. Proven industry experience in building, deploying, and maintaining Linux servers (Red Hat/Rocky Linux)
3. A working knowledge of and practical experience with batch queuing systems (Slurm) and cloud computing, particularly AWS
Key words: Linux Systems Administrator/Scientific Computing/Red Hat/Rocky Linux/Slurm/AWS/Oracle DBA/IT Security