Lead Engineer - Software & HPC Engineering

Lead / Senior HPC Engineer
Location: On-site (due to secure, air-gapped systems)
Full-time | 40 hours per week

Are you ready to play a key role in one of the most ambitious technological challenges of our time?

We are a pioneering UK-based deep-tech company developing next-generation solutions at the cutting edge of advanced physics, simulation, and machine learning. Our work is focused on unlocking scalable, clean energy through breakthrough approaches, supported by world-class computational capabilities and innovative engineering.

Alongside our core mission, we collaborate with leading organisations across advanced industries, applying our proprietary simulation tools and technologies to solve complex, high-impact challenges.

This is a rare opportunity to join a highly skilled, mission-driven team working at the forefront of science and engineering innovation.



The Role

We're seeking a Lead HPC Engineer - or an experienced Senior HPC Engineer ready to step up - to take ownership of a large-scale, high-performance computing environment.

You'll support and evolve an HPC cluster of over 10,000 cores, ensuring reliability, performance, and scalability for workloads ranging from single high-precision runs to thousands of parallel simulations.

Working within the Software & HPC Engineering team, you'll collaborate closely with computational scientists, data engineers, and IT specialists to deliver a robust platform that underpins cutting-edge research and development.



Key Responsibilities

  • Maintain and optimise HPC hardware, working with external vendors where required
  • Manage core system software and ensure platform stability
  • Monitor performance, troubleshoot issues, and drive continuous improvements
  • Oversee backups of critical data and system configurations
  • Schedule and perform maintenance aligned with user activity
  • Profile workloads and enhance system efficiency
  • Communicate system status, updates, and major issues to stakeholders
  • Capture user requirements and contribute to upgrade and capacity planning
  • Support procurement processes and vendor negotiations
  • Produce clear documentation for both technical teams and end users
  • Collaborate across engineering and IT teams on shared infrastructure


Current Environment

You'll be working with a modern HPC stack, including:

  • Large-scale multi-vendor server infrastructure (AMD EPYC, Intel Xeon)
  • High-speed networking (100 Gb/s LAN) and high-performance storage systems
  • Linux-based environments (AlmaLinux, Ubuntu)
  • Distributed file systems (Lustre, GlusterFS, NFS)
  • HPC tooling including Slurm, Ansible, and monitoring frameworks
  • Development ecosystems supporting C++, Fortran, MPI, and Python


About You

Essential:

  • Degree in Computer Science (or equivalent experience)
  • Strong expertise in Linux, HPC systems, storage, and networking
  • Experience with MPI and scientific computing environments (C++, Fortran)
  • Familiarity with job schedulers and workload management systems
  • Scripting skills (Shell, Python) and version control (Git)
  • Ability to design, implement, and support complex HPC systems
  • Strong analytical thinking and problem-solving skills
  • Excellent communication and collaboration abilities

Desirable:

  • Deep expertise in HPC optimisation and performance profiling
  • Experience with configuration management tools (e.g. Ansible)
  • Knowledge of containerisation (e.g. Singularity, Apptainer)
  • Experience working with secure or air-gapped environments
  • Familiarity with HPC accounting systems and SQL databases
  • Experience supporting and training end users

Rullion celebrates and supports diversity and is committed to ensuring equal opportunities for both employees and applicants.

Job Details

Company: Rullion Engineering Cumbria
Location: Oxford, Oxfordshire, United Kingdom
Employment Type: Permanent