AI Systems Research Engineer

Permanent

Edinburgh (on-site, 5 days per week)

Salary: Competitive and negotiable

In an era where Large Language Models (LLMs) are rebuilding the foundational software stack, our client is at the forefront of reshaping how large-scale models are trained, served, and deployed. Operating at the intersection of advanced systems research and industrial-scale engineering, their Edinburgh-based team is driving new AI Infrastructure & Agentic Serving architectures.

This role is a unique opportunity to help define next-generation large-scale data centres and AI infrastructure systems, turning innovative system designs into deployable, real-world technologies.

We are seeking Systems Research Engineers with a deep passion for computer systems, distributed AI infrastructure, and performance optimization. These roles are ideal for recent PhD graduates or exceptional BSc/MSc engineers looking to build research-driven experience in operating systems, distributed systems, AI model serving, and machine learning infrastructure.

You will work closely with architects to prototype and optimize the next generation of global AI clusters.

What you will be doing

  • Distributed Systems Research & Development: Architect, implement, and evaluate distributed system components for emerging AI and data-centric workloads. Drive modular design and scalability across GPU and NPU clusters, building highly efficient serving and scheduling systems.
  • Performance Optimization & Profiling: Conduct in-depth profiling and performance tuning of large-scale inference and data pipelines, focusing on KV cache management, heterogeneous memory scheduling, and high-throughput inference serving using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed.
  • Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving across distributed environments. Research and prototype new techniques for cache sharing, data locality, and resource orchestration and scheduling within AI clusters.
  • Research & Publications: Translate innovative research ideas into publishable contributions at leading venues (e.g., OSDI, NSDI, EuroSys, SoCC, MLSys, NeurIPS, ICML, ICLR) while driving internal adoption of novel methods and architectures.
  • Cross-Team Collaboration: Communicate technical insights, research progress, and evaluation outcomes effectively to multidisciplinary stakeholders and global research teams.

What we are looking for

  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field, or a PhD in systems, distributed computing, or large-scale AI infrastructure.
  • Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI infrastructure.
  • Hands-on experience with LLM inference and serving frameworks (e.g., vLLM, Ray Serve, TensorRT-LLM, TGI, PyTorch) and distributed KV cache optimization.
  • Familiarity with GPU architectures and how they execute LLM workloads.
  • Solid grounding in systems research methodology, distributed algorithms, and profiling tools.
  • Proficiency in C/C++, with additional experience in Python for research prototyping.
  • Team-oriented mindset with effective technical communication skills.

If this sounds like a role you can take hold of, we would love to hear from you! To apply, please send your CV to Maggie Kwong:

maggie.kwong@projectpeople.com

Great journeys start here, apply now!

Job Details

Company: Project People
Location: United Kingdom
Posted