Systems Engineer

Systems Research Engineer (AI Infrastructure) - Edinburgh - Leading Global Firm

As large language models reshape the foundational software stack, AI-native infrastructure is redefining how these models are trained, served, and deployed at scale. We are driving innovation in AI infrastructure and agent-oriented serving architectures to shape the next generation of large-scale data centres and distributed AI systems.

Key Responsibilities

  • Distributed Systems R&D: Architect and implement distributed system components for AI workloads across heterogeneous clusters (CPU, GPU, accelerators).
  • Performance Optimisation: Conduct deep profiling and performance tuning of large-scale inference pipelines, focusing on KV cache management and memory scheduling.
  • Scalable Serving Infrastructure: Develop frameworks for multi-tenant, low-latency, and fault-tolerant AI serving, researching techniques for cache sharing and data locality.
  • Research & Publications: Translate novel designs into publishable contributions for leading systems and ML venues (e.g., OSDI, SOSP, EuroSys, NeurIPS).
  • Cross-Team Collaboration: Communicate technical insights to multidisciplinary teams and align on long-term infrastructure strategy.

Qualifications

  • Education: BSc, MSc, or PhD in Computer Science, Electrical Engineering, or a related field.
  • Systems Expertise: Strong knowledge of Operating Systems, Distributed Systems, and AI inference serving.
  • Technical Stack: Proficiency in C/C++ for systems development and Python for research prototyping.
  • Hands-on Experience: Experience with LLM serving frameworks, distributed cache optimisation, and performance profiling tools.
  • Preferred: A track record of publications in top-tier systems or ML conferences and practical experience in load balancing or resource scheduling for inference clusters.

Job Details

Company: European Tech Recruit
Location: Edinburgh, Scotland, United Kingdom
Posted