AI Infrastructure Engineer

Key Responsibilities:

  • Collaborative Engineering: Work within a larger team to rapidly develop proof-of-concept prototypes that validate research ideas, and integrate them into production systems and infrastructure.
  • Performance Analysis: Conduct in-depth profiling and tuning of operating systems and large-scale distributed systems, leveraging heterogeneous hardware (CPU, NPU).
  • Documentation and Reporting: Maintain clear technical documentation of research findings, design decisions, and implementation details to ensure reproducibility and facilitate knowledge transfer within the team.
  • Research & Technology Exploration: Stay current with the latest advancements in AI infrastructure, cloud-native technologies, and operating systems. Examples include techniques for executing inference workloads efficiently through SW/HW co-design, and exploiting workload characteristics to prefetch memory or minimize communication.
  • Stakeholder Communication: Present project milestones, performance metrics, and key findings to internal stakeholders.

Person Specification:

Required:

  • Bachelor's or Master's degree in Computer Science or a related technical field.
  • A solid background in operating systems and/or distributed systems and/or ML systems.
  • Excellent programming skills, with mastery of at least one language such as C or C++.
  • Good communication and teamwork skills.
  • Comfort with research methodology.

Desired:

  • Familiarity with current LLM architectures (e.g. Llama 3, DeepSeek-V3).
  • Familiarity with production LLM serving systems and inference optimizations (e.g. vLLM).
  • Experience with accelerator programming (e.g. CUDA, Triton) and communication libraries (e.g. NCCL).

Company: Project People
Location: Edinburgh, City of Edinburgh, United Kingdom
Employment Type: Contract
Salary: £40,000 - £50,000 per annum
Posted