2 of 2 vLLM Jobs in Scotland

AI Systems Research Engineer

Hiring Organisation
microTECH Global LTD
Location
Edinburgh, Scotland, United Kingdom
Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI infrastructure. · Hands-on experience with LLM serving frameworks (e.g., vLLM, Ray Serve, TensorRT-LLM, TGI) and distributed KV cache optimisation. · Proficiency in C/C++, with additional experience in Python for research prototyping. · Solid grounding ...
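A back-of-envelope KV cache sizing calculation is a common starting point for the KV cache optimisation work this role describes. The sketch below is a hypothetical illustration (the helper is not part of any framework's API); the model dimensions used are the published Llama-2-7B configuration.

```python
def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, dtype_bytes: int = 2) -> int:
    """Bytes one token's KV cache occupies: a K and a V tensor per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

# Llama-2-7B: 32 layers, 32 KV heads (no GQA), head_dim 128, fp16 (2 bytes).
per_token = kv_cache_bytes_per_token(32, 32, 128, 2)
print(per_token)                    # 524288 bytes, i.e. 0.5 MiB per token
print(per_token * 4096 / 2**30)    # a full 4096-token context: 2.0 GiB
```

Numbers like these are why serving frameworks treat KV cache memory, not compute, as the bottleneck for batch size and context length.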

AI Systems Research Engineer - LLM Optimisation

Hiring Organisation
Project People
Location
City of Edinburgh, Scotland, United Kingdom
tuning of large-scale inference and data pipelines, focusing on KV cache management, heterogeneous memory scheduling, and high-throughput inference serving using frameworks such as vLLM, Ray Serve, and modern PyTorch Distributed. Scalable Model Serving Infrastructure: develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant … systems, distributed computing, or large-scale AI infrastructure are also welcome. At least 2 years of experience with LLM inference/serving framework optimisation (vLLM/Ray Serve/TensorRT-LLM/PyTorch). Hands-on experience with distributed KV cache optimisation. Familiarity with GPUs and how they execute LLMs. Strong ...
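The KV cache management this role centres on is, in systems like vLLM, built around block-based allocation (PagedAttention): KV memory is carved into fixed-size blocks so sequences of different lengths share one pool without fragmentation. A toy sketch under that assumption follows; the class and method names are hypothetical, not vLLM's actual API.

```python
class BlockPool:
    """Toy fixed-size block allocator for KV cache memory."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size          # tokens stored per block
        self.free = list(range(num_blocks))   # indices of free physical blocks

    def allocate(self, num_tokens: int) -> list[int]:
        """Return the physical block indices needed to hold num_tokens."""
        needed = -(-num_tokens // self.block_size)  # ceiling division
        if needed > len(self.free):
            raise MemoryError("KV cache pool exhausted")
        blocks, self.free = self.free[:needed], self.free[needed:]
        return blocks

    def release(self, blocks: list[int]) -> None:
        self.free.extend(blocks)              # freed blocks are reusable at once

pool = BlockPool(num_blocks=8, block_size=16)
seq = pool.allocate(40)            # 40 tokens -> 3 blocks of 16
print(len(seq), len(pool.free))    # 3 blocks held, 5 still free
pool.release(seq)
print(len(pool.free))              # all 8 blocks free again
```

The design point is that a finished sequence returns whole blocks to the pool, so another tenant's request can reuse them immediately, which is what makes multi-tenant, high-throughput serving feasible.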