Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM)
- Hiring Organisation: Project People
- Location: City Of Edinburgh, Scotland, United Kingdom
Permanent
Edinburgh City Centre (on-site 5 days), walking distance from local transport links
Salary: Competitive and negotiable, generous benefits package

In an era where Large Language Models (LLMs) are rebuilding the foundational software stack, our client is at the forefront … tuning of large-scale inference and data pipelines, focusing on KV cache management, heterogeneous memory scheduling, and high-throughput inference serving using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed systems.

Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant ...