AI Systems Research Engineer - LLM Optimisation
- Hiring Organisation: Project People
- Location: City of Edinburgh, Scotland, United Kingdom
... using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed systems.
Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving across distributed environments. Research and prototype new techniques for cache sharing, data locality, and resource orchestration ...