5 of 5 Low Latency Jobs in Scotland

Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM)

Hiring Organisation
Project People
Location
City Of Edinburgh, Scotland, United Kingdom
using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed systems. Scalable Model Serving Infrastructure : Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving across distributed environments. Research and prototype new techniques for cache sharing, data locality, and resource orchestration ...

Systems Research Engineer - Distributed Systems / C++

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
workloads across CPU, GPU, and NPU clusters. Conduct in-depth profiling and performance tuning of inference pipelines, focusing on KV cache management. Develop low-latency, fault-tolerant AI serving frameworks using vLLM, Ray Serve, and PyTorch Distributed. Research and prototype novel techniques for cache sharing, data locality ...

Systems Research Engineer

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed systems. Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving across distributed environments. Research and prototype new techniques for cache sharing, data locality, and resource orchestration ...

Systems Research Engineer - AI Infrastructure / Distributed Systems

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
inference pipelines Improve key-value cache efficiency and memory scheduling Identify bottlenecks and enhance system scalability using systematic performance analysis AI Serving Infrastructure Develop low-latency, multi-tenant, fault-tolerant model serving systems Work on areas such as cache sharing, data locality, and cluster scheduling Prototype and evaluate ...

Systems Research Engineer

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
pipelines, specifically focusing on KV cache management and heterogeneous memory scheduling. AI Serving: Optimising high-throughput frameworks (vLLM, Ray Serve, PyTorch Distributed) to ensure low-latency, multi-tenant performance. Research Leadership: Contributing to top-tier venues (OSDI, NSDI, EuroSys, MLSys) and driving those innovations into real-world production. ...