4 of 4 Low Latency Jobs in Scotland

Systems Research Engineer - Distributed Systems / C++

Hiring Organisation
European Tech Recruit
Location
Edinburgh, UK
workloads across CPU, GPU, and NPU clusters. Conduct in-depth profiling and performance tuning of inference pipelines, focusing on KV cache management. Develop low-latency, fault-tolerant AI serving frameworks using vLLM, Ray Serve, and PyTorch Distributed. Research and prototype novel techniques for cache sharing, data locality ...

Systems Research Engineer - Distributed Systems / C++

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
workloads across CPU, GPU, and NPU clusters. Conduct in-depth profiling and performance tuning of inference pipelines, focusing on KV cache management. Develop low-latency, fault-tolerant AI serving frameworks using vLLM, Ray Serve, and PyTorch Distributed. Research and prototype novel techniques for cache sharing, data locality ...

Systems Research Engineer

Hiring Organisation
European Tech Recruit
Location
Edinburgh, UK
pipelines, specifically focusing on KV cache management and heterogeneous memory scheduling. AI Serving: Optimising high-throughput frameworks (vLLM, Ray Serve, PyTorch Distributed) to ensure low-latency, multi-tenant performance. Research Leadership: Contributing to top-tier venues (OSDI, NSDI, EuroSys, MLSys) and driving those innovations into real-world production. ...

Systems Research Engineer

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
pipelines, specifically focusing on KV cache management and heterogeneous memory scheduling. AI Serving: Optimising high-throughput frameworks (vLLM, Ray Serve, PyTorch Distributed) to ensure low-latency, multi-tenant performance. Research Leadership: Contributing to top-tier venues (OSDI, NSDI, EuroSys, MLSys) and driving those innovations into real-world production. ...