Systems Research Engineer - Distributed Systems / C++
- Hiring Organisation: European Tech Recruit
- Location: Edinburgh, Scotland, United Kingdom
... workloads across CPU, GPU, and NPU clusters. Conduct in-depth profiling and performance tuning of inference pipelines, focusing on KV cache management. Develop low-latency, fault-tolerant AI serving frameworks using vLLM, Ray Serve, and PyTorch Distributed. Research and prototype novel techniques for cache sharing, data locality ...