4 of 4 Low Latency Jobs in Scotland

Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM)

Hiring Organisation
Project People
Location
City of Edinburgh, Scotland, United Kingdom
using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed systems. Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving across distributed environments. Research and prototype new techniques for cache sharing, data locality, and resource orchestration ...

Cloud Platform Engineer

Hiring Organisation
Alphayotta
Location
Glasgow, Scotland, United Kingdom
certificate lifecycle (cert-manager, HSM/KMS). Strong CS fundamentals - networking (L3-L7), distributed systems, data structures & algorithms. Experience building high-volume, low-latency, resilient infrastructure services. Nice to have: TypeScript/React experience for operator dashboard development; AWS infrastructure experience (EKS, MSK, Lambda, Direct Connect, Network Firewall ...

Systems Research Engineer - Distributed Systems / C++

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
workloads across CPU, GPU, and NPU clusters. Conduct in-depth profiling and performance tuning of inference pipelines, focusing on KV cache management. Develop low-latency, fault-tolerant AI serving frameworks using vLLM, Ray Serve, and PyTorch Distributed. Research and prototype novel techniques for cache sharing, data locality ...

Systems Research Engineer

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
pipelines, specifically focusing on KV cache management and heterogeneous memory scheduling. AI Serving: Optimising high-throughput frameworks (vLLM, Ray Serve, PyTorch Distributed) to ensure low-latency, multi-tenant performance. Research Leadership: Contributing to top-tier venues (OSDI, NSDI, EuroSys, MLSys) and driving those innovations into real-world production. ...