4 of 4 Permanent Ray Jobs in the UK

AI Systems Research Engineer

Hiring Organisation
microTECH Global LTD
Location
Edinburgh, Scotland, United Kingdom
knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI infrastructure. · Hands-on experience with LLM serving frameworks (e.g., vLLM, Ray Serve, TensorRT-LLM, TGI) and distributed KV cache optimization. · Proficiency in C/C++, with additional experience in Python for research prototyping. · Solid grounding ...

Staff / Principal Machine Learning Engineer, Serving

Hiring Organisation
Inworld AI
Location
United Kingdom
optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs. Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections. Public work. Non-trivial systems programming projects, open-source ...

Applied Scientist II - Computer Vision

Hiring Organisation
Entrust
Location
London Area, United Kingdom
about twenty machine learning scientists. The team is supported by an ML Ops team that provides state-of-the-art tooling (including AWS, Encord, Ray, PyTorch Lightning and Weights & Biases). The Applied Science team works closely with product engineering to deploy models to serve our worldwide customer base. Position ...

AI Systems Research Engineer - LLM Optimisation

Hiring Organisation
Project People
Location
United Kingdom
large-scale inference and data pipelines, focusing on KV cache management, heterogeneous memory scheduling, and high-throughput inference serving using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed. Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving … computing, or large-scale AI infrastructure are also welcome. At least 2 years of experience with LLM inference/serving framework optimization (vLLM/Ray Serve/TensorRT-LLM/PyTorch). Hands-on experience with distributed KV cache optimization. Familiarity with GPUs and how they execute LLMs. Strong knowledge ...