5 of 5 vLLM Jobs in the UK excluding London

Senior Platform Engineer

Hiring Organisation
Lorien
Location
London, South East, England, United Kingdom
Employment Type
Contractor
Contract Rate
Salary negotiable
behave in production. Experience or exposure to areas such as: MLOps platforms (e.g. Kubeflow or similar frameworks); Model serving and inference platforms (e.g. KServe, vLLM, or equivalent); Supporting LLM-based workloads, including performance and scaling considerations; Notebook environments such as JupyterHub; Awareness of emerging tooling around Responsible/Trustworthy ...

Lead Platform Engineer

Hiring Organisation
Lorien
Location
London, South East, England, United Kingdom
Employment Type
Contractor
Contract Rate
Salary negotiable
areas such as: Building or operating MLOps platforms using tools like Kubeflow or similar frameworks; Running model serving and inference platforms (e.g. KServe, vLLM, or equivalent); Supporting LLM-based workloads, including optimisation and serving considerations; Providing notebook-based environments such as JupyterHub in secure platforms; Exposure to emerging tooling such ...

Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM)

Hiring Organisation
Project People
Location
City Of Edinburgh, Scotland, United Kingdom
Systems Research Engineer - LLM Optimisation (vLLM/TensorRT-LLM) Permanent Edinburgh City Centre (On-site 5 days), walking distance from local transport links Salary: Competitive and negotiable, generous benefits package In an era where Large Language Models (LLMs) are rebuilding the foundational software stack, our client is at the forefront … tuning of large-scale inference and data pipelines, focusing on KV cache management, heterogeneous memory scheduling, and high-throughput inference serving using frameworks like vLLM, Ray Serve, and modern PyTorch Distributed systems. Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant ...

Systems Research Engineer - Distributed Systems / C++

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
Conduct in-depth profiling and performance tuning of inference pipelines, focusing on KV cache management. Develop low-latency, fault-tolerant AI serving frameworks using vLLM, Ray Serve, and PyTorch Distributed. Research and prototype novel techniques for cache sharing, data locality, and resource orchestration. Translate innovative designs into publishable contributions … distributed systems, or related field. Strong knowledge of Distributed Systems, OS internals, and Machine Learning systems architecture. Hands-on experience with LLM serving frameworks (vLLM, Ray Serve, TensorRT-LLM, or TGI). Proficiency in C/C++ for systems development and Python for research prototyping. Solid grounding in distributed algorithms ...

Systems Research Engineer

Hiring Organisation
European Tech Recruit
Location
Edinburgh, Scotland, United Kingdom
depth profiling of large-scale inference pipelines, specifically focusing on KV cache management and heterogeneous memory scheduling. AI Serving: Optimising high-throughput frameworks (vLLM, Ray Serve, PyTorch Distributed) to ensure low-latency, multi-tenant performance. Research Leadership: Contributing to top-tier venues (OSDI, NSDI, EuroSys, MLSys) and driving those innovations … Stack: Strong proficiency in C/C++ for systems work, with Python for rapid prototyping. Expertise: Hands-on experience with LLM serving frameworks (vLLM, Ray Serve, TensorRT-LLM) and distributed algorithms. Mindset: A solid grounding in systems research methodology and performance profiling tools. The "Value Add" (Desired): A PhD focused ...