12 of 12 Permanent vLLM Jobs in Scotland

DevOps Engineer

Hiring Organisation
Aveni
Location
Aberdeen, UK
Employment Type
Full-time
scripting skills in Python or Bash. Familiar with ML lifecycle tools, model monitoring, and versioning. Exposure to tools like KServe, Ray Serve, Triton, or vLLM a big plus. Bonus Points: Experience with observability frameworks like Prometheus or OpenTelemetry. Knowledge of ML libraries: TensorFlow, PyTorch, HuggingFace. Exposure to Azure ...

AI Systems Research Engineer

Hiring Organisation
microTECH Global LTD
Location
Dunfermline, Fife, UK
Employment Type
Full-time
Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI Infrastructure. · Hands-on experience with LLM serving frameworks (e.g., vLLM, Ray Serve, TensorRT-LLM, TGI) and distributed KV cache optimization. · Proficiency in C/C++, with additional experience in Python for research prototyping. · Solid grounding ...

AI Systems Research Engineer

Hiring Organisation
microTECH Global LTD
Location
Livingston, West Lothian, UK
Employment Type
Full-time
Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI Infrastructure. · Hands-on experience with LLM serving frameworks (e.g., vLLM, Ray Serve, TensorRT-LLM, TGI) and distributed KV cache optimization. · Proficiency in C/C++, with additional experience in Python for research prototyping. · Solid grounding ...

AI Systems Research Engineer

Hiring Organisation
microTECH Global LTD
Location
Edinburgh, Scotland, United Kingdom
Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI Infrastructure. · Hands-on experience with LLM serving frameworks (e.g., vLLM, Ray Serve, TensorRT-LLM, TGI) and distributed KV cache optimization. · Proficiency in C/C++, with additional experience in Python for research prototyping. · Solid grounding ...

AI Infrastructure Architect

Hiring Organisation
Microtech Global Ltd
Location
Dunfermline, Fife, UK
Employment Type
Full-time
operating systems, and runtime environments; Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation Proficient in using ...

AI Infrastructure Architect

Hiring Organisation
Microtech Global Ltd
Location
Edinburgh, Midlothian, Scotland, United Kingdom
Employment Type
Permanent
operating systems, and runtime environments; Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation Proficient in using ...

AI Infrastructure Architect

Hiring Organisation
Microtech Global Ltd
Location
Broughton, Scottish Borders, UK
Employment Type
Full-time
operating systems, and runtime environments; Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation Proficient in using ...

AI Infrastructure Architect

Hiring Organisation
Microtech Global Ltd
Location
Livingston, West Lothian, UK
Employment Type
Full-time
operating systems, and runtime environments; Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation Proficient in using ...

Infrastructure Architect

Hiring Organisation
microTECH Global LTD
Location
Edinburgh, Scotland, United Kingdom
operating systems, and runtime environments; Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation Proficient in using ...

Infrastructure Architect

Hiring Organisation
microTECH Global LTD
Location
Dunfermline, Fife, UK
Employment Type
Full-time
operating systems, and runtime environments; Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation Proficient in using ...

Infrastructure Architect

Hiring Organisation
microTECH Global LTD
Location
Livingston, West Lothian, UK
Employment Type
Full-time
operating systems, and runtime environments; Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation Proficient in using ...

Research (Systems) Engineer

Hiring Organisation
Microtech Global Ltd
Location
Edinburgh, Midlothian, Scotland, United Kingdom
Employment Type
Permanent
field. Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI Infrastructure. Hands-on experience with LLM serving frameworks (e.g., vLLM, Ray Serve, TensorRT-LLM, TGI) and distributed KV cache optimization. Proficiency in C/C++, with additional experience in Python for research prototyping. Solid grounding in systems research methodology, distributed ...