5 of 5 Remote/Hybrid vLLM Jobs

AI Engineer (Fluent in Mandarin & English)

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

experience with LLM training cycles, parameter-efficient fine-tuning (PEFT), and sophisticated prompt engineering. Inference Stack: Experience with high-performance inference servers (e.g., vLLM, TGI, or Triton) and an understanding of how to optimize models for GPU deployment. Infrastructure: Comfortable working in Linux-based environments and proficient in managing containerized ...

Senior Engineer, Infrastructure

Hiring Organisation: Jobleads-UK
Location: United Kingdom

fundamentals, problem-solving skills, and the ability to quickly learn unfamiliar technologies. Experience deploying or operating large language models with serving frameworks such as vLLM or SGLang is considered an advantage. Benefits Fully remote position with the flexibility to work from Europe. Opportunity to shape infrastructure foundations for next-generation ...

Senior Research Scientist | Model Steering

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

without degrading their reasoning capabilities. Experience with machine translation, multilingual NLP, or language quality estimation. Familiarity with inference and serving at scale (e.g. via vLLM, SGLang, TensorRT‐LLM, etc.) and long‐context modelling. Publications at top‐tier venues. What we offer Diverse and internationally distributed team: joining our team means ...

Applied AI Engineer

Hiring Organisation: McGregor Boyall Associates Limited
Location: London, United Kingdom
Employment Type: Permanent, Work From Home

production ML systems Comfortable working across models, infrastructure and product Enjoy working in fast-moving, early-stage environments Tech Stack Python * PyTorch * JAX * LLMs * vLLM * Vector Databases * Modern Agent Frameworks Get in touch for more details - McGregor Boyall is an equal opportunity employer and does not discriminate on any grounds. ...

AI Platform Engineer

Hiring Organisation: Nextech Group Limited
Location: East London, London, United Kingdom
Employment Type: Permanent, Work From Home
Salary: £85,000

processing of high-volume inference requests Implement observability and cost-tracking for token usage across multiple LLM providers (Anthropic, OpenAI, open-source models via vLLM) Own database performance for both relational (Postgres) and vector (pgvector/Pinecone) workloads Collaborate with ML engineers on model-serving infrastructure and prompt-caching strategies … Node.js) Databases: PostgreSQL, Redis, Pinecone/pgvector Infra: AWS (ECS, Lambda, SQS/SNS), Docker, Kubernetes, Terraform AI/ML tooling: LangChain/LlamaIndex, vLLM, Anthropic & OpenAI APIs, embedding models Observability: Datadog, Grafana, OpenTelemetry CI/CD: GitHub Actions, ArgoCD Requirements: 4+ years backend development experience, ideally with at least ...