LLMs. Proven hands-on knowledge of LangChain, AutoGen, LangGraph and RAG. Strong proficiency in Python and experience with frameworks like PyTorch or TensorFlow. Experience working with vector databases (e.g., FAISS, Weaviate, Pinecone). Familiarity with HR systems and data (Workday, SAP SuccessFactors, etc.) is a plus. Excellent communication skills and ability to translate technical concepts for non-technical stakeholders. Ideally …
LS22, Wetherby, City and Borough of Leeds, West Yorkshire, United Kingdom
Handshaik
/ownership mindset, strong communication skills and a team player. 5+ years of professional experience in full-stack development. Hands-on experience with RAG systems, vector databases (pgvector/FAISS/Weaviate/ES k-NN), embeddings, and hybrid search (BM25 + vectors). Strong grasp of chunking strategies, metadata, indexing, recall/precision trade-offs, reranking, and evaluation (ground …
with zero external dependencies.

Key Responsibilities
- Build end-to-end RAG pipelines on isolated defence networks using open-source LLMs (Llama 3, Mistral, Qwen)
- Deploy local vector stores (Chroma, FAISS, Milvus) with sensitive document ingestion pipelines
- Host and optimise LLMs using vLLM/TGI on local GPU clusters without internet connectivity
- Implement agent orchestration using LangChain/LangGraph in completely …

Requirements
- Active SC Clearance (non-negotiable); willingness to undergo DV if required
- Demonstrable experience deploying open-source LLMs (Llama, Mistral, Falcon) on-premises
- Expertise with local vector databases (Chroma, FAISS, Weaviate) in offline deployments
- Strong vLLM/Text Generation Inference experience for high-throughput model serving
- Proven ability to work on air-gapped systems with no external package repositories
- Experience … H100) and CUDA optimisation
- Python expertise with offline dependency management and local package mirrors

Technical Stack (All On-Premises)
Models: Llama 3, Mistral, Qwen (locally hosted)
Vector Stores: Chroma, FAISS, Milvus
Orchestration: LangChain, LangGraph for agents
Hosting: vLLM, TGI, Ollama on bare metal/private cloud
Infrastructure: Air-gapped Kubernetes, local container registries

Desirable Skills
- Experience with defence/government …