LLMs. Proven hands-on knowledge of LangChain, AutoGen, LangGraph and RAG. Strong proficiency in Python and experience with frameworks like PyTorch or TensorFlow. Experience working with vector databases (e.g., FAISS, Weaviate, Pinecone). Familiarity with HR systems and data (Workday, SAP SuccessFactors, etc.) is a plus. Excellent communication skills and ability to translate technical concepts for non-technical stakeholders. Ideally …
with a focus on leveraging AI/GAI technologies and large language models (LLMs). Advanced AI Integration: Apply experience with retrieval-augmented generation (RAG), vector databases (e.g. Pinecone, Weaviate, FAISS), and enterprise search for AI-driven knowledge discovery. Optimize AI Performance: Utilize practical experience designing structured prompts, fine-tuning models, and cost optimization strategies. Cloud Architecture: Incorporate cloud architecture best …
with zero external dependencies.

Key Responsibilities
- Build end-to-end RAG pipelines on isolated defence networks using open-source LLMs (Llama 3, Mistral, Qwen)
- Deploy local vector stores (Chroma, FAISS, Milvus) with sensitive document ingestion pipelines
- Host and optimise LLMs using vLLM/TGI on local GPU clusters without internet connectivity
- Implement agent orchestration using LangChain/LangGraph in completely …

Requirements
- Active SC Clearance (non-negotiable), with willingness to undergo DV if required
- Demonstrable experience deploying open-source LLMs (Llama, Mistral, Falcon) on-premises
- Expertise with local vector databases (Chroma, FAISS, Weaviate) in offline deployments
- Strong vLLM/Text Generation Inference experience for high-throughput model serving
- Proven ability to work on air-gapped systems with no external package repositories
- Experience … H100) and CUDA optimisation
- Python expertise with offline dependency management and local package mirrors

Technical Stack (All On-Premises)
- Models: Llama 3, Mistral, Qwen (locally hosted)
- Vector Stores: Chroma, FAISS, Milvus
- Orchestration: LangChain, LangGraph for agents
- Hosting: vLLM, TGI, Ollama on bare metal/private cloud
- Infrastructure: Air-gapped Kubernetes, local container registries

Desirable Skills
- Experience with defence/government …