LLMs. Proven hands-on knowledge of LangChain, AutoGen, LangGraph, and RAG. Strong proficiency in Python and experience with frameworks such as PyTorch or TensorFlow. Experience working with vector databases (e.g., FAISS, Weaviate, Pinecone). Familiarity with HR systems and data (Workday, SAP SuccessFactors, etc.) is a plus. Excellent communication skills and the ability to translate technical concepts for non-technical stakeholders. Ideally …
infrastructure automation. Experience with CI/CD pipelines, configuration management, and onboarding/monitoring of ML systems. Experience working with knowledge graphs and vector databases (e.g., Neo4j, Weaviate, Pinecone, FAISS). Strong understanding of data and AI pipeline configuration and orchestration. Equal Opportunity Employer: we are an equal opportunity employer. All aspects of employment, including the decision to hire, promote, discipline …
Houston, Texas, United States Hybrid / WFH Options
INSPYR Solutions
setups). Advanced proficiency in Python, including scripting for LLM pipelines, handling dependencies with tools like Poetry or Pipenv, and integrating with libraries such as Sentence Transformers for embeddings, FAISS for vector search, or Streamlit/Gradio for prototyping interfaces. Experience with vector databases and semantic search (e.g., Pinecone, Weaviate, or FAISS) to support efficient retrieval in LLM applications. Demonstrated …
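The embedding-plus-vector-search workflow named above can be sketched in miniature. This is a toy, dependency-free illustration only: the hash-based `embed` function stands in for a real embedding model such as Sentence Transformers, and the brute-force cosine scan mirrors what a FAISS flat index does at scale; all names and the corpus are hypothetical.

```python
import hashlib
import math

def embed(text, dim=8):
    # Toy deterministic "embedding": bucket tokens by MD5 hash, then
    # L2-normalise. A real pipeline would call an embedding model instead.
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def search(query, corpus, k=2):
    # Brute-force cosine similarity over the corpus: the same operation
    # a FAISS inner-product flat index performs, minus the indexing.
    q = embed(query)
    scored = sorted(
        corpus,
        key=lambda doc: -sum(a * b for a, b in zip(q, embed(doc))),
    )
    return scored[:k]

docs = ["reset your password", "configure the VPN", "password policy rules"]
print(search("password help", docs, k=2))
```

Swapping `embed` for a model call and `search` for a FAISS index lookup preserves the same interface while changing only the quality and scale of retrieval.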
with a focus on leveraging AI/GAI technologies and large language models (LLMs). Advanced AI Integration: apply experience with retrieval-augmented generation (RAG), vector databases (e.g., Pinecone, Weaviate, FAISS), and enterprise search for AI-driven knowledge discovery. Optimize AI Performance: utilize practical experience designing structured prompts, fine-tuning models, and cost optimization strategies. Cloud Architecture: incorporate cloud architecture best …
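The "structured prompts" for RAG mentioned above typically mean assembling retrieved passages into a grounded prompt template. A minimal sketch, assuming a hypothetical question and passages; the template wording is illustrative, not any particular product's format:

```python
def build_rag_prompt(question, passages):
    # Number each retrieved passage so the model can cite its sources,
    # and instruct it to refuse when the context lacks the answer.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the numbered context passages below. "
        "Cite passage numbers, and say 'not found' if the answer is absent.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_rag_prompt(
    "What is the leave policy?",
    ["Employees accrue 25 days of leave per year.",
     "Leave requests are approved in Workday."],
)
print(prompt)
```

Constraining the model to the supplied context and requiring citations is a common, cheap lever for both answer quality and cost control, since shorter grounded prompts reduce token usage.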
Qualifications: Strong proficiency in Python. Expertise in FastAPI for backend development. Experience with LangChain and LangGraph. Hands-on experience with Retrieval-Augmented Generation (RAG). Proficiency in vector databases (Milvus, FAISS, or similar). Database integration (MongoDB/PostgreSQL, GraphDB). Experience integrating AWS Bedrock models. Desired Qualifications: Prompt engineering for optimizing LLM interactions. Ability to design and deploy AI agents using LangGraph …
TTS providers, text-based agents, or multi-modal setups). Sound knowledge of and hands-on experience with Python for AI development, including building and integrating RAG systems (e.g., using LangChain, FAISS, or Pinecone for knowledge-base retrieval). Excellent knowledge of modern programming languages and AI tools (Python, TypeScript, etc.), with experience in creating RAG pipelines and integrating with transactional systems (e.g., databases …
with zero external dependencies.

Key Responsibilities
- Build end-to-end RAG pipelines on isolated defence networks using open-source LLMs (Llama 3, Mistral, Qwen)
- Deploy local vector stores (Chroma, FAISS, Milvus) with sensitive document ingestion pipelines
- Host and optimise LLMs using vLLM/TGI on local GPU clusters without internet connectivity
- Implement agent orchestration using LangChain/LangGraph in completely …

Requirements
- Active SC Clearance (non-negotiable); willingness to undergo DV if required
- Demonstrable experience deploying open-source LLMs (Llama, Mistral, Falcon) on-premises
- Expertise with local vector databases (Chroma, FAISS, Weaviate) in offline deployments
- Strong vLLM/Text Generation Inference experience for high-throughput model serving
- Proven ability to work on air-gapped systems with no external package repositories
- Experience … H100) and CUDA optimisation
- Python expertise with offline dependency management and local package mirrors

Technical Stack (All On-Premises)
- Models: Llama 3, Mistral, Qwen (locally hosted)
- Vector Stores: Chroma, FAISS, Milvus
- Orchestration: LangChain, LangGraph for agents
- Hosting: vLLM, TGI, Ollama on bare metal/private cloud
- Infrastructure: Air-gapped Kubernetes, local container registries

Desirable Skills
- Experience with defence/government …
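The "document ingestion pipelines" feeding a local vector store usually start with chunking: splitting documents into overlapping windows before embedding. A minimal sliding-window sketch; the chunk and overlap sizes are illustrative, not prescribed by any of the stores named above:

```python
def chunk_text(text, chunk_size=200, overlap=40):
    # Fixed-size sliding window with overlap, a simple baseline for
    # preparing documents for embedding and vector-store ingestion.
    # Overlap preserves context that would otherwise be cut at a boundary.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "x" * 500
print([len(p) for p in chunk_text(doc, chunk_size=200, overlap=40)])
# → [200, 200, 180]
```

In an air-gapped deployment the same function runs unchanged; only the downstream embedding model and vector store (e.g., a locally hosted Chroma or FAISS index) need to be mirrored onto the isolated network.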