Retrieval-Augmented Generation (RAG) for augmenting LLMs with domain-specific knowledge. Prompt engineering and fine-tuning for tailoring model behavior to business-specific contexts. Use of embedding stores and vector databases (e.g., Pinecone, Redis, Azure AI Search) to support semantic search and recommendation systems. Building intelligent features like AI-powered chatbots, assistants, and question-answering systems using LLMs and conversational agents. Awareness of …
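The semantic-search pattern these listings describe can be illustrated in a few lines of Python. The sketch below is deliberately minimal; its embed() function is a random placeholder standing in for a real embedding model or hosted API, so only the retrieval logic is meaningful.

```python
# Minimal semantic-search sketch: embed documents once, then rank them
# against a query by cosine similarity. embed() is a placeholder for a
# real embedding model (hosted API or local model).
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: pseudo-embeddings so the sketch runs standalone.
    rng = np.random.default_rng(abs(hash(tuple(texts))) % (2**32))
    return rng.normal(size=(len(texts), 384))

documents = [
    "Refunds are processed within 5 working days.",
    "Our API rate limit is 100 requests per minute.",
    "Support is available 24/7 via live chat.",
]

doc_vectors = embed(documents)
doc_vectors /= np.linalg.norm(doc_vectors, axis=1, keepdims=True)

def search(query: str, k: int = 2) -> list[str]:
    q = embed([query])[0]
    q /= np.linalg.norm(q)
    scores = doc_vectors @ q                 # cosine similarity
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

print(search("How long do refunds take?"))
```

A production version would swap embed() for a real model and replace the in-memory matrix with one of the vector stores named above (Pinecone, Redis, Azure AI Search).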
City of London, London, United Kingdom Hybrid / WFH Options
Staffworx
… Express, Next.js
• Integrate ML models and embeddings into production pipelines using AWS SageMaker, Bedrock or OpenAI APIs
• Build support systems for autonomous agents, including memory storage, vector search (e.g., Pinecone, Weaviate) and tool registries (a minimal registry sketch follows this listing)
• Enforce system-level requirements for security, compliance, observability and CI/CD
• Drive PoCs and reference architectures for multi-agent coordination, intelligent routing and goal-directed …
• … similar
• Experience with secure cloud deployments and production ML model integration
Bonus Skills
• Applied work with multi-agent systems, tool orchestration, or autonomous decision-making
• Experience with vector databases (Pinecone, Weaviate, FAISS) and embedding pipelines
• Knowledge of AI chatbot frameworks (Rasa, BotPress, Dialogflow) or custom LLM-based UIs
• Awareness of AI governance, model auditing, and data privacy regulation (GDPR, DPA …)
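As a rough idea of what a "tool registry" for autonomous agents can look like, here is a minimal, framework-free sketch; the tool name, its stub implementation, and the dispatch flow are all hypothetical.

```python
# Illustrative tool registry: agent frameworks typically map tool names to
# callables plus a description the LLM can read. Names here are hypothetical.
from typing import Callable

TOOLS: dict[str, dict] = {}

def tool(name: str, description: str) -> Callable:
    """Register a callable so an agent loop can discover and invoke it."""
    def register(fn: Callable) -> Callable:
        TOOLS[name] = {"description": description, "fn": fn}
        return fn
    return register

@tool("lookup_order", "Return the status of an order by its ID.")
def lookup_order(order_id: str) -> str:
    return f"Order {order_id}: shipped"   # stub; a real version would query a DB

def dispatch(name: str, **kwargs) -> str:
    """What an agent runtime would call after the LLM picks a tool."""
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name]["fn"](**kwargs)

print(dispatch("lookup_order", order_id="A-123"))
```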
… and/or LLM-powered applications in production environments. Proficiency in Python and ML libraries such as PyTorch, Hugging Face Transformers, or TensorFlow. Experience with vector search tools (e.g., FAISS, Pinecone, Weaviate) and retrieval frameworks (e.g., LangChain, LlamaIndex). Hands-on experience with fine-tuning and distillation of large language models. Comfortable with cloud platforms (Azure preferred), CI/CD tools …
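The FAISS workflow named above is small enough to show directly; in this sketch the vectors are random stand-ins for real embeddings and the dimension (384) is arbitrary.

```python
# Sketch of a FAISS flat index: add document embeddings, then query.
import faiss                      # pip install faiss-cpu
import numpy as np

dim = 384
doc_vectors = np.random.rand(1000, dim).astype("float32")

index = faiss.IndexFlatL2(dim)    # exact (brute-force) L2 search
index.add(doc_vectors)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)
print(ids[0])                     # indices of the 5 nearest documents
```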
• … and fine-tune SLMs/LLMs using domain-specific data (e.g., ITSM, security, operations)
• Design and optimize Retrieval-Augmented Generation (RAG) pipelines with vector DBs (e.g., FAISS, Chroma, Weaviate, Pinecone)
• Develop agent-based architectures using LangGraph, AutoGen, CrewAI, or custom frameworks
• Integrate AI agents with enterprise tools (ServiceNow, Jira, SAP, Slack, etc.)
• Optimize model performance (quantization, distillation, batching, caching)
• Collaborate …
• … and attention mechanisms
• Experience with LangChain, Transformers (HuggingFace), or LlamaIndex
• Working knowledge of LLM fine-tuning (LoRA, QLoRA, PEFT) and prompt engineering (a LoRA setup sketch follows this listing)
• Hands-on experience with vector databases (FAISS, Pinecone, Weaviate, Chroma)
• Cloud experience on Azure, AWS, or GCP (Azure preferred)
• Experience with Kubernetes, Docker, and scalable microservice deployments
• Experience integrating with REST APIs, webhooks, and enterprise systems (ServiceNow, SAP …)
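A hedged sketch of the LoRA setup the fine-tuning bullet refers to, using Hugging Face PEFT; the base model (gpt2) and target module are illustrative choices, not something any listing mandates.

```python
# LoRA fine-tuning setup with Hugging Face PEFT (sketch only).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "gpt2"                                   # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

lora_config = LoraConfig(
    r=8,                         # rank of the low-rank update matrices
    lora_alpha=16,               # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],   # attention projection in GPT-2
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # typically well under 1% of the base model
# Training would then proceed with transformers.Trainer or a custom loop.
```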
… retrieval-augmented generation, prompt management, model orchestration). Work with embeddings, vector stores, and similarity search to enable contextual AI responses. Integrate with vector databases (e.g., FAISS, Weaviate, or Pinecone) to support semantic search and information retrieval. Build scalable APIs and services using FastAPI or similar frameworks. Use tools like MLflow to manage model experimentation, versioning, and deployment. Collaborate closely … in Python, with experience building services using FastAPI or similar frameworks. Working with embeddings for text or document representation and semantic search. Familiarity with vector databases (e.g., FAISS, Weaviate, Pinecone). Understanding of AI infrastructure: versioning, tracking, and deployment with tools like MLflow. Exposure to building production-grade APIs, services, or workflows in an agile, collaborative environment. Awareness of AI …
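A minimal FastAPI service in the spirit of this listing: one endpoint that embeds a query and returns the closest stored documents. The embed() function is a placeholder and the document texts are invented for illustration.

```python
# Toy semantic-search API with FastAPI. embed() is a stand-in for a real
# embedding model; swap it and the in-memory matrix for a vector store.
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

DOCUMENTS = ["invoice policy", "password reset steps", "holiday calendar"]

def embed(texts: list[str]) -> np.ndarray:
    rng = np.random.default_rng(0)            # placeholder embeddings
    return rng.normal(size=(len(texts), 64))

DOC_VECS = embed(DOCUMENTS)

class Query(BaseModel):
    text: str
    k: int = 2

@app.post("/search")
def search(q: Query) -> dict:
    qv = embed([q.text])[0]
    scores = DOC_VECS @ qv
    top = np.argsort(scores)[::-1][: q.k]
    return {"results": [DOCUMENTS[i] for i in top]}

# Run with: uvicorn app:app --reload
```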
… virtual assistants):
Requirements:
• Strong experience with Python and AI/ML libraries (LangChain, TensorFlow, PyTorch)
• Experience with frontend frameworks like React or Angular
• Knowledge of vector databases (e.g., FAISS, Pinecone, Weaviate)
• Familiarity with LLM integrations (e.g., OpenAI, HuggingFace)
• Experience building and consuming REST/gRPC APIs
• Understanding of prompt engineering and RAG architectures (a grounded-prompt sketch follows this listing)
• Familiar with cloud platforms (AWS, GCP, or …)
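The grounded-prompt (RAG) pattern mentioned in the requirements, sketched with the OpenAI Python SDK (v1+); the model name and system prompt wording are illustrative assumptions.

```python
# RAG prompting sketch: retrieved context is injected into the prompt
# before calling a hosted LLM.
from openai import OpenAI

client = OpenAI()   # reads OPENAI_API_KEY from the environment

def answer(question: str, retrieved_chunks: list[str]) -> str:
    context = "\n\n".join(retrieved_chunks)
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Answer only from the provided context. "
                        "If the context is insufficient, say so."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

# answer("What is the refund window?", ["Refunds are accepted within 30 days."])
```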
… and modern web frameworks. Deep experience with AI/ML frameworks (PyTorch, TensorFlow, Transformers, LangChain). Mastery of prompt engineering and fine-tuning Large Language Models. Proficient in vector databases (Pinecone, Weaviate, Milvus) and embedding technologies. Expert in building RAG (Retrieval-Augmented Generation) systems at scale. Strong experience with MLOps practices and model deployment pipelines. Proficient in cloud AI services (AWS …
• MLOps (model/component dockerization, Kubernetes deployment) in multiple environments (AWS, Azure, GCP); operationalization of AI solutions to production
• Relational DB (SQL), Graph DB (Neo4j) and Vector DB (Pinecone, Weaviate, Qdrant) (a Neo4j query sketch follows this listing)
• Guide the team to debug pipeline failures
• Engage with business stakeholders on development progress and issue fixes
• Automation, technology and process improvement …
• Experience designing and implementing ML systems & pipelines, MLOps practices
• Exposure to event-driven orchestration, online model deployment
• Hands-on experience working with client IT/business teams …
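For the Graph DB item, a small Neo4j query via the official Python driver; the connection details and the Ticket/Service schema are hypothetical and only show the shape of the integration.

```python
# Neo4j query sketch using the official driver. URI, credentials and the
# (:Ticket)-[:AFFECTS]->(:Service) schema are made-up examples.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def impacted_services(ticket_id: str) -> list[str]:
    query = (
        "MATCH (t:Ticket {id: $ticket_id})-[:AFFECTS]->(s:Service) "
        "RETURN s.name AS name"
    )
    with driver.session() as session:
        result = session.run(query, ticket_id=ticket_id)
        return [record["name"] for record in result]

print(impacted_services("INC-1042"))
driver.close()
```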
… healthcare data interoperability (FHIR, HL7, CDA). You've built real-time AI applications, including voice AI, speech recognition, or NLP pipelines. You have experience in vector databases (e.g., Pinecone, Weaviate) and retrieval-augmented generation (RAG) architectures. What's in it for you? The opportunity to build and scale AI models in production that directly impact healthcare efficiency. A role …
Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
TEKsystems, Inc
… management using frameworks such as LangChain, CrewAI, and AutoGen. Engineer and tune prompts to enhance the performance and reliability of generative tasks. Design RAG systems using vector databases like Pinecone, Chroma, and PostgreSQL for contextual retrieval (a Chroma sketch follows this listing). Incorporate semantic search and embedding strategies for more relevant and grounded LLM responses. Utilize Guardrails to implement applications that adhere to responsible AI guidelines.
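The Chroma flow this listing mentions, in its simplest form: add documents to a collection (Chroma embeds them with its default model) and run a semantic query. The collection name and documents are made up.

```python
# Minimal Chroma RAG-retrieval sketch with an in-memory client.
import chromadb

client = chromadb.Client()                       # in-memory instance
collection = client.create_collection("support_docs")

collection.add(
    ids=["doc1", "doc2", "doc3"],
    documents=[
        "Password resets are handled via the self-service portal.",
        "VPN access requires manager approval.",
        "Laptops are refreshed every three years.",
    ],
)

results = collection.query(query_texts=["How do I reset my password?"], n_results=1)
print(results["documents"][0])
```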
City of London, London, Finsbury Square, United Kingdom
The Portfolio Group
… monitoring. Full-Stack Integration: Develop APIs and integrate ML models into web applications using FastAPI, Flask, React, TypeScript, and Node.js. Vector Databases & Search: Implement embeddings and retrieval mechanisms using Pinecone, Weaviate, FAISS, Milvus, ChromaDB, or OpenSearch (an embedding sketch follows this listing). Required skills & experience: 3-5+ years in machine learning and software development; proficient in Python, PyTorch or TensorFlow or Hugging Face Transformers; experience …
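The embedding step behind the retrieval mechanisms listed above, sketched with sentence-transformers; the model name is a common small default, not a requirement, and the documents are invented.

```python
# Encode text with a sentence-transformers model and rank by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = ["Quarterly invoices are emailed on the 1st.",
        "The onboarding checklist lives in Confluence."]
doc_emb = model.encode(docs, convert_to_tensor=True)

query_emb = model.encode("When are invoices sent?", convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_emb)        # shape: (1, len(docs))
best = scores.argmax().item()
print(docs[best])
```

The resulting vectors could be stored in any of the systems the listing names (Pinecone, Weaviate, FAISS, Milvus, ChromaDB, OpenSearch).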
… in Python, with expertise in using frameworks like Hugging Face Transformers, LangChain, OpenAI APIs, or other LLM orchestration tools. A solid understanding of tokenization, embedding models, vector databases (e.g., Pinecone, Weaviate, FAISS), and retrieval-augmented generation (RAG) pipelines. Experience designing and evaluating LLM-powered systems such as chatbots, summarization tools, content generation workflows, or intelligent data extraction pipelines. Deep understanding …
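Tokenization, mentioned in the listing above, is easy to demonstrate; gpt2 is used here only because it is a small public tokenizer, not because any listing requires it.

```python
# Show how a sentence maps to model-specific token IDs, which drives
# context-window and cost calculations for LLM applications.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Retrieval-augmented generation grounds answers in retrieved context."
ids = tokenizer.encode(text)
print(len(ids), "tokens")
print(tokenizer.convert_ids_to_tokens(ids)[:8])   # first few subword tokens
```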