Central London, London, United Kingdom Hybrid / WFH Options
Staffworx Limited
… LLM integrations). Exposure to AI ethics, data privacy, and compliance regulations. Prior experience in multi-agent systems or autonomous AI workflows. Hands-on experience with vector databases (Pinecone, Weaviate, FAISS) and AI embeddings.
Remote Working: Some remote working
Country: United Kingdom
Location: WC1
Job Type: Contract or Permanent
Start Date: Apr–Jul 25
Duration: 9 months initial or permanent
Visa Requirement: Applicants must be eligible to …
… or LLM-powered applications in production environments. Proficiency in Python and ML libraries such as PyTorch, Hugging Face Transformers, or TensorFlow. Experience with vector search tools (e.g., FAISS, Pinecone, Weaviate) and retrieval frameworks (e.g., LangChain, LlamaIndex). Hands-on experience with fine-tuning and distillation of large language models. Comfortable with cloud platforms (Azure preferred), CI/CD tools, and …
… Next.js
- Integrate ML models and embeddings into production pipelines using AWS SageMaker, Bedrock or OpenAI APIs
- Build support systems for autonomous agents including memory storage, vector search (e.g., Pinecone, Weaviate) and tool registries
- Enforce system-level requirements for security, compliance, observability and CI/CD
- Drive PoCs and reference architectures for multi-agent coordination, intelligent routing and goal-directed AI …
- Experience with secure cloud deployments and production ML model integration
Bonus Skills
- Applied work with multi-agent systems, tool orchestration, or autonomous decision-making
- Experience with vector databases (Pinecone, Weaviate, FAISS) and embedding pipelines
- Knowledge of AI chatbot frameworks (Rasa, BotPress, Dialogflow) or custom LLM-based UIs
- Awareness of AI governance, model auditing, and data privacy regulation (GDPR, DPA, etc.) …
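The agent support systems this role describes (memory storage, vector search, tool registries) can be sketched in a few lines. This is purely illustrative: the class and method names here are hypothetical, and a production system would use a managed vector store such as Pinecone, Weaviate, or FAISS rather than an in-memory list.

```python
import math

class AgentMemory:
    """Toy in-memory vector store standing in for Pinecone/Weaviate/FAISS."""

    def __init__(self):
        self._items = []  # list of (embedding, payload) pairs

    def add(self, embedding, payload):
        self._items.append((embedding, payload))

    def search(self, query, k=3):
        """Return the k payloads whose embeddings are most cosine-similar to the query."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self._items, key=lambda it: cosine(query, it[0]), reverse=True)
        return [payload for _, payload in ranked[:k]]

class ToolRegistry:
    """Maps tool names to callables so an agent can look them up at run time."""

    def __init__(self):
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn

    def call(self, name, *args, **kwargs):
        return self._tools[name](*args, **kwargs)

memory = AgentMemory()
memory.add([1.0, 0.0], "user prefers concise answers")
memory.add([0.0, 1.0], "user timezone is UTC")

registry = ToolRegistry()
registry.register("add", lambda a, b: a + b)

print(memory.search([0.9, 0.1], k=1))  # nearest stored memory to the query vector
print(registry.call("add", 2, 3))
```

The same two abstractions (similarity-ranked memory plus a name-to-callable tool map) underlie most agent frameworks; only the storage backend and the embedding model change at production scale.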
… and maintain AI microservices using Docker, Kubernetes, and FastAPI, ensuring smooth model serving and error handling
Vector Search & Retrieval: Implement retrieval-augmented workflows: ingest documents, index embeddings (Pinecone, FAISS, Weaviate), and build similarity search features
Rapid Prototyping: Create interactive AI demos and proofs-of-concept with Streamlit, Gradio, or Next.js for stakeholder feedback
MLOps & Deployment: Implement CI/CD pipelines …
- … tuning LLMs via OpenAI, Hugging Face or similar APIs
- Strong proficiency in Python
- Deep expertise in prompt engineering and tooling like LangChain or LlamaIndex
- Proficiency with vector databases (Pinecone, FAISS, Weaviate) and document embedding pipelines
- Proven rapid-prototyping skills using Streamlit or equivalent frameworks for UI demos
- Familiarity with containerization (Docker) and at least one orchestration/deployment platform
- Excellent communication …
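The retrieval-augmented workflow named above (ingest documents, index embeddings, run similarity search) follows a common shape. The sketch below substitutes a trivial bag-of-words "embedding" for a real embedding model and an in-memory index for Pinecone/FAISS/Weaviate, so every name is illustrative rather than a real library API.

```python
from collections import Counter

def embed(text):
    """Stand-in embedding: lowercase bag-of-words counts. A real pipeline would
    call an embedding model (e.g., via Hugging Face or the OpenAI API)."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Overlap score between two bag-of-words vectors."""
    return sum((a & b).values())

class DocumentIndex:
    """Minimal ingest-then-retrieve index standing in for a vector database."""

    def __init__(self):
        self._docs = []  # (embedding, original text) pairs

    def ingest(self, text):
        self._docs.append((embed(text), text))

    def retrieve(self, query, k=2):
        q = embed(query)
        ranked = sorted(self._docs, key=lambda d: similarity(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

index = DocumentIndex()
index.ingest("FastAPI serves Python microservices")
index.ingest("Kubernetes schedules containers across nodes")
index.ingest("Docker packages applications into containers")

print(index.retrieve("python microservices", k=1))
```

Swapping `embed` for a dense embedding model and `DocumentIndex` for a hosted store converts this sketch into the production workflow the role describes, without changing the ingest/retrieve control flow.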
Full-Stack Integration: Develop APIs and integrate ML models into web applications using FastAPI, Flask, React, TypeScript, and Node.js.
Vector Databases & Search: Implement embeddings and retrieval mechanisms using Pinecone, Weaviate, FAISS, Milvus, ChromaDB, or OpenSearch.
Required skills & experience:
- 3–5+ years in machine learning and software development
- Proficient in Python, PyTorch or TensorFlow or Hugging Face Transformers
- Experience with …
City of London, London, Finsbury Square, United Kingdom
The Portfolio Group
… tools
Cloud & MLOps (AWS): Deploy with SageMaker, Bedrock, Lambda, S3, ECS, EKS
Full-Stack Integration: Build APIs (FastAPI, Flask) and integrate with React, TypeScript, Node.js
Vector Search: Use FAISS, Weaviate, Pinecone, ChromaDB, OpenSearch
Required skills & experience:
- 3–5+ years of experience in ML engineering and software development
- Deep Python proficiency, with PyTorch, TensorFlow or Hugging Face
- Proven experience with LLMs …
… Python, with expertise in using frameworks like Hugging Face Transformers, LangChain, OpenAI APIs, or other LLM orchestration tools. A solid understanding of tokenization, embedding models, vector databases (e.g., Pinecone, Weaviate, FAISS), and retrieval-augmented generation (RAG) pipelines. Experience designing and evaluating LLM-powered systems such as chatbots, summarization tools, content generation workflows, or intelligent data extraction pipelines. Deep understanding of …
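The RAG pipeline concept this listing asks for reduces to: retrieve relevant passages, then prepend them to the user's question before calling the LLM. A minimal sketch of the prompt-assembly step (the function name and template wording are assumptions, and the actual LLM call is omitted):

```python
def build_rag_prompt(question, passages):
    """Assemble a retrieval-augmented prompt: context passages first, then the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Passages would normally come from a vector-database similarity search.
passages = ["The WC1 role is hybrid.", "The contract runs 9 months."]
prompt = build_rag_prompt("How long is the contract?", passages)
print(prompt)
```

Frameworks like LangChain and LlamaIndex wrap exactly this step in templating and retriever abstractions; understanding the raw prompt layout makes their behaviour much easier to debug.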
… experience building AI systems with LLMs — you’ve worked with tools like LangChain, LlamaIndex, Haystack, or built your own stack
- Hands-on with embedding models, vector DBs (e.g., FAISS, Weaviate, Qdrant), and retrieval logic
- Strong Python engineering skills — you write clean, production-ready code with tests
- Experience building and evaluating RAG pipelines in a real-world setting
- Familiarity with LLM …
… You’ll Do
- Build scalable backend microservices in Python (FastAPI) to support RAG workflows and user queries
- Develop and optimise vector search pipelines using tools like PGVector, Pinecone, or Weaviate
- Design embedding orchestration and hybrid retrieval mechanisms
- Implement evaluation frameworks (BLEU, ROUGE, hallucination checks) to monitor answer quality
- Deploy production systems on GCP (Cloud Run, Vertex AI, BigQuery, Pub/ …
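The evaluation frameworks listed above (BLEU, ROUGE, hallucination checks) all compare generated text against references. Below is a minimal ROUGE-1-recall-style sketch, implemented directly rather than via a library such as `rouge-score`, plus a deliberately naive unsupported-token check as a crude hallucination signal; both are illustrative simplifications of the real metrics.

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Unigram recall: fraction of reference tokens also present in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    return overlap / sum(ref.values()) if ref else 0.0

def unsupported_tokens(answer, context):
    """Crude hallucination signal: answer tokens that never appear in the context."""
    ctx = set(context.lower().split())
    return [t for t in answer.lower().split() if t not in ctx]

ref = "the contract runs nine months"
cand = "the contract runs for nine months initially"
print(round(rouge1_recall(cand, ref), 2))  # all 5 reference tokens matched -> 1.0
print(unsupported_tokens("contract runs ten months", "the contract runs nine months"))
```

Production answer-quality monitoring would add proper tokenization, ROUGE-L/BLEU variants, and entailment-based groundedness checks, but the shape (score each answer against its retrieved context and references, alert on regressions) stays the same.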