AI Prompt Engineer

AI Prompt Engineering Consultant

Technically Sharp | Systems-Minded | GenAI Focused

Design, optimize, and operationalize prompt-driven and agentic AI systems. Architect production-ready, high-impact LLM workflows that connect people, data, and intelligent systems.

THE ROLE

Prompting, Reasoning & Agentic Systems

  • Design, test, and optimize prompts for leading frontier models (GPT-4.x/5, Claude 3+, Gemini, LLaMA, DeepSeek, and emerging open-weight models).
  • Apply advanced prompting and reasoning techniques, including:
      • Chain-of-Thought, ReAct, Tree-of-Thoughts, Graph-of-Thoughts, Program-of-Thoughts
      • Self-reflection and critique loops
      • Debate prompting and multi-agent collaboration
  • Architect agentic workflows using frameworks such as AutoGen, CrewAI, LangGraph, and custom orchestration layers.
  • Build systems with tool calling, long-term and short-term memory, retrieval pipelines, and structured reasoning constraints.
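
As a minimal illustration of the techniques listed above, the sketch below assembles a Chain-of-Thought prompt and runs a self-critique (reflection) pass. `call_llm` is a hypothetical stand-in for any chat-completion client, not a specific API.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (e.g. an OpenAI or Anthropic client)."""
    return f"[model response to {len(prompt)} chars of prompt]"

def cot_prompt(question: str) -> str:
    # Ask the model to reason step by step before committing to an answer.
    return (
        "Answer the question below. Think step by step, then give a "
        "final answer on its own line prefixed with 'Answer:'.\n\n"
        f"Question: {question}"
    )

def reflect(question: str, draft: str) -> str:
    # Self-critique loop: feed the draft back for review and revision.
    return (
        "Review the draft answer for logical or factual errors, then "
        "produce a corrected final answer.\n\n"
        f"Question: {question}\nDraft: {draft}"
    )

draft = call_llm(cot_prompt("What is 17 * 24?"))
final = call_llm(reflect("What is 17 * 24?", draft))
```

In practice the reflection step can be repeated, or routed to a second model acting as a critic (the debate / multi-agent pattern).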

GenAI Application Engineering

  • Integrate LLMs into real-world applications using LangChain, LlamaIndex, Haystack, AutoGen, and OpenAI Assistants / Responses API patterns.
  • Design and implement high-performance Retrieval-Augmented Generation (RAG) pipelines, including:
      • Hybrid (keyword + vector) search
      • Reranking and embedding optimization
      • Chunking and document preprocessing strategies
      • Evaluation and regression testing harnesses
  • Develop APIs, microservices, and serverless GenAI workflows for scalable, secure deployment.
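
The hybrid search mentioned above can be sketched in a few lines: blend a lexical (keyword-overlap) score with a vector (cosine) score. The toy embeddings, scoring functions, and `alpha` weight are illustrative assumptions, not a fixed recipe.

```python
import math

def keyword_score(query: str, doc: str) -> float:
    # Fraction of query terms that appear in the document (lexical signal).
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors (semantic signal).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_rank(query, query_vec, docs, alpha=0.5):
    # alpha blends lexical and vector relevance; tune per corpus.
    scored = [
        (alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in docs
    ]
    return [text for score, text in sorted(scored, reverse=True)]

docs = [("cats purr loudly", [1.0, 0.0]), ("dogs bark at night", [0.0, 1.0])]
print(hybrid_rank("why do cats purr", [0.9, 0.1], docs)[0])  # top hit mentions cats
```

Production systems typically replace these toy scores with BM25 and a real embedding model, then pass the fused candidates to a reranker.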

ML / LLM Engineering & LLMOps

  • Work across AI/ML platforms such as Azure ML, AWS SageMaker, Vertex AI, Databricks, Modal, and Fly.io.
  • Deploy and manage vector databases and embedding stores, including Pinecone, Weaviate, Milvus, FAISS, ChromaDB, and pgvector.
  • Implement LLMOps / PromptOps practices using tools such as:
      • Weights & Biases, MLflow, LangSmith, Langfuse, PromptLayer, Humanloop, Helicone, Arize Phoenix
  • Benchmark, evaluate, and monitor LLM systems using RAGAS, DeepEval, custom eval suites, and human-in-the-loop review.
  • Leverage AI-native developer tools (GitHub Copilot, Cursor, Codeium, Aider, Windsurf) to accelerate iteration and experimentation.
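
The evaluation and regression-testing work above can be sketched as a small harness: run a fixed set of prompts through the system and check each response against an expected property. The `model` callable and checks are illustrative stand-ins.

```python
def run_regressions(model, cases):
    """cases: list of (prompt, check) pairs, where check(response) -> bool.
    Returns the prompts whose responses failed their check."""
    failures = []
    for prompt, check in cases:
        response = model(prompt)
        if not check(response):
            failures.append(prompt)
    return failures

# Toy model and checks for illustration only.
model = lambda p: p.upper()
cases = [
    ("refund policy", lambda r: "REFUND" in r),  # must mention refunds
    ("say hello", lambda r: len(r) < 200),       # response length budget
]
print(run_regressions(model, cases))  # [] means every case passed
```

Frameworks such as RAGAS or DeepEval follow the same shape, swapping the boolean checks for metrics like faithfulness and answer relevance.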

Deployment, Performance & Infrastructure

  • Containerize and deploy GenAI workloads using Docker, Kubernetes, Knative, and managed inference endpoints.
  • Optimize system performance with:
      • Caching, batching, routing, and fallback strategies
      • Quantization and distillation for efficient inference
      • Cost, latency, and reliability optimization
  • Design resilient, observable GenAI systems suitable for production environments.
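
Two of the optimizations listed above, caching and fallback routing, can be sketched together. `flaky` and `backup` are hypothetical callables standing in for real inference endpoints.

```python
import hashlib

cache: dict[str, str] = {}

def cached_call(prompt: str, models: list) -> str:
    """Return a cached answer if present; otherwise try each model in order."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in cache:
        return cache[key]  # cache hit: no model call at all
    last_err = None
    for model in models:  # fallback chain: cheapest/fastest endpoint first
        try:
            result = model(prompt)
            cache[key] = result
            return result
        except Exception as err:
            last_err = err  # endpoint failed; try the next one
    raise RuntimeError("all models failed") from last_err

def flaky(prompt):   # stand-in for an unreliable primary endpoint
    raise TimeoutError("primary timed out")

def backup(prompt):  # stand-in for a reliable fallback endpoint
    return f"ok: {prompt}"

print(cached_call("hello", [flaky, backup]))  # served by the backup endpoint
print(cached_call("hello", [flaky, backup]))  # second call is a cache hit
```

Production versions add TTLs, semantic (embedding-based) cache keys, and per-endpoint latency/cost-aware routing on top of this skeleton.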

EXPERIENCE

  • Strong Python engineering skills with hands-on experience across the modern GenAI ecosystem.
  • Deep understanding of LLM behavior, prompt optimization, embeddings, retrieval strategies, and data preparation workflows.
  • Practical experience with vector databases and semantic search systems.
  • Comfortable working in Linux environments with Bash/PowerShell, containers, and cloud infrastructure.
  • Strong communication skills, creativity, and a systems-thinking mindset.
  • Curious, adaptable, and motivated to stay ahead of rapid advances in GenAI and AI-native software development.

BENEFICIAL

  • Experience with PromptOps, LLM observability, and evaluation tooling.
  • Understanding of Responsible AI, safety, bias mitigation, governance, and compliance frameworks.
  • Background in Computer Science, AI/ML, Engineering, or a related technical discipline.
  • Experience deploying, fine-tuning, or serving open-source LLMs in production.

Staffworx is a UK-based Talent & Recruiting Partner supporting organisations across Digital Commerce, Software Engineering, and Value-Add Consulting sectors throughout the UK & EMEA.

Job Details

Company
Staffworx
Location
London, UK