AI Prompt Engineer
AI Prompt Engineering Consultant
Technically Sharp | Systems-Minded | GenAI-Focused
Design, optimize, and operationalize prompt-driven and agentic AI systems. Architect LLM-powered workflows that connect people, data, and intelligent systems in production-ready, high-impact deployments.
THE ROLE
Prompting, Reasoning & Agentic Systems
- Design, test, and optimize prompts for leading frontier models (GPT-4.x/5, Claude 3+, Gemini, LLaMA, DeepSeek, and emerging open-weight models).
- Apply advanced prompting and reasoning techniques, including:
  - Chain-of-Thought, ReAct, Tree-of-Thoughts, Graph-of-Thoughts, Program-of-Thoughts
  - Self-reflection and critique loops
  - Debate prompting and multi-agent collaboration
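The self-reflection pattern above can be sketched in a few lines: draft an answer with a step-by-step (Chain-of-Thought) prompt, then feed the draft back for critique and revision. This is a minimal illustration only; `call_model` is a hypothetical stand-in for any real LLM client, and the templates are illustrative.

```python
# Minimal sketch of Chain-of-Thought prompting with a self-critique loop.
# call_model is a hypothetical stub; swap in a real LLM client call.

COT_TEMPLATE = (
    "Question: {question}\n"
    "Let's think step by step, then state the final answer on the last line."
)

CRITIQUE_TEMPLATE = (
    "Here is a draft answer:\n{draft}\n"
    "Critique it for errors in reasoning, then output an improved final answer."
)

def call_model(prompt: str) -> str:
    # Stub: replace with a real client (OpenAI, Anthropic, etc.).
    return f"[model response to {len(prompt)}-char prompt]"

def answer_with_reflection(question: str, rounds: int = 1) -> str:
    """Draft with CoT, then run `rounds` critique-and-revise passes."""
    draft = call_model(COT_TEMPLATE.format(question=question))
    for _ in range(rounds):
        draft = call_model(CRITIQUE_TEMPLATE.format(draft=draft))
    return draft
```

The same loop generalizes to debate prompting by sampling multiple drafts and having a judge prompt pick or merge them.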
- Architect agentic workflows using frameworks such as AutoGen, CrewAI, LangGraph, and custom orchestration layers.
- Build systems with tool calling, long-term and short-term memory, retrieval pipelines, and structured reasoning constraints.
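At its core, tool calling is a dispatch problem: the model emits a structured request, and the orchestration layer routes it to a registered function. A minimal sketch, with illustrative tool names and a JSON wire format assumed for the example:

```python
# Sketch of a tool-calling dispatch layer: a parsed model tool request
# of the form {"name": ..., "args": {...}} is routed to a Python function.
import json

# Tool registry; names and functions are illustrative.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "add": lambda a, b: a + b,
}

def dispatch(tool_call_json: str):
    """Execute one tool call emitted by the model."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["args"])
```

Frameworks like AutoGen and LangGraph wrap this same pattern with schemas, retries, and memory; the registry-plus-dispatch core stays the same.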
GenAI Application Engineering
- Integrate LLMs into real-world applications using LangChain, LlamaIndex, Haystack, AutoGen, and OpenAI Assistants / Responses API patterns.
- Design and implement high-performance Retrieval-Augmented Generation (RAG) pipelines, including:
  - Hybrid (keyword + vector) search
  - Reranking and embedding optimization
  - Chunking and document preprocessing strategies
  - Evaluation and regression testing harnesses
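The hybrid-search component above boils down to score fusion: blend a lexical score with a vector-similarity score and rank by the weighted sum. The sketch below uses naive keyword overlap and toy embeddings purely to show the pattern; a production pipeline would use BM25 and a learned embedding model.

```python
# Sketch of hybrid retrieval: keyword overlap blended with cosine
# similarity. The scoring functions are deliberately simplistic.
import math

def keyword_score(query: str, doc: str) -> float:
    """Fraction of query terms present in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_rank(query, docs, embed, alpha=0.5):
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score."""
    qv = embed(query)
    scored = [
        (alpha * cosine(qv, embed(d)) + (1 - alpha) * keyword_score(query, d), d)
        for d in docs
    ]
    return [d for _, d in sorted(scored, reverse=True)]
```

Reranking then refines only the top-k of this list with a heavier cross-encoder, which keeps latency bounded.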
- Develop APIs, microservices, and serverless GenAI workflows for scalable, secure deployment.
ML / LLM Engineering & LLMOps
- Work across AI/ML platforms such as Azure ML, AWS SageMaker, Vertex AI, Databricks, Modal, and Fly.io.
- Deploy and manage vector databases and embedding stores, including Pinecone, Weaviate, Milvus, FAISS, ChromaDB, and pgvector.
- Implement LLMOps / PromptOps practices using tools such as:
  - Weights & Biases, MLflow, LangSmith, Langfuse, PromptLayer, Humanloop, Helicone, Arize Phoenix
- Benchmark, evaluate, and monitor LLM systems using RAGAS, DeepEval, custom eval suites, and human-in-the-loop review.
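The regression-testing side of evaluation can be as simple as a fixed case set scored on every prompt change. A minimal sketch with an exact-match metric (real suites such as RAGAS or DeepEval add semantic and faithfulness metrics on top of this loop):

```python
# Sketch of a prompt regression harness: run fixed cases through a model
# function and report the pass rate. `model` is any callable str -> str.

def run_eval(model, cases):
    """cases: list of (prompt, expected_answer). Returns exact-match rate."""
    passed = sum(
        1 for prompt, expected in cases
        if model(prompt).strip() == expected
    )
    return passed / len(cases)
```

Running this in CI against a pinned case set catches silent regressions when a prompt, model version, or retrieval setting changes.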
- Leverage AI-native developer tools (GitHub Copilot, Cursor, Codeium, Aider, Windsurf) to accelerate iteration and experimentation.
Deployment, Performance & Infrastructure
- Containerize and deploy GenAI workloads using Docker, Kubernetes, Knative, and managed inference endpoints.
- Optimize system performance with:
  - Caching, batching, routing, and fallback strategies
  - Quantization and distillation for efficient inference
  - Cost, latency, and reliability optimization
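Caching and fallback, the first optimization above, compose naturally: cache by prompt, and walk an ordered model chain on failure. A minimal sketch with stubbed model calls (the simulated outage and model names are illustrative):

```python
# Sketch of response caching plus a model fallback chain: try the primary
# model, fall back on timeout, and memoize results by prompt.
from functools import lru_cache

def call_primary(prompt: str) -> str:
    # Stub simulating an unavailable primary model.
    raise TimeoutError("primary model unavailable")

def call_fallback(prompt: str) -> str:
    # Stub for a cheaper/smaller fallback model.
    return f"fallback answer for: {prompt}"

@lru_cache(maxsize=1024)
def generate(prompt: str) -> str:
    """Try each model in order; cache whichever answer succeeds."""
    for call in (call_primary, call_fallback):
        try:
            return call(prompt)
        except TimeoutError:
            continue
    raise RuntimeError("all models failed")
```

In production the cache key would also include model and prompt-template versions, so cached answers are invalidated when either changes.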
- Design resilient, observable GenAI systems suitable for production environments.
EXPERIENCE
- Strong Python engineering skills with hands-on experience across the modern GenAI ecosystem.
- Deep understanding of LLM behavior, prompt optimization, embeddings, retrieval strategies, and data preparation workflows.
- Practical experience with vector databases and semantic search systems.
- Comfortable working in Linux environments with Bash/PowerShell, containers, and cloud infrastructure.
- Strong communication skills, creativity, and a systems-thinking mindset.
- Curious, adaptable, and motivated to stay ahead of rapid advances in GenAI and AI-native software development.
BENEFICIAL
- Experience with PromptOps, LLM observability, and evaluation tooling.
- Understanding of Responsible AI, safety, bias mitigation, governance, and compliance frameworks.
- Background in Computer Science, AI/ML, Engineering, or a related technical discipline.
- Experience deploying, fine-tuning, or serving open-source LLMs in production.
Staffworx is a UK-based Talent & Recruiting Partner supporting organisations across Digital Commerce, Software Engineering, and Value-Add Consulting sectors throughout the UK & EMEA.