AI Quality & Evaluation Manager (Contract)
Location: Hybrid working - Blackfriars, 3 days per week
Contract: 6 months, Outside IR35
Are you passionate about building the future of AI quality? Do you thrive in hands-on roles where you can shape frameworks from the ground up and make a real impact? We're looking for an experienced AI Quality & Evaluation Manager to join our team on a contract basis and lay the foundations for robust, reliable, and user-focused AI services across our business.
What You'll Do
- Design and implement a comprehensive AI testing and evaluation framework for all AI solutions, including LLM-based tools, RAG systems, and third-party platforms.
 - Define and document quality standards for semantic accuracy, factual consistency, bias, tone, and relevance.
 - Develop reusable testing templates, data sets, and evaluation methods that can be scaled and maintained by internal teams.
 - Run hands-on testing of AI prototypes and production tools to assess technical performance and business value.
 - Collaborate with business users to guide practical testing and feedback processes.
 - Deliver training and upskilling materials to empower internal staff to sustain the framework after your contract ends.
 - Support vendor evaluations and POC assessments with robust test protocols.
 - Establish baseline metrics and dashboards to measure ongoing AI quality and relevance.
 - Work closely with engineering and product leads to embed testing into delivery workflows.
 - Champion responsible AI practices to ensure fairness, transparency, and user trust.
 
- Strong hands-on experience in testing and evaluation of AI or software systems, ideally with NLP or LLM-based applications.
 - Understanding of prompt evaluation, semantic search, and LLM behaviour (accuracy, hallucination, bias, tone, etc.).
 - Familiarity with tools like Trulens, HumanLoop, PromptLayer, or similar; experience designing QA approaches for GenAI environments.
 - Knowledge of modern AI architectures (RAG pipelines, embeddings, API integrations such as OpenAI, Azure OpenAI, Anthropic).
 - Experience designing and implementing structured test regimes in fast-evolving contexts.
 - Excellent communication and facilitation skills, engaging both technical and business audiences.
 - Proven ability to create sustainable frameworks, documentation, and training materials.
 
- A builder who loves creating practical, scalable solutions.
 - Hands-on and analytical, balancing experimentation with process.
 - Collaborative and empathetic, bridging technical and non-technical teams.
 - User-focused, driven by delivering real value.
 - Committed to responsible AI, fairness, and transparency.
 
Ready to shape the future of AI quality with us? Apply now and help us ensure our AI-enabled services are accurate, consistent, and trusted by all.
Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy.
- Company: Lorien
- Location: London, South East, England, United Kingdom (Hybrid / WFH Options)
- Employment Type: Contractor
- Salary: Negotiable