Principal Data Scientist

You will be part of a team designing and building a GenAI virtual agent to support customers and employees across multiple channels. You will build and run LLM-powered agentic experiences, owning their design, orchestration, MLOps, and continuous improvement.

Design & build client-specific GenAI/LLM virtual agents
Enable the orchestration, management, and execution of AI-powered interactions through purpose-built AI agents
Design, build, and maintain robust LLM-powered processing workflows
Develop cutting-edge testing suites for bespoke LLM performance metrics
Craft context-aware, multi-channel self-service experiences
CI/CD for ML/LLM: automated build/train/validate/deploy pipelines for chatbots and agent services
IaC: Infrastructure as Code (Terraform/CloudFormation) to provision scalable cloud infrastructure for training and real-time inference
Observability: monitoring, drift detection, hallucination detection, SLOs, and alerting for model and service health
Serving at scale: containerised, auto-scaling (e.g., Kubernetes) with low-latency inference
Data & model versioning; maintain a central model registry with lineage and rollback
Workflow automation across the ML lifecycle (data ingestion → retraining → deployment)
Deliver a live performance dashboard (intent accuracy, latency, error rates) and a documented retraining strategy
Lead and foster creativity around frameworks/models; collaborate closely with product, engineering, and client stakeholders

Qualifications / Experience

Relevant primary degree, ideally with an MSc or PhD
Proven expertise in mathematics and classical ML algorithms, plus deep knowledge of LLMs (prompting, fine-tuning, RAG/tool use, evaluation)
Hands-on with AWS and Azure services for data/ML (e.g., Bedrock/SageMaker, Azure OpenAI/Azure ML)
Strong engineering: Python, APIs, containers, Git; CI/CD (GitHub Actions/Azure DevOps), IaC (Terraform/CloudFormation)
Scalable Serving Infrastructure: A containerised, auto-scaling environment (e.g., using Kubernetes) to serve the chatbot model with low latency
Workflow Automation: Automation of the end-to-end machine learning lifecycle, from data ingestion and preprocessing to model retraining and deployment
Live Performance Dashboard: A real-time dashboard displaying key model metrics such as intent accuracy, response latency, and error rates
Centralised Model Registry: A versioned repository for all trained models, their performance metrics, and associated training data
Documented Retraining Strategy: An automated workflow and documentation outlining the process for periodically retraining the model on new data
Experience with Kubernetes, inference optimisation, caching, vector stores, and model registries
Clear communication, stakeholder management, and a habit of writing crisp technical docs and runbooks

Personal Attributes

Personal Integrity, Stakeholder Management, Project Management, Agile Methodologies, Automation, Data Visualisation and Analysis.

Company: ISx4
Location: United Kingdom, UK
Posted: 4 days ago
