Senior Data Scientist

Senior Data Scientist (Full-Stack ML & LLMs)

Location: Hybrid - Remote / London (UK)

Reporting to: Technical Lead (Senior Data Scientist)

About Us

We are a high-growth, venture-backed health tech startup on a mission to transform how healthcare assessments are understood and used. Founded by health economics academics and backed by leading investors, we are building a novel, data-driven platform that will become the definitive source of intelligence for Health Technology Assessment (HTA) and Market Access (MA) reports globally.

This is a rare opportunity to get in on the ground floor and play a pivotal role in shaping the core technology and data foundation of our company.

The Role: Build, Deploy, and Innovate

We are looking for a Senior Data Scientist who is as passionate about building robust, scalable data systems as they are about innovating with cutting-edge machine learning. You will be the second data science hire, working hand-in-hand with our Technical Lead to architect and execute the data strategy that powers our entire product.

Your primary mission will be to design, build, and own the end-to-end ML pipelines that ingest, structure, and enrich our HTA/MA database. This involves everything from scraping complex web data to experimenting with and deploying Large Language Models (LLMs) for information extraction, all while building the frameworks for human-in-the-loop validation. You will have a direct and visible impact on our product and our business.

Key Responsibilities

  • End-to-End ML Pipeline Ownership: Take charge of the entire ML lifecycle, from conceptualising impactful applications to deployment and monitoring.
  • Data Ingestion & Engineering: Design, build, and maintain robust data ingestion pipelines, including custom web-scraping of agency websites (HTML, PDF) and leveraging APIs where available. Restructure and optimise our PostgreSQL database to support our evolving data schema.
  • LLM Experimentation & Deployment: Rapidly experiment with and implement LLM-based solutions (from prompt engineering with state-of-the-art APIs to fine-tuning open-source models) to extract, consolidate, and categorise key variables from complex text.
  • Human-in-the-Loop System Design: Design and implement the frameworks and tools that facilitate efficient validation and input from our in-house clinical experts, turning their domain knowledge into training data and model improvements.
  • Model Evaluation & "LLM-as-a-Judge": Develop rigorous, systematic evaluation frameworks to interrogate model performance, including the innovative use of LLMs to assist in prioritisation and automation of manual tasks.
  • Cloud Deployment & MLOps: Deploy models as scalable API endpoints on AWS. Evolve our current CI/CD (GitHub Actions) and MLOps practices to ensure reliability and reproducibility as we scale.
  • Collaboration & Influence: Work closely with the founders and non-technical team members to communicate complex ideas, secure buy-in for technical direction, and align on product roadmap priorities. Mentor and guide future junior hires.

Who You Are

We are looking for a pragmatic builder who thrives in ambiguity and is motivated by creating tangible value from unstructured data.

Must-Have Experience & Skills:

  • Proven experience as a Senior or Full-Stack Data Scientist, with a track record of taking ML projects from conception to deployment in a cloud environment (AWS preferred).
  • Strong proficiency in Python for data science (Pandas, NumPy, Scikit-learn) and SQL (PostgreSQL is a plus).
  • Hands-on experience with the full data lifecycle: data ingestion (e.g., web-scraping with BeautifulSoup, Scrapy, or Selenium), data wrangling, model development, and evaluation.
  • Demonstrable experience experimenting with and deploying applications using Large Language Models (e.g., via OpenAI, Anthropic, or open-source models via Hugging Face).
  • Experience building and maintaining data pipelines, and comfort with software engineering best practices (version control, CI/CD, testing).
  • Exceptional communication skills, with the ability to explain technical concepts to non-technical stakeholders and drive alignment.
  • A proactive, curious, and resilient mindset. You see ambiguous problems as opportunities.

Nice-to-Have Experience:

  • Experience designing or working with Human-in-the-Loop systems for data annotation or model validation.
  • Familiarity with MLOps tools and practices (Docker, Kubernetes, MLflow, etc.).
  • Previous experience in a startup or high-growth environment.
  • A background in healthcare, life sciences, or a related domain (helpful, but not required).

What We Offer

  • The opportunity to be a foundational technical pillar in a high-potential, venture-backed startup.
  • Significant autonomy and ownership to shape our technology stack, data strategy, and product direction.
  • Competitive salary and EMI scheme.
  • Collaborative and supportive environment with direct access to founders and domain experts.
  • Budget for learning, development, and conference attendance.

How to Apply

If you are a builder who is excited by the challenge of structuring the world's HTA data and pioneering the use of LLMs in healthcare, we would love to hear from you.

Please send your CV and a brief note explaining what motivates you about this role to Finlay McIntyre at F.Mcintyre@hiveoptimum.com

Application Deadline: November 7th, 2026

Company: HTA-Hive
Location: London, UK (Hybrid / WFH options)