Research Scientist (LLM/RL)

Frontier AI | London or Paris

Compensation: Up to £250k + equity package.

About

This is a well-funded frontier AI startup building state-of-the-art agentic systems that automate complex, multi-step tasks normally done by humans.

The team combines deep research in computer use and proprietary models with forward-deployed implementation alongside enterprise clients.

The Models team builds the core LLMs and vision-language models behind these agentic systems. The focus is on training models that work well for agents in practice: strong instruction following, reliable tool use, and good decision-making at a given inference cost.

What you'll do

  • Research post-training methods for large multimodal language models, with a focus on RL and feedback-driven learning
  • Design reward models and large-scale reinforcement learning setups for instruction following and tool use
  • Build automated data collection pipelines using human and machine feedback
  • Develop evaluations that capture real capability gains (not just benchmark improvements)
  • Translate concrete product failures and use cases into new training signals

What you'll need

  • Strong research background and hands-on experience in LLM post-training, alignment, or reinforcement learning
  • Proficiency in Python and at least one major DL framework (PyTorch, JAX, or TensorFlow)
  • Experience training large models on distributed systems
  • Publications at top-tier conferences (NeurIPS, ICML, ICLR, ACL, CVPR, etc.)
  • Comfort working on fast-moving, loosely specified research problems

Shortlisted candidates will be contacted within 48 hours.

Job Details

Company: Axiōma Search
Location: London Area, United Kingdom