Research Scientist (LLM/RL)
Frontier AI | London or Paris
Compensation: Up to £250k + equity package.
About
This is a well-funded frontier AI startup building state-of-the-art agentic systems that automate complex, multi-step tasks normally done by humans.
The team combines deep research (in computer use and proprietary models) with forward-deployed implementation alongside enterprise clients.
The Models team builds the core LLMs and vision-language models behind these agentic systems. The focus is on training models that work well for agents in practice: strong instruction following, reliable tool use, and good decision-making at a given inference cost.
What you'll do
- Research post-training methods for large multimodal language models, with a focus on RL and feedback-driven learning
- Design reward models and large-scale reinforcement learning setups for instruction following and tool use
- Build automated data collection pipelines using human and machine feedback
- Develop evaluations that capture real capability gains (not just benchmark improvements)
- Translate concrete product failures and use cases into new training signals
What you'll need
- Strong research background and hands-on experience in LLM post-training, alignment, or reinforcement learning
- Proficiency in Python and at least one major DL framework (PyTorch, JAX, or TensorFlow)
- Experience training large models on distributed systems
- Publications at top-tier conferences (NeurIPS, ICML, ICLR, ACL, CVPR, etc.)
- Comfortable working on fast-moving, loosely specified research problems
Shortlisted candidates will be contacted within 48 hours.