4 of 4 Remote/Hybrid Reinforcement Learning Jobs in London

Artificial Intelligence Engineer

Hiring Organisation
WorkGenius Group
Location
City of London, London, United Kingdom
Role: Full-time (Permanent Role). We are building a world-class AI research team focused on advancing next-generation agentic systems and intent-aware learning architectures. Our mission is to bridge cutting-edge research in large language models, reinforcement learning, and alignment with scalable, real-world production … systems. You will operate at the intersection of research and product, shaping foundational capabilities in intent understanding, agent learning, and model alignment across distributed AI environments. This is an opportunity to influence AI systems deployed at global scale across diverse compute environments, including edge and cloud. Responsibilities: Define Research ...

Artificial Intelligence Researcher

Hiring Organisation
microTECH Global LTD
Location
City of London, London, United Kingdom
permanent position with candidates required to do hybrid working in either Cambridge or London. Our client are looking for AI Researchers specialising in Reinforcement Learning from Human Feedback (RLHF) and Generative AI. In this role, you will design and optimise the algorithms that align large-scale generative models … build the next generation of foundation models. Responsibilities: Develop and refine RLHF algorithms for large language and generative models. Research and implement deep reinforcement learning methods (policy gradients, actor-critic, off-policy learning) for model alignment. Train, fine-tune, and evaluate LLMs and diffusion models at scale. ...

Senior Data Scientist

Hiring Organisation
Anson Mccade
Location
London, United Kingdom
Employment Type
Permanent
Responsibilities: End-to-End Delivery: Lead the technical execution of AI projects, from initial problem discovery and hypothesis testing to deploying production-grade machine learning models. Strategic Advisory: Act as a "translator" between technical complexity and business value. You will work closely with C-suite stakeholders to identify … solve their most pressing strategic challenges. Technical Leadership: Architect robust, scalable data pipelines and state-of-the-art models (including LLMs, Reinforcement Learning, or Bayesian Inference) tailored to specific client needs. Mentorship: Guide and upskill junior Data Scientists, fostering a culture of rigorous peer review, clean coding standards ...

Data Scientist - Inside IR35 - Hybrid

Hiring Organisation
Halian Technology Limited
Location
Croydon, Surrey, South East, United Kingdom
Employment Type
Contract
Role: We are recruiting on behalf of a mobility technology business building intelligent fleet orchestration systems. This role suits an experienced Applied Machine Learning Engineer or Data Scientist comfortable working with messy real-world data, operational constraints, and production systems. You'll join a small, high-calibre team solving complex … years; Geospatial data experience (H3, GeoPandas, PostGIS or similar); Optimisation/operations research exposure; Logistics/mobility/marketplace domain experience. Nice to Have: Reinforcement learning; Simulation modelling; Experience deploying models into cloud environments; Experimentation frameworks (A/B testing, model validation at scale). How to Apply ...