126 to 150 of 190 Reinforcement Learning Jobs in London

Research Scientist

City of London, London, United Kingdom

Adamas Knight

team. This lab is composed of researchers from elite institutions and industry labs including OpenAI, Google DeepMind, and Microsoft AI, and is focused on advancing the SOTA in LLMs, reinforcement learning, and deep learning for complex systems. The team is building the models with the goal of powering the next generation of AI supercomputing systems. The Role … As a Research Scientist, you’ll be at the heart of groundbreaking research, working on core problems in deep learning, generative modelling, and RL. With larger compute resources per capita than any other tech company, you’ll also have the opportunity to conduct and publish your research in top-tier conferences, such as NeurIPS, ICML, and ICLR, etc. and … collaborate with world-renowned researchers across multiple DL disciplines. You might be a fit if you have... PhD in Machine Learning, Computer Science, Mathematics or related field; Strong track-record in research in Big Tech; Deep technical expertise in training, fine-tuning, or scaling deep learning models; Experience developing models in language modelling, reinforcement learning, or

Posted: 2 days ago

Research Scientist

London Area, United Kingdom

Adamas Knight

Posted: 2 days ago

Research Scientist

London, England, United Kingdom

Adamas Knight

Posted: 4 days ago

Head of Applied AI - Robotics

London Area, United Kingdom
Hybrid / WFH Options

Acquired Talent Ltd

opening up possibilities to build far richer, more capable intelligent behaviours. This is a unique opportunity to help build a dedicated AI function focused on embedding cutting-edge multimodal learning models into next-gen robotic platforms. The role will involve defining the long-term roadmap for how intelligence is deployed across the stack, from perception through to action—combining … unified, responsive behaviours. To be successful in this role, you’ll need to bring: Strong technical leadership across applied AI/ML, with deep hands-on experience in robotic learning or embodied intelligence A solid background in multimodal model development—especially in areas that combine computer vision, language understanding, and interactive learning (LLM, VLM or VLA) Real-world … deployment experience of learning-based systems, ideally within robotic or physical environments (embodied systems (AI) & reinforcement learning) Comfort collaborating across functions, especially with engineering, hardware, and system design teams Solid programming and prototyping skills using modern deep learning frameworks (e.g. PyTorch, TensorFlow, JAX) Location: London (hybrid, with 4 days onsite and flexible hours) This is a

Posted: 3 days ago

Head of Applied AI - Robotics

City of London, London, United Kingdom
Hybrid / WFH Options

Acquired Talent Ltd

Posted: 3 days ago

Internship: ML Engineer & Research DeepFlow, London (LLMs, Agentic Systems, Multi-Agent Collaboration)

London Area, United Kingdom
Hybrid / WFH Options

DeepFlow

/human teams, transforming complex coordination into a significant human advantage. This endeavour involves cutting-edge research and innovative integration across several highly active fields of AI and machine learning, including high-capacity multi-modal foundation models, reinforcement learning, and multi-agent collaboration frameworks. Internship Opportunity - Join the ML Team at DeepFlow We are looking for a … work Ability to work independently and in a team Clear communication at meetings Desired skills: Experience in building/using AI/ML methods (e.g. LLMs, multi-agent systems, reinforcement learning, agents SDKs) Knowledge of agentic systems, reasoning models, tool use, etc. Experience contributing to empirical AI research projects (e.g. demonstrated via publications at top AI/ML

Posted: 2 days ago

Internship: ML Engineer & Research DeepFlow, London (LLMs, Agentic Systems, Multi-Agent Collaboration)

City of London, London, United Kingdom
Hybrid / WFH Options

DeepFlow

Posted: 2 days ago

All you need to know about becoming a Generative AI Engineer

London, England, United Kingdom

Techloy, Inc

the demand for Generative AI Engineers is skyrocketing. A Generative AI Engineer specializes in building and fine-tuning AI models that generate new content. These professionals work with machine learning (ML), deep learning (DL), and neural networks to develop AI systems capable of producing human-like text, realistic images, and even synthetic voices. Unlike traditional AI engineers who … massive datasets to enhance their ability to produce realistic content. Optimizing AI algorithms for efficiency, accuracy, and ethical AI development. Working with Natural Language Processing (NLP), Computer Vision, and Reinforcement Learning to advance AI capabilities. Collaborating with cross-functional teams, including data scientists, software engineers, and product teams, to integrate AI models into applications.

Posted: Today

Member of Technical Staff

London, United Kingdom

Microsoft

design for AI, prompt engineering methodologies, and AI systems design. Demonstrated experience in one or more of the following areas: prompt engineering, experimental design, language model evaluations, fine tuning, reinforcement learning/direct preference optimization, data curation, and classic machine learning principles. Required/Minimum Qualifications Bachelor's Degree in Computer Science, or related technical discipline AND … open source contributions, and/or on-the-job work experience. Deeper expertise in one or more parts of the AI stack, including prompt engineering, pre-training, fine-tuning, reinforcement learning and direct preference optimization, data curation, LLM inference, orchestration, evaluation pipelines, and deployment. Additional or Preferred Qualifications Bachelor's/Master's Degree in Computer Science or … AI and its deployment. Demonstrated written and verbal communication skills with the ability to work closely with cross-functional teams, including product managers, designers, and other engineers. Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in AI. Proven ability to collaborate and contribute to a positive, inclusive work environment

Employment Type: Permanent

Salary: GBP Annual

Posted: 4 days ago

Robotic Manipulation Engineer

London, United Kingdom

Lodestar

Are you excited by the challenge of teaching robots how to understand and manipulate the physical world? Do you want to build real systems that use machine learning and perception to grasp and interact in unstructured and unknown environments? Do you thrive in fast-paced, multidisciplinary teams solving hard problems? If so, you'll fit right in at Lodestar … space infrastructure, and space domain defence. Our robotic systems are designed to perform autonomous manipulation in the harshest and most complex environments imaginable. We're looking for a Machine Learning Robotic Manipulation Engineer to lead development of the core ML-based grasping stack for our robotic capture system. In this role, you'll research, design, and implement models and … flight. If you're passionate about robotics and want to see your work in action, this is the role for you. What You'll Do Lead development of machine learning systems for real world grasping and manipulation of unknown objects in space. Design and train models for arbitrary grasp candidate prediction and real-time grasp quality evaluation. Build high

Employment Type: Permanent

Salary: GBP Annual

Posted: 2 days ago

Software Engineer, Inference Scalability and Capability

London, United Kingdom
Hybrid / WFH Options

Menlo Ventures

bias towards flexibility and impact Pick up slack, even if it goes outside your job description Enjoy pair programming (we love to pair!) Want to learn more about machine learning research Care about the societal impacts of your work Strong candidates may also have experience with: Implementing and deploying machine learning systems at scale LLM optimization batching and … many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences. Come work with us! Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation … train more aligned (helpful, honest, and harmless) models and does "alignment science" to understand how alignment techniques work and try to extrapolate to uncover and address new failure modes. Reinforcement Learning - Reinforcement Learning is used by a variety of different teams, both for alignment and to teach models to be more capable at specific tasks. Platform

Employment Type: Permanent

Salary: GBP Annual

Posted: 15 days ago

Research Scientist - Financial Inclusion at Scale

London, United Kingdom
Hybrid / WFH Options

M-KOPA

to customer behaviour modeling Publish research at top conferences (NeurIPS, ICML, ICLR, AAAI etc) while solving real-world problems Work with teams across Africa and Europe to deploy Machine Learning solutions that expand financial access What You Need Hands-on experience or Academic Publication in generative models (VAEs, GANs, diffusion models, or related methods) Advanced expertise in mathematics, statistics … and theoretical computer science Experience with deep learning, reinforcement learning, or specialised Machine Learning domains Experience with structured or unstructured data Proven ability to translate research into scalable algorithms Ideally a publication record in Machine Learning research (NeurIPS, ICML, ICLR, AAAI etc preferred) Our Mission We create financial inclusion for the traditionally excluded through a

Employment Type: Permanent

Salary: GBP Annual

Posted: 7 days ago

Founding AI Agent Engineer

London, England, United Kingdom

Eloquent AI (YC X25)

the ability to work closely with customers to refine AI solutions. Bonus Points If... You have experience with prompt engineering, parameter-efficient fine-tuning (PEFT), retrieval-augmented generation (RAG), reinforcement learning for LLMs. You have experience with frontend or backend development (React, Node.js, NestJS) to enhance AI-driven applications. You have a background in information retrieval or recommender

Posted: Yesterday

Bio AI Research Scientist

London, United Kingdom

Prima Mente

the freedom to propose and implement state-of-the-art infrastructure solutions. Exceptional Team: Collaborate with talented colleagues from diverse backgrounds across ML, bioinformatics, and engineering. Growth Opportunities: Continuous learning and growth opportunities in a rapidly advancing technical field. Culture Insight What we are doing is extremely hard. Prima Mente is for great people. We are team players who … PyTorch, JAX, TensorFlow) and familiarity with scalable training frameworks. Experience managing large-scale distributed training across GPUs/TPUs. Background in generative AI, large language models, mechanistic interpretability, or reinforcement learning. Interview Process You will interact with co-founders Ravi and Hannah, before meeting the technical team across a deep dive on your CV and then a round of

Employment Type: Permanent

Salary: GBP Annual

Posted: 3 hours ago

Software Engineer (React / Typescript / Python)

City of London, London, United Kingdom
Hybrid / WFH Options

TalentCo

although proximity to London would be advantageous. As Software Engineer you will: Build AI-first features: Design and ship cutting-edge platform features using personalisation, LLMs, recommender systems, and reinforcement learning. Work across the stack: Own features end-to-end with React (TypeScript) and Python (FastAPI), blending UX and ML into production-ready code. Move fast with the latest

Posted: 2 days ago

Software Engineer (React / Typescript / Python)

London Area, United Kingdom
Hybrid / WFH Options

TalentCo

Posted: 2 days ago

Research Engineer – Learning-Based Animation & Simulation

London, England, United Kingdom
Hybrid / WFH Options

Search Technology

Location: London (hybrid strongly preferred); remote in EU possible for exceptional candidates Type: Full-time Seniority: Open from postdoctoral level to senior/staff level About the Position Our client is reimagining how virtual agents move, behave, and … interact. By combining real-time simulation, reinforcement learning, and generative models, they’re building a new foundation for intelligent animation in digital worlds; from characters to physically grounded agents. This isn’t just an ML research role - you’ll be implementing novel algorithms, working directly in simulation, and tuning systems that will run in production pipelines. We’re … from papers, then go further and make them work Work closely with a small, tight-knit team of engineers and researchers shipping iterative prototypes Experience Required: Proven experience in reinforcement learning for control, robotics, or animation Strong coding ability in Python and comfort with performance-aware development Familiarity with simulation environments (Isaac Sim/Lab, MuJoCo, Unity, PyBullet

Posted: Today

Senior Machine Learning Engineer

London, England, United Kingdom

Wayve

paced environment big problems ignite us—we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future. At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment … we back each other to deliver impact. Make Wayve the experience that defines your career! About the Role Our team is seeking a talented Machine Learning Engineer to propel our ambitious research forward. We're not just another team; we're a dynamic blend of Applied Scientists, Machine Learning Engineers, and Software Engineers united together to apply state … of the art research to the road. From pioneering advancements in Offline Reinforcement Learning (RL) and Reward Learning from Human Feedback (RLHF) to developing groundbreaking, large-scale, embodied Foundation Models, our projects are designed to dramatically enhance our product's capabilities. But it's not just about what we do—it's how we do it. We

Posted: Yesterday

Senior Technology Expert of AI Models

London, England, United Kingdom

microTECH Global Limited

research experience • Over 3 years of experience in training, inference, data science research, or engineering in large models, intelligent agents, or related fields, with a solid foundation in machine learning, artificial intelligence, and reinforcement learning. • Extensive and deep knowledge of cutting-edge technologies in large models, with strong learning abilities to continuously monitor new trends and solutions. … You should also have a keen sense for commercial applications. Desired: • Experience in foundational natural language processing, reinforcement learning, or other interdisciplinary applications and research is a plus. • Strong logical thinking and excellent skills in analysis, abstraction, synthesis, and communication. • At least one representative work achievement that demonstrates your expertise in the field.

Posted: 2 days ago

Research Engineer / Research Scientist, Multimodal

London, United Kingdom
Hybrid / WFH Options

Menlo Ventures

develop new architectures for modeling multimodal data and study how they interact with text-only models at scale. Building Infrastructure We work on many infrastructure projects including: Complex multimodal reinforcement learning environments. High-performance RPC servers for processing image inputs. Sandboxing infrastructure for securely collecting data. Data Ingestion We are more interested in running simple experiments at large … bias towards flexibility and impact Pick up slack, even if it goes outside your job description Enjoy pair programming (we love to pair!) Want to learn more about machine learning research Care about the societal impacts of your work Strong candidates may also have experience with: High performance, large-scale ML systems GPUs, Kubernetes, PyTorch, or OS internals Language … modeling with transformers Reinforcement learning Large-scale ETL The expected salary range for this position is: Annual Salary: £250,000-£270,000 GBP Logistics Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience. Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at

Employment Type: Permanent

Salary: GBP Annual

Posted: 15 days ago

Senior AI Engineer

London, England, United Kingdom
Hybrid / WFH Options

GSMA

and contribute to algorithm development. Dr G.A.McHale, Technical Director, AI & Data Science About the Team The team is led by someone with significant AI experience in bio-inspired architectures, reinforcement learning, expert systems, scheduling, meta-heuristics, robotics and natural language processing (including LLMs). We have recruited an experienced scientific computing developer with a strong mathematics background in … and industry-changing projects and a stimulating and dynamic environment designed to enable you to flourish. In addition to architect-designed offices and competitive compensation, our benefits include fantastic learning & development opportunities, generous holiday allowances, four additional days off for professional development and many others. To learn more about the GSMA, visit our career site , our LinkedIn page and

Posted: Today

Senior Robotics control Engineer Locomotion

London, England, United Kingdom

ZipRecruiter

robust controllers for walking, balancing while manipulating, fall recovery, and other advanced mobility tasks. The ideal candidate will have strong expertise in classic locomotion pipeline, whole-body control and reinforcement learning. Key Responsibilities: Design and implement control algorithms (classic or RL-based policies) for locomotion tasks, including walking, balancing while manipulating, squatting, stair climbing, fall recovery, and other dynamic … and implementation of control systems for biped robots, focusing on locomotion. Proficiency with model predictive control (MPC), optimal control, and feedback control loops in dynamic robotic systems. Expertise in reinforcement learning for robotics. Deep understanding of humanoid robot dynamics and balance control. Strong experience with hardware-in-the-loop testing and deployment on physical humanoid robots. Strong hands … platforms such as MuJoCo, Isaac Sim or similar environments. Proficiency in Python and C++ for algorithm development, testing, and deployment. Knowledge of advanced topics like model-free RL, imitation learning, or hybrid control systems that combine classic and modern methods. Familiarity with real-time control systems and integration with hardware, including actuators and sensors. Expertise in sensor fusion for

Posted: Today

Research Scientist - FAIR

London, England, United Kingdom

Senior AI Engineer

London, England, United Kingdom
Hybrid / WFH Options

GSMA

Posted: Yesterday

Research Engineer, Knowledge Team

London, United Kingdom
Hybrid / WFH Options

Menlo Ventures

new architectures for how information is organized, and train language models to optimally use those architectures. Responsibilities: Designing and implementing from scratch new information architecture strategies Performing finetuning and reinforcement learning to teach language models how to interact with new information architectures Building "hard" knowledge base eval sets to help identify failure modes of how language models work … may be a good fit if you: Are a very experienced Python programmer who can quickly produce reliable, high quality code that your teammates love using Have good machine learning research experience Have experience developing software that utilizes Large Language Models such as Claude Are results-oriented, with a bias towards flexibility and impact Pick up slack, even if … many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences. Come work with us! Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation

Employment Type: Permanent

Salary: GBP Annual

Posted: 15 days ago

Salary Guide

Reinforcement Learning
London

10th Percentile: £92,500
25th Percentile: £125,000
Median: £150,000
75th Percentile: £175,000
