Remote Reinforcement Learning Jobs in the UK

1 to 25 of 192 Remote Reinforcement Learning Jobs in the UK

Applied Scientist New London

London, England, United Kingdom
Hybrid / WFH Options
Wayve Technologies Ltd
us—we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future. At Wayve, your contributions matter. We value diversity, embrace new perspectives, and … to autonomous driving or similar robotics or decision making domain, inclusive, but not limited to the following specific areas: Model-free and model-based reinforcement learning Offline reinforcement learning Planning with learned models, model predictive control and tree search Imitation learning, inverse reinforcement learning … of real-world driving data How to architect our models to best employ the latest advances in foundation models, transformers, world models, etc. Which learning algorithms to use (e.g. reinforcement learning, behavioural cloning) How to leverage simulation for controlled experimental insight, training data augmentation, and re-simulation More ❯

Research Engineer, Machine Learning (Horizons)

London, England, United Kingdom
Hybrid / WFH Options
Anthropic
Research Engineer, Machine Learning (Horizons) London, UK About Anthropic Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We aim for AI to be safe and beneficial for users and society. Our team is a growing group of researchers, engineers, policy experts, and business leaders working … together to build beneficial AI systems. About the role: As a Research Engineer on the Reinforcement Learning Fundamentals team, you will collaborate with researchers and engineers to advance the capabilities and safety of large language models through fundamental research in reinforcement learning, enhancing reasoning abilities in … areas like code generation and mathematics, and exploring reinforcement learning for agentic/open-ended tasks. Representative projects: Develop and implement novel reinforcement learning techniques to improve the performance and safety of large language models. Create tools and environments for models to interact with, enabling complex More ❯

Research Engineer, Machine Learning (Horizons)

London, England, United Kingdom
Hybrid / WFH Options
Anthropic
Research Engineer, Machine Learning (Horizons) London, UK About Anthropic Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers … policy experts, and business leaders working together to build beneficial AI systems. About the role: As a Research Engineer on the Reinforcement Learning Fundamentals team, you will collaborate with a diverse group of researchers and engineers to advance the capabilities and safety of large language models through fundamental … research in reinforcement learning, improving reasoning abilities in areas such as code generation and mathematics, and exploring reinforcement learning for agentic/open-ended tasks. Representative projects: Develop and implement novel reinforcement learning techniques to improve the performance and safety of large language models. More ❯

Research Engineer, Machine Learning (Horizons) London, UK

London, England, United Kingdom
Hybrid / WFH Options
Alcides Fonseca
Research Engineer, Machine Learning (Horizons) London, UK About Anthropic Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers … policy experts, and business leaders working together to build beneficial AI systems. About the role: As a Research Engineer on the Reinforcement Learning Fundamentals team, you will collaborate with a diverse group of researchers and engineers to advance the capabilities and safety of large language models through fundamental … research in reinforcement learning, improving reasoning abilities in areas such as code generation and mathematics, and exploring reinforcement learning for agentic/open-ended tasks. Representative projects: Develop and implement novel reinforcement learning techniques to improve the performance and safety of large language models. More ❯

Machine Learning Engineer

London, United Kingdom
Hybrid / WFH Options
InstaDeep Ltd
AI revolution! About DeepPCB: DeepPCB is InstaDeep's AI-powered Place & Route PCB (Printed Circuit Board) design tool. We use a combination of deep reinforcement learning and high-performance computing to automate and scale PCB place-and-route workflows, accelerating hardware innovation globally. We are looking for a … Machine Learning Engineer to join the DeepPCB team and help push the boundaries of AI for electronic design automation (EDA). You will develop, optimize, and deploy cutting-edge machine learning and reinforcement learning models focused on automating complex PCB design problems, working closely with researchers … and engineers to bring ideas to life. Responsibilities: Develop scalable and efficient machine learning algorithms to tackle PCB place-and-route challenges. Adapt and optimize ML models for large-scale distributed computing environments (e.g., GPUs, multi-node clusters). Build, test, and deploy robust production-level ML systems integrated More ❯
Employment Type: Permanent
Salary: GBP Annual

Research Scientist / Research Engineer, Pre-training

London, England, United Kingdom
Hybrid / WFH Options
Anthropic
Contribute to the entire stack, from low-level optimizations to high-level model design Qualifications: Advanced degree (MS or PhD) in Computer Science, Machine Learning, or a related field Strong software engineering skills with a proven track record of building complex systems Expertise in Python and experience with deep … learning frameworks (PyTorch preferred) Familiarity with large-scale machine learning, particularly in the context of language models Ability to balance research goals with practical engineering constraints Strong problem-solving skills and a results-oriented mindset Excellent communication skills and ability to work in a collaborative environment Care about … Work on high-performance, large-scale ML systems Familiarity with GPUs, Kubernetes, and OS internals Experience with language modeling using transformer architectures Knowledge of reinforcement learning techniques Background in large-scale ETL processes You'll thrive in this role if you: Have significant software engineering experience Are results More ❯

ML Research Engineer

London, England, United Kingdom
Hybrid / WFH Options
JR United Kingdom
join our dynamic team. In this role, you will be responsible for building and optimizing complex simulation environments to facilitate the training of machine learning models. The ideal candidate will have a strong background in programming, modelling and machine learning, with optional expertise in reinforcement learning. About … Aeris Aeris-UK is an applied AI company working on real-world problems that require creativity, rigour and solid engineering. We build machine learning systems that are efficient, understandable and ready to operate in the complexity of real environments, whether that involves supporting infrastructure resilience, enabling autonomous decision-making … help people reason under uncertainty. Our projects are practical in focus but intellectually demanding, drawing on ideas from simulation, human-AI interaction, multi-agent learning and model-based reasoning. We are a small team with a strong research culture and a shared interest in solving meaningful and challenging problems. More ❯

ML Research Engineer

Slough, England, United Kingdom
Hybrid / WFH Options
JR United Kingdom
join our dynamic team. In this role, you will be responsible for building and optimizing complex simulation environments to facilitate the training of machine learning models. The ideal candidate will have a strong background in programming, modelling and machine learning, with optional expertise in reinforcement learning. About … Aeris Aeris-UK is an applied AI company working on real-world problems that require creativity, rigour and solid engineering. We build machine learning systems that are efficient, understandable and ready to operate in the complexity of real environments, whether that involves supporting infrastructure resilience, enabling autonomous decision-making … help people reason under uncertainty. Our projects are practical in focus but intellectually demanding, drawing on ideas from simulation, human-AI interaction, multi-agent learning and model-based reasoning. We are a small team with a strong research culture and a shared interest in solving meaningful and challenging problems. More ❯

Lead Applied Scientist, TinyML London

London, England, United Kingdom
Hybrid / WFH Options
Wayve Technologies Ltd
will directly impact the future of autonomous vehicles, enhancing their adaptability, reliability, and efficiency through innovative approaches such as model-free and model-based reinforcement learning, efficient vision-language models, and more. In this role, you will be at the forefront of designing and optimizing foundation models that … optimize ultra-efficient foundation models specifically tailored for autonomous systems and embodied AI. Develop and refine techniques such as model-free and model-based reinforcement learning, and efficient vision-language models to improve the adaptability, reliability, and efficiency of autonomous systems. Collaborate with world-class researchers and engineers … autonomous technologies. Essential 7+ years of ML engineering/applied science experience in an industrial research environment Experience in GenAI, EfficientAI, LLMs, World Models, Reinforcement Learning, or Autonomous Driving Passion for working in a team on research ideas that have real-world impact Strong programming skills in Python More ❯

Machine Learning Engineer - AI for Grid Innovation & Energy Transition (Energy Sector Experienc[...]

Stafford, England, United Kingdom
Hybrid / WFH Options
Energy Job Search
Machine Learning Engineer - AI for Grid Innovation & Energy Transition (Energy Sector Experience Required) Job Description Summary GE … life. Are you excited at the opportunity to electrify and decarbonize the world? We are looking for a passionate, creative, and results-oriented Machine Learning (ML) Engineer with substantial experience in the energy, smart infrastructure, or industrial automation sectors to join our AI & Grid Innovation team. In this role More ❯

Deep Learning Researcher

London, England, United Kingdom
Hybrid / WFH Options
MediaTek
seeking a highly motivated and talented Research Scientist to join our AI research team. The ideal candidate will have a strong background in machine learning, artificial intelligence, computer science, mathematics, or physics, and a proven track record of research excellence. As a Research Scientist, you will work on cutting … research that supports both our applications and the broader scientific community. Current areas of interest include large language models (LLMs), optimization methods for deep learning, reinforcement learning (RL), and generative models. Responsibilities • Conduct innovative research in machine learning and artificial intelligence • Develop and implement algorithms • Collaborate … top-tier conferences and journals • Stay up to date with the latest advancements in AI and related fields Requirement Responsibilities • Conduct innovative research in machine learning and artificial intelligence • Develop and implement algorithms • Collaborate with cross-functional teams to integrate AI solutions • Publish research findings in top-tier conferences and More ❯

Robotics Control Engineer - Locomotion

London, England, United Kingdom
Hybrid / WFH Options
ZipRecruiter
the frontier of what legged machines can do. As part of this growth, they’re hiring Robotics Control Engineers with deep expertise in locomotion, reinforcement learning, and dynamic control systems to join their R&D headquarters. The Role: You’ll design and implement locomotion control policies — from walking … and stair climbing to fall recovery and manipulation-balanced motion. You’ll work at the intersection of classical control theory and reinforcement learning, deploying your work on humanoid platforms in the wild. Key Details: Location: Hybrid or Onsite – US or EU HQs Salary: Highly competitive + equity + … Mechatronics, or similar 2+ years experience in control systems for biped or humanoid robots Strong understanding of: Model Predictive Control (MPC), optimal & feedback control Reinforcement learning in physical systems Humanoid dynamics, balance control, and full-body coordination Proficiency in Python and C++ for real-time algorithm development Experience More ❯

AI Research Residency

London Area, United Kingdom
Hybrid / WFH Options
MediaTek
with experienced researchers and engineers on innovative AI projects Research Focus: Engage in areas such as large language models (LLMs), optimization methods for deep learning, reinforcement learning (RL), and generative models Professional Development: Gain hands-on experience in AI research with real-world applications, contributing to both … the broader scientific community Work Arrangement: Benefit from a hybrid work model, combining remote and on-site collaboration Responsibilities Conduct innovative research in machine learning and artificial intelligence Develop and implement algorithms Collaborate with cross-functional teams to integrate AI solutions Publish research findings in top-tier conferences and … journals Stay up to date with the latest advancements in AI and related fields Qualifications Required: PhD in Machine Learning, Artificial Intelligence, Mathematics, Computer Science, Physics, or a related field Proficiency in programming languages such as Python, C++, or similar Strong problem-solving skills and the ability to work More ❯

AI Research Residency

City of London, London, United Kingdom
Hybrid / WFH Options
MediaTek
with experienced researchers and engineers on innovative AI projects Research Focus: Engage in areas such as large language models (LLMs), optimization methods for deep learning, reinforcement learning (RL), and generative models Professional Development: Gain hands-on experience in AI research with real-world applications, contributing to both … the broader scientific community Work Arrangement: Benefit from a hybrid work model, combining remote and on-site collaboration Responsibilities Conduct innovative research in machine learning and artificial intelligence Develop and implement algorithms Collaborate with cross-functional teams to integrate AI solutions Publish research findings in top-tier conferences and … journals Stay up to date with the latest advancements in AI and related fields Qualifications Required: PhD in Machine Learning, Artificial Intelligence, Mathematics, Computer Science, Physics, or a related field Proficiency in programming languages such as Python, C++, or similar Strong problem-solving skills and the ability to work More ❯

AI Research Residency

South East London, England, United Kingdom
Hybrid / WFH Options
MediaTek
with experienced researchers and engineers on innovative AI projects Research Focus: Engage in areas such as large language models (LLMs), optimization methods for deep learning, reinforcement learning (RL), and generative models Professional Development: Gain hands-on experience in AI research with real-world applications, contributing to both … the broader scientific community Work Arrangement: Benefit from a hybrid work model, combining remote and on-site collaboration Responsibilities Conduct innovative research in machine learning and artificial intelligence Develop and implement algorithms Collaborate with cross-functional teams to integrate AI solutions Publish research findings in top-tier conferences and … journals Stay up to date with the latest advancements in AI and related fields Qualifications Required: PhD in Machine Learning, Artificial Intelligence, Mathematics, Computer Science, Physics, or a related field Proficiency in programming languages such as Python, C++, or similar Strong problem-solving skills and the ability to work More ❯

AI Research Residency

London, South East England, United Kingdom
Hybrid / WFH Options
MediaTek
with experienced researchers and engineers on innovative AI projects Research Focus: Engage in areas such as large language models (LLMs), optimization methods for deep learning, reinforcement learning (RL), and generative models Professional Development: Gain hands-on experience in AI research with real-world applications, contributing to both … the broader scientific community Work Arrangement: Benefit from a hybrid work model, combining remote and on-site collaboration Responsibilities Conduct innovative research in machine learning and artificial intelligence Develop and implement algorithms Collaborate with cross-functional teams to integrate AI solutions Publish research findings in top-tier conferences and … journals Stay up to date with the latest advancements in AI and related fields Qualifications Required: PhD in Machine Learning, Artificial Intelligence, Mathematics, Computer Science, Physics, or a related field Proficiency in programming languages such as Python, C++, or similar Strong problem-solving skills and the ability to work More ❯

AI Research Residency

London (City of London), South East England, United Kingdom
Hybrid / WFH Options
MediaTek
with experienced researchers and engineers on innovative AI projects Research Focus: Engage in areas such as large language models (LLMs), optimization methods for deep learning, reinforcement learning (RL), and generative models Professional Development: Gain hands-on experience in AI research with real-world applications, contributing to both … the broader scientific community Work Arrangement: Benefit from a hybrid work model, combining remote and on-site collaboration Responsibilities Conduct innovative research in machine learning and artificial intelligence Develop and implement algorithms Collaborate with cross-functional teams to integrate AI solutions Publish research findings in top-tier conferences and … journals Stay up to date with the latest advancements in AI and related fields Qualifications Required: PhD in Machine Learning, Artificial Intelligence, Mathematics, Computer Science, Physics, or a related field Proficiency in programming languages such as Python, C++, or similar Strong problem-solving skills and the ability to work More ❯

AI Research Residency

Slough, South East England, United Kingdom
Hybrid / WFH Options
MediaTek
with experienced researchers and engineers on innovative AI projects Research Focus: Engage in areas such as large language models (LLMs), optimization methods for deep learning, reinforcement learning (RL), and generative models Professional Development: Gain hands-on experience in AI research with real-world applications, contributing to both … the broader scientific community Work Arrangement: Benefit from a hybrid work model, combining remote and on-site collaboration Responsibilities Conduct innovative research in machine learning and artificial intelligence Develop and implement algorithms Collaborate with cross-functional teams to integrate AI solutions Publish research findings in top-tier conferences and … journals Stay up to date with the latest advancements in AI and related fields Qualifications Required: PhD in Machine Learning, Artificial Intelligence, Mathematics, Computer Science, Physics, or a related field Proficiency in programming languages such as Python, C++, or similar Strong problem-solving skills and the ability to work More ❯

Senior Research Scientist (LLM post training) United Kingdom

United Kingdom
Hybrid / WFH Options
PolyAI
This is not just about applying standard fine-tuning techniques - it's about building the future of dialogue systems with novel approaches to reasoning, reinforcement learning, audio-first LLMs, and more. As a Senior Research Scientist at PolyAI, you'll lead impactful research projects from ideation through to … we train and adapt LLMs for real-world conversations - spanning voice, text, and multimodal contexts. You'll work on frontier techniques such as: Conversational reinforcement learning Streaming and continuous turn-taking Audio-native LLMs Distillation of reasoning models Long-context You'll also play a key role in … use of data and models. Stay current with academic and industry advances in LLMs, ASR, TTS, RLHF, and multimodal learning. Requirements: PhD in Machine Learning, Natural Language Processing, Computer Science, or a related field. 5+ years of hands-on experience in deep learning. Proven track record of research innovation More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Research Scientist (LLM post training) United Kingdom

London, England, United Kingdom
Hybrid / WFH Options
PolyAI
This is not just about applying standard fine-tuning techniques - it's about building the future of dialogue systems with novel approaches to reasoning, reinforcement learning, audio-first LLMs, and more. As a Senior Research Scientist at PolyAI, you’ll lead impactful research projects from ideation through to … we train and adapt LLMs for real-world conversations - spanning voice, text, and multimodal contexts. You'll work on frontier techniques such as: Conversational reinforcement learning Streaming and continuous turn-taking Audio-native LLMs Distillation of reasoning models Long-context You’ll also play a key role in … use of data and models. Stay current with academic and industry advances in LLMs, ASR, TTS, RLHF, and multimodal learning. Requirements: PhD in Machine Learning, Natural Language Processing, Computer Science, or a related field. 5+ years of hands-on experience in deep learning. Proven track record of research innovation More ❯

Senior Data Scientist Data and Insights London Hybrid Remote

London, United Kingdom
Hybrid / WFH Options
loveholidays
Data Scientists, four Data Scientists, and the Head of Data Science. We specialise in various areas such as Recommender Systems, Time Series Forecasting, Deep Learning, and Reinforcement Learning, fostering a collaborative learning environment. Our focus is on modelling and problem-solving, leveraging advanced machine learning … planning/prioritisation to delivery including monitoring and alerting Designing experiments and modelling to generate actionable insights and enhance business performance Proficient in machine learning and statistical methods for predictive modelling and forecasting Experience deploying ML models to production at scale Solid understanding of SQL Proficiency in unit testing … CI/CD, model management and experiment tracking Desirable Experience with Deep Learning, Generative AI and Reinforcement Learning Experience with Time Series Forecasting and Recommender Systems Previous experience working in e-commerce, retail, or the travel industry. Conducted and analysed large scale A/B experiments Experience More ❯
Employment Type: Permanent
Salary: GBP Annual

Software Engineer, Inference Scalability and Capability

London, England, United Kingdom
Hybrid / WFH Options
Anthropic
Pick up slack, even if it goes outside your job description Enjoy pair programming (we love to pair!) Want to learn more about machine learning research Care about the societal impacts of your work Strong candidates may also have experience with: Implementing and deploying machine learning systems at … team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences. Come work with us! Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional … and harmless) models and does “alignment science” to understand how alignment techniques work and try to extrapolate to uncover and address new failure modes. Reinforcement Learning: Reinforcement Learning is used by a variety of different teams, both for alignment and to teach models to be more More ❯

Senior AI Software Engineer (Research and Development)

Didcot, England, United Kingdom
Hybrid / WFH Options
Luffy AI
with other developers on innovative code bases. Experience of genetic algorithms, low-level neural network execution and concepts such as neuroplasticity, recurrent neural networks and reinforcement learning frameworks like OpenAI Gym would be a huge advantage. This role requires specific experience with Python and familiarity with C/C++ … organisational skills. Qualifications and Experience At least BSc in Computer Science or relevant discipline 3-5 years professional software development experience Some experience with Reinforcement Learning Well versed with industry standard development practices, testing frameworks, source control (git), CI, etc Experience of agile development practices, especially Scrum Master … C, C++, or Rust Experience with genetic algorithms or neuroevolution Experience with neural network concepts such as neuroplasticity and recurrent neural networks Experience of reinforcement learning frameworks like OpenAI Gym Experience with software optimisation or high performance computing, Fluent in English with excellent written and verbal communication skills More ❯

Senior AI Software Engineer (Research and Development)

Didcot, England, United Kingdom
Hybrid / WFH Options
ZipRecruiter
with other developers on innovative code bases. Experience of genetic algorithms, low-level neural network execution and concepts such as neuroplasticity, recurrent neural networks and reinforcement learning frameworks like OpenAI Gym would be a huge advantage. This role requires specific experience with Python and familiarity with C/C++ … Experience Essentials: At least BSc in Computer Science or relevant discipline 3-5 years professional software development experience Strong Python skills Some experience with Reinforcement Learning Solid grounding in API design, algorithms, design principles Well versed with industry standard development practices, testing frameworks, source control (git), CI, etc … C, C++, or Rust Experience with genetic algorithms or neuroevolution Experience with neural network concepts such as neuroplasticity and recurrent neural networks Experience of reinforcement learning frameworks like OpenAI Gym Experience with software optimisation or high performance computing, Fluent in English with excellent written and verbal communication skills More ❯

Senior AI Software Engineer (Research and Development)

Oxford, England, United Kingdom
Hybrid / WFH Options
JR United Kingdom
with other developers on innovative code bases. Experience of genetic algorithms, low-level neural network execution and concepts such as neuroplasticity, recurrent neural networks and reinforcement learning frameworks like OpenAI Gym would be a huge advantage. This role requires specific experience with Python and familiarity with C/C++ … organisational skills. Qualifications and Experience At least BSc in Computer Science or relevant discipline 3-5 years professional software development experience Some experience with Reinforcement Learning Well versed with industry standard development practices, testing frameworks, source control (git), CI, etc Experience of agile development practices, especially Scrum Master … C, C++, or Rust Experience with genetic algorithms or neuroevolution Experience with neural network concepts such as neuroplasticity and recurrent neural networks Experience of reinforcement learning frameworks like OpenAI Gym Experience with software optimisation or high performance computing, Fluent in English with excellent written and verbal communication skills More ❯

Reinforcement Learning
10th Percentile: £76,727
25th Percentile: £90,801
Median: £150,000
75th Percentile: £175,000