Reinforcement Learning Jobs in the UK

401 to 425 of 436 Reinforcement Learning Jobs in the UK

Chief Technology Officer

Coventry, England, United Kingdom
JR United Kingdom
small technical teams, worked directly with product, clients, and stakeholders. A solid background in AI (Engineering rather than Research) Possess a deep understanding of reinforcement learning, distributed training and Agentic AI Not for you if: You’re more strategist than builder. Your AI experience is just post-ChatGPT. More ❯

Chief Technology Officer

Hemel Hempstead, England, United Kingdom
JR United Kingdom
small technical teams, worked directly with product, clients, and stakeholders. A solid background in AI (Engineering rather than Research) Possess a deep understanding of reinforcement learning, distributed training and Agentic AI Not for you if: You’re more strategist than builder. Your AI experience is just post-ChatGPT. More ❯

Chief Technology Officer

Crawley, England, United Kingdom
JR United Kingdom
small technical teams, worked directly with product, clients, and stakeholders. A solid background in AI (Engineering rather than Research) Possess a deep understanding of reinforcement learning, distributed training and Agentic AI Not for you if: You’re more strategist than builder. Your AI experience is just post-ChatGPT. More ❯

Chief Technology Officer

London, England, United Kingdom
MBN Solutions
small technical teams, worked directly with product, clients, and stakeholders. A solid background in AI (Engineering rather than Research) Possess a deep understanding of reinforcement learning, distributed training and Agentic AI Not for you if: You’re more strategist than builder. Your AI experience is just post-ChatGPT. More ❯

Senior Code Reviewer for LLM Data Training (GO)

London, England, United Kingdom
SME Work
B2, C1, C2, or Native level. Preferred Qualifications Experience in AI training, LLM evaluation, or model alignment. Familiarity with annotation platforms. Exposure to RLHF (Reinforcement Learning from Human Feedback) pipelines. Compensation : $40 Hourly Why Join Us? Join a high-impact team working at the intersection of AI and More ❯

Process Lead with Portuguese, Alexa Shopping OPTIMA

London, United Kingdom
Amazon
Large Language Models (LLMs), enabling Amazon to deliver a superior shopping experience to customers worldwide. Our mission is to empower Amazon's LLMs through Reinforcement Learning from Human Feedback (RLHF) across various categories at high speed. We aspire to provide an end-to-end data solution for the More ❯
Employment Type: Permanent
Salary: GBP Annual

Process Lead with French, Alexa Shopping OPTIMA

London, United Kingdom
Amazon
Large Language Models (LLMs), enabling Amazon to deliver a superior shopping experience to customers worldwide. Our mission is to empower Amazon's LLMs through Reinforcement Learning from Human Feedback (RLHF) across various categories at high speed. We aspire to provide an end-to-end data solution for the More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior AI Engineer

London Area, United Kingdom
Nume
integration, data processing) Methods for agent monitoring, logging, and performance analysis Experience with model fine-tuning and evaluation for domain-specific applications Background in reinforcement learning or agent training methodologies Why you? You dream big You want to be part of a highly skilled and passionate team striving More ❯

Senior AI Engineer

City of London, London, United Kingdom
Nume
integration, data processing) Methods for agent monitoring, logging, and performance analysis Experience with model fine-tuning and evaluation for domain-specific applications Background in reinforcement learning or agent training methodologies Why you? You dream big You want to be part of a highly skilled and passionate team striving More ❯

AI Editorial Lead (Italian), Rufus, AI Shopping

London, England, United Kingdom
Amazon
causes, identify error patterns, and propose solutions to enhance the quality of evaluation and annotation tasks - Create and use frameworks for Prompt Generation and Reinforcement Learning with Human Feedback (RLHF) to improve the fluency of our AI models - Work closely with product, science and engineering stakeholders to prioritise More ❯

Robotics Lead Engineer (Humanoid)

London, England, United Kingdom
Barrington James
locomotion, manipulation, perception, and human-robot interaction. Oversee the development of AI algorithms for vision, control, decision-making, and autonomy. Optimize and integrate machine learning models for robotics applications, ensuring real-world scalability and efficiency. Build and manage a multidisciplinary team of engineers, researchers, and designers. Collaborate with hardware … Computer Science, or a related field. 10+ years in AI and robotics, with at least 5 years in leadership roles. Deep expertise in: Machine Learning and Deep Learning frameworks (e.g., TensorFlow, PyTorch). Robotic Operating Systems (e.g., ROS, ROS2). Computer Vision, Natural Language Processing (NLP), and Reinforcement More ❯

Senior Account Executive

Harrow on the Hill, England, United Kingdom
Anima
exclusive features. Anima saves lives every day Hey! Shun here, I’m the CEO and co-founder of Anima. We’re building an active learning OS for all of healthcare and life sciences towards maximising human wellbeing globally. My entire life, I’ve been pulling on a thread that … the 3 existing product lines we have, that millions of patients use, and build out new ones at the very cutting edge of healthcare reinforcement learning and agentic AI. Your work will save countless lives. Top 1% growth. We grew 450% in 2024, are cash flow positive and … Greatest show ever imo.] It started with me. I self-taught and wrote a lot of the Anima 1.0 code, and Anima’s active learning patent. I run most of the hiring tech chats to this day. I first and foremost see myself as an IC and builder, and More ❯

Lead AI Consultant

London, England, United Kingdom
Newton
SQL, Spark, TensorFlow, PyTorch, etc. Experience in working with large and complex datasets, both structured and unstructured, and applying advanced techniques such as deep learning, natural language processing, computer vision, or reinforcement learning. Expert knowledge of regression, classification and other machine learning techniques. A passion for innovation … and continuous learning, and a willingness to share knowledge and mentor others. A proven track record of delivering high-quality and impactful data-driven solutions to real-world problems. A working knowledge of the state-of-the-art methods and best practices in data science, such as data preprocessing More ❯

Artificial Intelligence Engineer

London, England, United Kingdom
Hybrid / WFH Options
Birdie
ensuring seamless data flow using AWS SageMaker/Vertex AI. Evaluate Emerging Technologies: Research and implement cutting-edge AI tech, including generative AI and reinforcement learning, to enhance platform capabilities. Build Evaluation and Testing Frameworks: Collaborate with product and design teams to define evaluation methodologies and validation approaches … in-person and online events. Birdie does not support visa applications. We invest in your growth with an annual development budget, coaching, and continuous learning opportunities. Additional Benefits: Work From Home equipment budget 33 days holiday (including bank holidays), plus extra days for birthdays and volunteering Private health insurance More ❯

Hardware Engineer (electronics)

England, United Kingdom
helsing.ai
a meaningful impact in the world. Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure. The defence industry is entering the most exciting phase of the technological development curve. Advances More ❯
Employment Type: Permanent
Salary: GBP Annual

Deployed Product Designer

London, England, United Kingdom
Helsing
a meaningful impact in the world. Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure. The defence industry is entering the most exciting phase of the technological development curve. Advances More ❯

Talent Lead, GTM

London, United Kingdom
Anima
Anima saves lives every day Hey! Shun here, I'm the CEO and co-founder of Anima. We're building an active learning OS for all of healthcare and life sciences towards maximising human wellbeing globally. My entire life, I've been pulling on a thread that's affected … the 3 existing product lines we have, that millions of patients use, and build out new ones at the very cutting edge of healthcare reinforcement learning and agentic AI. Your work will save countless lives. Top 1% growth. We grew 450% in 2024, are cash flow positive and … on coaching and teaching. It started with me. I self-taught and wrote a lot of the Anima 1.0 code, and Anima's active learning patent. I run most of the hiring tech chats to this day. I first and foremost see myself as an IC and builder, and More ❯
Employment Type: Permanent
Salary: GBP Annual

Lead Recruiter, GTM

London, United Kingdom
Anima
Anima saves lives every day Hey! Shun here, I'm the CEO and co-founder of Anima. We're building an active learning OS for all of healthcare and life sciences towards maximising human wellbeing globally. My entire life, I've been pulling on a thread that's affected … the 3 existing product lines we have, that millions of patients use, and build out new ones at the very cutting edge of healthcare reinforcement learning and agentic AI. Your work will save countless lives. Top 1% growth. We grew 450% in 2024, are cash flow positive and … on coaching and teaching. It started with me. I self-taught and wrote a lot of the Anima 1.0 code, and Anima's active learning patent. I run most of the hiring tech chats to this day. I first and foremost see myself as an IC and builder, and More ❯
Employment Type: Permanent
Salary: GBP Annual

Artificial Intelligence Engineer

Manchester, England, United Kingdom
Manchester Digital
unlock personalised cognitive insights from gameplay and biosignals. About the Role We’re on the hunt for a bold, curious, and technically brilliant machine learning engineer to build and deploy AI systems at the heart of our BCI platform. You’ll be working with rich, multimodal datasets—speech, neural … on experience with foundation models Passion for building tools that enhance human health, cognition, and agency The Cherry on Top Experience with games, psychometrics, reinforcement learning, or wearables Familiarity with real-time ML or edge deployment Background in digital health, neuroscience, or behavioural research Experience with adaptive game More ❯

Graphic Designer

London, England, United Kingdom
Helsing
proposals. Are known for a meticulous, pixel perfect approach. Capable of independently managing multiple projects, anticipating challenges, and formulating intelligent solutions. Possess a quick learning curve and intellectual curiosity to learn new tools such as AI image generation, and explore new areas such as radar systems, military doctrine, cognitive … a meaningful impact in the world. Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure. The defence industry is entering the most exciting phase of the technological development curve. Advances More ❯

New Trading Team's 1st C++ Quant Developer | HFT

London Area, United Kingdom
Augmentti
frequency, low-latency trading. Work with a Humble Leader : You’ll work closely with a brilliant PM who has a strong technical background (from reinforcement learning strategies to low-latency C++ coding) and a pragmatic, collaborative approach. This is someone who’s not only mastered complex trading strategies More ❯

New Trading Team's 1st C++ Quant Developer | HFT

City of London, London, United Kingdom
Augmentti
frequency, low-latency trading. Work with a Humble Leader : You’ll work closely with a brilliant PM who has a strong technical background (from reinforcement learning strategies to low-latency C++ coding) and a pragmatic, collaborative approach. This is someone who’s not only mastered complex trading strategies More ❯

New Trading Team's 1st C++ Quant Developer | HFT

Slough, England, United Kingdom
JR United Kingdom
frequency, low-latency trading. Work with a Humble Leader : You’ll work closely with a brilliant PM who has a strong technical background (from reinforcement learning strategies to low-latency C++ coding) and a pragmatic, collaborative approach. This is someone who’s not only mastered complex trading strategies More ❯

EA Team Lead

London, United Kingdom
helsing.ai
a meaningful impact in the world. Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure. The defence industry is entering the most exciting phase of the technological development curve. Advances More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer

London Area, United Kingdom
Humanoid
simulation and real hardware environments. You will be part of a focused team responsible for the application level software that connects control, navigation, perception, learning, and platform systems. Your work will ensure that these components operate as a coherent and reliable system that users can interact with seamlessly. This … You Will Do You will develop and maintain application level software for humanoid robots You will integrate software components from controls, navigation, computer vision, reinforcement learning, and platform teams You will contribute to the structure and evolution of the application architecture and its interfaces You will work closely … highly proficient in C++ and have experience delivering production grade software You have a solid understanding of robotic subsystems including control, perception, navigation, and learning You are familiar with ROS or ROS2 or equivalent middleware platforms You are comfortable reading, understanding, and integrating code from a range of other More ❯

Reinforcement Learning
10th Percentile: £76,727
25th Percentile: £90,801
Median: £130,000
75th Percentile: £175,000