12 of 12 Permanent Reinforcement Learning Jobs in Central London

Artificial Intelligence Researcher

Hiring Organisation
Cubiq Recruitment
Location
City of London, London, United Kingdom
Robot Learning/Embodied AI We’re partnering with a venture-backed robotics startup building systems that allow humans to extend their physical capabilities through intelligent robotic platforms. The company recently secured new funding and is expanding its AI research team to develop learning systems for dexterous manipulation … real-world robotic autonomy. This role sits at the intersection of robot learning, multimodal models, and real-world deployment. The focus is not simulation research alone. The work is about taking cutting-edge robot learning approaches and making them function reliably on physical systems. Why this opportunity ...

Artificial Intelligence Engineer

Hiring Organisation
WorkGenius Group
Location
City of London, London, United Kingdom
Role: Full-time (Permanent Role) We are building a world-class AI research team focused on advancing next-generation agentic systems and intent-aware learning architectures. Our mission is to bridge cutting-edge research in large language models, reinforcement learning, and alignment with scalable, real-world production … systems. You will operate at the intersection of research and product, shaping foundational capabilities in intent understanding, agent learning, and model alignment across distributed AI environments. This is an opportunity to influence AI systems deployed at global scale across diverse compute environments including edge and cloud. Responsibilities Define Research ...

Applied AI Research Engineer - £300k + bens - London

Hiring Organisation
Transparent Technology
Location
City of London, London, United Kingdom
Employment Type
Permanent
success. What You'll Do * Design and implement state-of-the-art instruction tuning methods * Fine-tune and deploy LLMs in production environments * Apply reinforcement learning techniques (SFT, PPO, DPO, GRPO) * Run hands-on experimentation to outperform closed-source models * Break down ambiguous research ideas into structured roadmaps … speech systems. Ideal Background * 5-7+ years in applied AI/ML (exceptional 3+ years considered) * Deep experience in fine-tuning + reinforcement learning * Experience shipping ML systems from research into production * Open-source LLM experience essential * Product-driven engineering mindset (Apple, LinkedIn, Amazon style environments ideal ...

Data Scientist

Hiring Organisation
Synergetic
Location
City of London, London, United Kingdom
with software engineering capabilities to build end-to-end AI solutions. The ideal candidate will have a strong foundation in both developing sophisticated machine learning models and implementing them within production systems. You will work closely with cross-functional teams to transform concepts into scalable AI-powered products. … looking for candidates that can combine technical expertise with a true consulting approach. Responsibilities Design, develop, and implement advanced machine learning models and AI capabilities Build and maintain knowledge graphs and causal inference systems Create probabilistic models to address complex business problems Scale AI solutions from proof-of-concept ...

Lead ML Engineer (London)

Hiring Organisation
Glite Tech
Location
City of London, London, United Kingdom
English to intermediate and advanced learners. We’re on the verge of solving one of the biggest challenges in education – making high-quality, personalised learning accessible to everyone. We are building a fundamental model for education - one that can accurately predict student knowledge and orchestrate lessons, adapting … models to production, to own the ML team in our growing company. What you will do 🚀 Build fundamental models for education - solving the ultimate learning task of predicting student knowledge and optimal ‘next task’ Work with a vast amount of unique data - we have data from over 1M language ...

Senior Data Scientist

Hiring Organisation
Vitality
Location
City of London, London, United Kingdom
Full time, 37.5 hours per week. We are happy to discuss flexible working! Top 3 skills needed for this role: Deep Expertise in Machine Learning, Data Science & Technical Tooling Strategic Project Leadership & Business Impact Delivery High Level Stakeholder Engagement & Communication What this role is all about: Vitality is entering … members live healthier, happier, longer lives. As a Senior Data Scientist, you will play a pivotal role in designing, building, and executing advanced machine learning and AI solutions that sit at the heart of Vitality’s transformation. Your work will help shape the next generation of personalised health insurance ...

Senior Data Scientist

Hiring Organisation
Vitality Corporate Services Limited - Tech
Location
Central London, London, United Kingdom
Employment Type
Permanent
Salary
£95,000
Office. Full time, 37.5 hours per week. We are happy to discuss flexible working! Top 3 skills needed for this role: Deep Expertise in Machine Learning, Data Science & Technical Tooling Strategic Project Leadership & Business Impact Delivery High Level Stakeholder Engagement & Communication What this role is all about: Vitality is entering … members live healthier, happier, longer lives. As a Senior Data Scientist, you will play a pivotal role in designing, building, and executing advanced machine learning and AI solutions that sit at the heart of Vitality’s transformation. Your work will help shape the next generation of personalised health insurance ...

AI Engineer

Hiring Organisation
DXC
Location
City of London, London, United Kingdom
Employment Type
Permanent
data pipelines and infrastructure. Partnering with cross-functional teams to understand data needs and shape solutions. Contributing to data quality, governance, and security initiatives. Learning directly from specialists in AI and data engineering. Helping to continuously improve and optimise data processes. Staying current with emerging tools, trends, and technologies. … Mistral, Claude). Skills in fine-tuning, prompt engineering, and building RAG pipelines. Familiarity with Agent Frameworks (LangChain, LlamaIndex, CrewAI, AutoGen). Knowledge of reinforcement learning methods or tools (Q-learning, policy gradients, RLlib). Why Join Us? Work on AI solutions that make a meaningful impact ...

Senior NLP Engineer (London)

Hiring Organisation
Glite Tech
Location
City of London, London, United Kingdom
English to intermediate and advanced learners. We’re on the verge of solving one of the biggest challenges in education – making high-quality, personalised learning accessible to everyone. We are building a fundamental model for education - one that can accurately predict student knowledge and orchestrate lessons, adapting … models to production, to join the ML team in our growing company. What you will do 🚀 Build fundamental models for education - solving the ultimate learning task of predicting student knowledge and optimal ‘next task’ Build fully-automated pipelines for dictionary building, including span identification, word sense distribution, and sense ...

Research Engineer (Agents)

Hiring Organisation
Native
Location
City of London, London, United Kingdom
structured data representations (tables, graphs, schemas) Build training, simulation, and evaluation environments for long-horizon, multi-step agent behavior Develop self-supervised and reinforcement learning objectives for improving agent reliability and correctness Integrate foundation model embeddings and symbolic components into agent workflows Own agent systems ...

Senior Data Engineer

Hiring Organisation
develop
Location
City of London, London, United Kingdom
industry standards AI Enablement & Data Serving Build high-quality datasets for retrieval pipelines (RAG), embeddings and conversational agents Create data foundations supporting decision engines, reinforcement learning and value measurement Partner with AI engineers to operationalise pipelines for LLM workflows and agentic systems Standards, Documentation & Reusability Produce clear documentation ...

Head of Decision Science Consulting, UK based

Hiring Organisation
Staffworx Limited
Location
Central London, London, United Kingdom
Employment Type
Permanent
models in live business decision environments. Strong fluency in Python, modern ML tools, and decision optimisation frameworks. Deep understanding of statistical modelling and machine learning with experience deploying models into production-scale systems. Beneficial: pricing and revenue optimisation, forecasting and supply chain, risk and fraud modelling, reinforcement learning ...