3 of 3 Remote/Hybrid Reinforcement Learning Jobs in London

Artificial Intelligence Engineer

Hiring Organisation
WorkGenius Group
Location
City of London, London, United Kingdom
Role: Full-time (Permanent Role). We are building a world-class AI research team focused on advancing next-generation agentic systems and intent-aware learning architectures. Our mission is to bridge cutting-edge research in large language models, reinforcement learning, and alignment with scalable, real-world production … systems. You will operate at the intersection of research and product, shaping foundational capabilities in intent understanding, agent learning, and model alignment across distributed AI environments. This is an opportunity to influence AI systems deployed at global scale across diverse compute environments, including edge and cloud. Responsibilities: Define Research ...

Machine Learning Engineer PyTorch LLM

Hiring Organisation
Client Server
Location
East London, London, United Kingdom
Employment Type
Permanent, Work From Home
Machine Learning Engineer (PyTorch LLM), London, onsite, to £110k. Do you have expertise with Machine Learning in production? You could be progressing your career at a London-based tech start-up with £5 million in recent pre-seed funding, in an impactful role that you'll shape. … Holidays); daily lunch, monthly breakfasts; dog-friendly office; pension; monthly socials; impactful role that you can shape and influence. Your role: As a Machine Learning Engineer you'll take open-source LLMs (code and general models) and turn them into high-performance software engineer agents using supervised fine-tuning ...

Data Scientist - Inside IR35 - Hybrid

Hiring Organisation
Halian Technology Limited
Location
Croydon, Surrey, South East, United Kingdom
Employment Type
Contract
Role: We are recruiting on behalf of a mobility technology business building intelligent fleet orchestration systems. This role suits an experienced Applied Machine Learning Engineer or Data Scientist comfortable working with messy real-world data, operational constraints, and production systems. You'll join a small, high-calibre team solving complex … years; geospatial data experience (H3, GeoPandas, PostGIS or similar); optimisation/operations research exposure; logistics/mobility/marketplace domain experience. Nice to Have: reinforcement learning; simulation modelling; experience deploying models into cloud environments; experimentation frameworks (A/B testing, model validation at scale). How to Apply ...