5 of 5 Remote/Hybrid Reinforcement Learning Jobs in London

Artificial Intelligence Researcher

Hiring Organisation
microTECH Global LTD
Location
City of London, London, United Kingdom
permanent position with candidates required to do hybrid working in either Cambridge or London. Our client are looking for AI Researchers specialising in Reinforcement Learning from Human Feedback (RLHF) and Generative AI. In this role, you will design and optimise the algorithms that align large-scale generative models … build the next generation of foundation models. Responsibilities: Develop and refine RLHF algorithms for large language and generative models. Research and implement deep reinforcement learning methods (policy gradients, actor-critic, off-policy learning) for model alignment. Train, fine-tune, and evaluate LLMs and diffusion models at scale. ...

AI / ML Architect

Hiring Organisation
Stackstudio Digital Ltd
Location
London, United Kingdom
Employment Type
Contract, Work From Home
Contract Rate
From £450 to £500 per day
Hybrid): 2 days from office. Number of Positions: 4. The Role: An AI/ML Developer is responsible for designing, building, and deploying machine learning models and AI solutions that solve business problems. This role focuses on coding, data preparation, and integrating models into production systems. Your Responsibilities … Model Development: Design, build, and train machine learning models for predictive analytics, classification, NLP, computer vision, or other AI applications. Experiment with algorithms and optimize hyperparameters for performance. Data Preparation: Collect, clean, and preprocess large datasets for training and validation. Implement feature engineering and data augmentation techniques. Integration & Deployment ...

Senior Data Scientist

Hiring Organisation
Anson Mccade
Location
London, United Kingdom
Employment Type
Permanent
Responsibilities: End-to-End Delivery: Lead the technical execution of AI projects, from initial problem discovery and hypothesis testing to deploying production-grade machine learning models. Strategic Advisory: Act as a "translator" between technical complexity and business value. You will work closely with C-suite stakeholders to identify … solve their most pressing strategic challenges. Technical Leadership: Architect robust, scalable data pipelines and state-of-the-art models (including LLMs, Reinforcement Learning, or Bayesian Inference) tailored to specific client needs. Mentorship: Guide and upskill junior Data Scientists, fostering a culture of rigorous peer review, clean coding standards ...

Data Scientist – Peer‐to‐Peer Renewable Energy Trading Platform

Hiring Organisation
The Green Recruitment Company
Location
City of London, London, United Kingdom
Responsibilities: Modelling and Forecasting: Develop time‐series models for generation, consumption, and market price forecasting. Build probabilistic and scenario‐based forecasting capabilities. Apply machine learning to optimise matching, pairing, and routing algorithms within the P2P marketplace. Trading and Optimisation Intelligence: Create algorithms that optimise buyer–seller matching, pricing … learn, PyTorch/TensorFlow). Strong experience with time‐series modelling (ARIMA, Prophet, LSTMs or similar). Understanding of optimisation methods (linear, mixed‐integer, reinforcement learning desirable). Strong SQL and practical experience with production‐ready data pipelines. Experience working with cloud environments (AWS, GCP, or Azure). ...

AI Architect (Wealth)

Hiring Organisation
Teksystems
Location
Central London, London, United Kingdom
Employment Type
Contract, Work From Home
Title: AI Architect (Wealth). Job Description: This position is pivotal in designing AI and Machine Learning solutions on cloud-based platforms, exploring emerging AI trends, developing proof-of-concepts, and collaborating with internal and external ecosystems to advance these concepts to production. The role demands expertise in designing … least 6-10 years of hands-on development and architectural experience. Proficiency in Python, PyTorch, TensorFlow, or similar frameworks. Experience with supervised, unsupervised, and reinforcement learning. Solid grounding in Natural Language Processing (NLP) concepts such as tokenisation, embeddings, semantic search, text classification, and summarisation. Strong understanding of Large Language ...