Remote Reinforcement Learning Jobs in the UK

17 of 17 Remote Reinforcement Learning Jobs in the UK

Power Platform - London, UK

London, United Kingdom
Hybrid / WFH Options
Randstad Technologies Recruitment
experience in an Investment Banking environment would be a plus. Spanish would be a plus. Mandatory Skills: Python, ServiceNow Orchestrator, Azure Cognitive Services, GenAI - LLMOps, RPA - Microsoft Power Automate, Machine Learning - AIOPS, Deep Learning - AIOPS, Reinforcement Learning - AIOPS. Randstad Technologies Ltd is a leading specialist recruitment business for the IT & Engineering industries. Please note that due to …
Employment Type: Permanent
Salary: £60,000 - £65,000 per annum

Power Platform - London, UK

London, South East, England, United Kingdom
Hybrid / WFH Options
Randstad Technologies
experience in an Investment Banking environment would be a plus. Spanish would be a plus. Mandatory Skills: Python, ServiceNow Orchestrator, Azure Cognitive Services, GenAI - LLMOps, RPA - Microsoft Power Automate, Machine Learning - AIOPS, Deep Learning - AIOPS, Reinforcement Learning - AIOPS. Randstad Technologies Ltd is a leading specialist recruitment business for the IT & Engineering industries. Please note that due to …
Employment Type: Full-Time
Salary: £60,000 - £65,000 per annum

Staff Machine Learning Scientist

London, United Kingdom
Hybrid / WFH Options
Intercom
service. Driven by our core values, we push boundaries, build with speed and intensity, and consistently deliver incredible value to our customers. What's the opportunity? Intercom's Machine Learning team is responsible for defining new ML features, researching appropriate algorithms and technologies, and rapidly getting first prototypes in our customers' hands. We are an extremely product-focussed team. … dedicated ML product engineers enable us to move to production fast, often shipping to beta in weeks after a successful offline test. We are very passionate about applying machine learning technology, and have productized everything from classic supervised models, to cutting-edge unsupervised clustering algorithms, to novel applications of transformer neural networks. We test and measure the real customer … field (e.g. MSc). Scientific thinking skills. Track record shipping ML products. PhD or other experience in a research environment. Deep experience in an applicable ML area, e.g. NLP, deep learning, Bayesian methods, reinforcement learning, clustering. Strong stats or math background. Benefits: We are a well-treated bunch, with awesome benefits! If there's something important to you …
Employment Type: Permanent
Salary: GBP Annual

Data Scientist - Fixed Term Contract

London, United Kingdom
Hybrid / WFH Options
Faculty
roundtables, or by contributing to large-scale open-source projects. You will also have the opportunity to teach on the fellowship about topics that range from basic statistics to reinforcement learning, and to mentor the fellows through their 6-week project. Thanks to the Faculty platform, you will have access to powerful computational resources, and you will enjoy the … become a fluent Python programmer in a short timeframe. An excellent command of the basic libraries for data science (e.g. NumPy, Pandas, Scikit-Learn) and familiarity with a deep-learning framework (e.g. TensorFlow, PyTorch, Caffe). A high level of mathematical competence and proficiency in statistics. A solid grasp of essentially all of the standard data science techniques, for example … supervised/unsupervised machine learning, model cross-validation, Bayesian inference, time-series analysis, simple NLP, effective SQL database querying, or using/writing simple APIs for models. We regard the ability to develop new algorithms when an innovative solution is needed as a fundamental skill. An appreciation for the scientific method as applied to the commercial world; a talent …
Employment Type: Permanent
Salary: GBP Annual

Senior Data Scientist - Fixed Term Contract

London, United Kingdom
Hybrid / WFH Options
Faculty
roundtables, or by contributing to large-scale open-source projects. You will also have the opportunity to teach on the fellowship about topics that range from basic statistics to reinforcement learning, and to mentor the fellows through their 6-week project. Thanks to the Faculty platform, you will have access to powerful computational resources, and you will enjoy the … become a fluent Python programmer in a short timeframe. An excellent command of the basic libraries for data science (e.g. NumPy, Pandas, Scikit-Learn) and familiarity with a deep-learning framework (e.g. TensorFlow, PyTorch, Caffe). A high level of mathematical competence and proficiency in statistics. A solid grasp of essentially all of the standard data science techniques, for example … supervised/unsupervised machine learning, model cross-validation, Bayesian inference, time-series analysis, simple NLP, effective SQL database querying, or using/writing simple APIs for models. We regard the ability to develop new algorithms when an innovative solution is needed as a fundamental skill. A leadership mindset focussed on growing the technical capabilities of the team; a caring …
Employment Type: Permanent
Salary: GBP Annual

Senior Research Scientist: Data Science and Machine Learning AIP

Chelmsford, Essex, United Kingdom
Hybrid / WFH Options
NLP PEOPLE
of interest to you. The Data and Decision Support Capability has teams working across AI/ML areas such as RF, EW, radar, sonar, distributed sensing-processing, data fusion, reinforcement learning, autonomy, image analysis and computer vision, generative AI, NLP, knowledge graphs and more. You will work with these colleagues in multi-disciplinary teams. Typical Responsibilities: Lead technical … and/or statistical signal processing to sequential data and decision-making post-PhD. Experience in software development for proof of concept in Python. Experience with machine and deep learning frameworks: TensorFlow, PyTorch, scikit-learn, etc. Domains of Particular Interest: RF communications and CEMA; Electronic or Electromagnetic Warfare (EW); Tracking and sensor data fusion; Radar signal processing; Acoustic data … and Project Management teams that design and implement defence solutions and digital transformation projects. Company: BAE Systems. Experience and Education: Senior (5+ years of experience). Tagged as: Industry, Machine Learning, NLP, United Kingdom
Employment Type: Permanent
Salary: GBP Annual

Lead, Vision-Language-Action VLA, Behaviour Learning - Hybrid

West London, London, United Kingdom
Hybrid / WFH Options
Skillsbay Limited
Role: Lead, Vision-Language-Action (VLA)/Behaviour Learning. About the Client: Our client is a pioneering robotics startup developing the world's most advanced, reliable, and commercially scalable humanoid robots. Their mission is to create safe, next-generation robots that integrate seamlessly into daily life and amplify human capacity. Their first robot, HMND 01, is designed for industrial automation … understand, and act in complex real-world environments. The role combines cutting-edge AI research with practical deployment in robotics. What You'll Do: Define and drive strategy for representation learning, behaviour cloning, and reinforcement learning (RL). Lead large-scale training of multi-modal LLM/VLM/VLA systems integrating inputs such as vision, audio, proprioception … optimise models for real-time deployment. Hire, mentor, and lead a high-calibre team of research scientists and engineers. What We're Looking For: 6+ years' experience building deep learning systems, including 2+ years in technical team leadership. Hands-on expertise with LLM/VLM architecture design, billion-parameter training, and fine-tuning. Proven track record applying RL …
Employment Type: Permanent

Lead, Vision-Language-Action VLA, Behaviour Learning - Hybrid

London, South East England, United Kingdom
Hybrid / WFH Options
Skillsbay Limited
Role: Lead, Vision-Language-Action (VLA)/Behaviour Learning. About the Client: Our client is a pioneering robotics startup developing the world's most advanced, reliable, and commercially scalable humanoid robots. Their mission is to create safe, next-generation robots that integrate seamlessly into daily life and amplify human capacity. Their first robot, HMND 01, is designed for industrial automation … understand, and act in complex real-world environments. The role combines cutting-edge AI research with practical deployment in robotics. What You'll Do: Define and drive strategy for representation learning, behaviour cloning, and reinforcement learning (RL). Lead large-scale training of multi-modal LLM/VLM/VLA systems integrating inputs such as vision, audio, proprioception … optimise models for real-time deployment. Hire, mentor, and lead a high-calibre team of research scientists and engineers. What We're Looking For: 6+ years' experience building deep learning systems, including 2+ years in technical team leadership. Hands-on expertise with LLM/VLM architecture design, billion-parameter training, and fine-tuning. Proven track record applying RL …

Lead, Vision-Language-Action VLA, Behaviour Learning - Hybrid

West London, South East England, United Kingdom
Hybrid / WFH Options
Skillsbay Limited
Role: Lead, Vision-Language-Action (VLA)/Behaviour Learning. About the Client: Our client is a pioneering robotics startup developing the world's most advanced, reliable, and commercially scalable humanoid robots. Their mission is to create safe, next-generation robots that integrate seamlessly into daily life and amplify human capacity. Their first robot, HMND 01, is designed for industrial automation … understand, and act in complex real-world environments. The role combines cutting-edge AI research with practical deployment in robotics. What You'll Do: Define and drive strategy for representation learning, behaviour cloning, and reinforcement learning (RL). Lead large-scale training of multi-modal LLM/VLM/VLA systems integrating inputs such as vision, audio, proprioception … optimise models for real-time deployment. Hire, mentor, and lead a high-calibre team of research scientists and engineers. What We're Looking For: 6+ years' experience building deep learning systems, including 2+ years in technical team leadership. Hands-on expertise with LLM/VLM architecture design, billion-parameter training, and fine-tuning. Proven track record applying RL …

Python Developer

Glasgow, Scotland, United Kingdom
Hybrid / WFH Options
Venesky Brown
programming, code reviews, system design and requirements analysis/refinement, etc. - Coaching and mentoring other team members, as appropriate. Essential Skills: - OCR, Object Detection and LLM analysis implementation - Machine Learning & AI Libraries including Transformers/Hugging Face for working with pre-trained LLMs, fine-tuning, and inference, PyTorch for deep learning model development and training, OpenCV for computer … Desirable Skills: - Custom model architecture design and implementation - Advanced fine-tuning techniques including LoRA, QLoRA, and parameter-efficient methods - Multi-modal AI systems combining text, image, and structured data - Reinforcement Learning from Human Feedback (RLHF) for model alignment - Apache Airflow/Dagster for ML workflow orchestration and ETL pipeline management - Model versioning and experiment tracking (MLflow, Weights & Biases …

Python Developer

Milton, Central Scotland, United Kingdom
Hybrid / WFH Options
Venesky Brown
programming, code reviews, system design and requirements analysis/refinement, etc. - Coaching and mentoring other team members, as appropriate. Essential Skills: - OCR, Object Detection and LLM analysis implementation - Machine Learning & AI Libraries including Transformers/Hugging Face for working with pre-trained LLMs, fine-tuning, and inference, PyTorch for deep learning model development and training, OpenCV for computer … Desirable Skills: - Custom model architecture design and implementation - Advanced fine-tuning techniques including LoRA, QLoRA, and parameter-efficient methods - Multi-modal AI systems combining text, image, and structured data - Reinforcement Learning from Human Feedback (RLHF) for model alignment - Apache Airflow/Dagster for ML workflow orchestration and ETL pipeline management - Model versioning and experiment tracking (MLflow, Weights & Biases …

Python Developer

Paisley, Central Scotland, United Kingdom
Hybrid / WFH Options
Venesky Brown
programming, code reviews, system design and requirements analysis/refinement, etc. - Coaching and mentoring other team members, as appropriate. Essential Skills: - OCR, Object Detection and LLM analysis implementation - Machine Learning & AI Libraries including Transformers/Hugging Face for working with pre-trained LLMs, fine-tuning, and inference, PyTorch for deep learning model development and training, OpenCV for computer … Desirable Skills: - Custom model architecture design and implementation - Advanced fine-tuning techniques including LoRA, QLoRA, and parameter-efficient methods - Multi-modal AI systems combining text, image, and structured data - Reinforcement Learning from Human Feedback (RLHF) for model alignment - Apache Airflow/Dagster for ML workflow orchestration and ETL pipeline management - Model versioning and experiment tracking (MLflow, Weights & Biases …

Lead AI Engineer New Remote, UK

United Kingdom
Hybrid / WFH Options
Prolific
FastAPI, Django). You have a strong command of system design, infrastructure, and CI/CD pipelines, and you're comfortable taking a feature from concept to deployment. Machine Learning Engineering & Research: A comprehensive background in machine learning, with deep experience in ML frameworks (e.g., PyTorch, TensorFlow, Hugging Face) and a strong grasp of the underlying theory behind … LangChain, LlamaIndex, DSPy, LangGraph). Experience in agentic system design and tool calling (ADK, A2A, MCP, etc.). Prior work on human-in-the-loop systems, data annotation platforms, or reinforcement learning from human feedback (RLHF). An interest or experience in synthetic data, human-in-the-loop systems, or AI alignment. Experience building robust monitoring and observability for …
Employment Type: Permanent
Salary: GBP Annual

Senior Data Scientist

London, United Kingdom
Hybrid / WFH Options
ECM Selection (Holdings) Limited
experimental, and it is understood that not all projects succeed; even failed projects contain valuable insights. You will be building upon cutting-edge ML techniques such as transformers and reinforcement learning to create novel multi-modal solutions. Examples include sensor fusion systems, physics-informed neural networks for simulations, and multi-purpose autonomous robots. Projects will be defence-focused … surrounding area. Initially this is an 18-month contract with the expectation of extending this as more funding is released. Keywords: AI, ML, RF, EM, GNN, Transformer, Autoencoder, Reinforcement Learning, Multi-Modal AI, Sensor Fusion, Python, PyTorch, Radio Frequency, RF. Another top job from ECM, the high-tech recruitment experts. Even if this job's not quite right, do …
Employment Type: Permanent
Salary: £50,000 - £60,000 per annum DoE + Benefits

AI Architect

City of London, London, United Kingdom
Hybrid / WFH Options
Omnis Partners
and build strong partnerships with major tech vendors. 🛠️ What We Need: Proven experience designing and deploying agentic AI systems in production. Strong understanding of AI/ML techniques including reinforcement learning, LLM agents, and simulations. Proficient in Python and Infrastructure as Code for cloud environments. Familiarity with AI orchestration platforms like LangChain, AutoGen, CrewAI, or similar. Experience supporting …

AI Architect

London Area, United Kingdom
Hybrid / WFH Options
Omnis Partners
and build strong partnerships with major tech vendors. 🛠️ What We Need: Proven experience designing and deploying agentic AI systems in production. Strong understanding of AI/ML techniques including reinforcement learning, LLM agents, and simulations. Proficient in Python and Infrastructure as Code for cloud environments. Familiarity with AI orchestration platforms like LangChain, AutoGen, CrewAI, or similar. Experience supporting …

LLM Researcher

London, South East, England, United Kingdom
Hybrid / WFH Options
MicroTECH Global Ltd
and regulatory requirements in fintech (SOC2, PCI-DSS, GDPR). Ability to thrive in a fast-moving startup environment. Desirables: Background in fintech, payments, or treasury systems. Experience with reinforcement learning from human feedback (RLHF).
Employment Type: Full-Time
Salary: Salary negotiable

Reinforcement Learning
10th Percentile: £68,075
25th Percentile: £75,000
Median: £95,000
75th Percentile: £121,250
90th Percentile: £175,000