Remote Reinforcement Learning Jobs

22 of 22 Remote Reinforcement Learning Jobs

Senior Technology Manager, AI & ML

Richmond, Virginia, United States
Hybrid / WFH Options
CarMax
CarMax, the way your career should be! About this job: At CarMax, a Senior Technology Manager, AI & ML is a key leader in providing reliable and scalable machine-learning capabilities across the organization. The Senior Technology Manager will be responsible for overseeing multiple portfolios of ML and AI capabilities and solutions. In this role you will support managers and … and guiding CarMax customers through their optimal buying journey by providing predictive inputs at key moments. The ideal candidate will have a passion for and understanding of Data Science, Machine Learning and AI, and will have the substantial experience in software engineering and cloud engineering that is necessary to turn those models into highly visible, mission-critical capabilities that drive … teams’ achievements. Create the strategic roadmap that will guide the direction and goals for the team, in terms of both individual project impacts and overall standards for machine learning and AI utilization in the organization. Empower your direct reports to lead their teams by providing them with the resources, training, feedback, and a sounding board to be successful. …

Power Platform - London, UK

London, United Kingdom
Hybrid / WFH Options
Randstad Technologies Recruitment
experience in an Investment Banking environment would be a plus. Spanish would be a plus. Mandatory Skills: Python, ServiceNow Orchestrator, Azure Cognitive Services, GenAI - LLMOps, RPA - Microsoft Power Automate, Machine Learning - AIOPS, Deep Learning - AIOPS, Reinforcement Learning - AIOPS. Randstad Technologies Ltd is a leading specialist recruitment business for the IT & Engineering industries. Please note that due to …
Employment Type: Permanent
Salary: £60000 - £65000/annum

Power Platform - London, UK

London, South East, England, United Kingdom
Hybrid / WFH Options
Randstad Technologies
experience in an Investment Banking environment would be a plus. Spanish would be a plus. Mandatory Skills: Python, ServiceNow Orchestrator, Azure Cognitive Services, GenAI - LLMOps, RPA - Microsoft Power Automate, Machine Learning - AIOPS, Deep Learning - AIOPS, Reinforcement Learning - AIOPS. Randstad Technologies Ltd is a leading specialist recruitment business for the IT & Engineering industries. Please note that due to …
Employment Type: Full-Time
Salary: £60,000 - £65,000 per annum

Staff Machine Learning Scientist

London, United Kingdom
Hybrid / WFH Options
Intercom
service. Driven by our core values, we push boundaries, build with speed and intensity, and consistently deliver incredible value to our customers. What's the opportunity? Intercom's Machine Learning team is responsible for defining new ML features, researching appropriate algorithms and technologies, and rapidly getting first prototypes in our customers' hands. We are an extremely product-focussed team. … dedicated ML product engineers enable us to move to production fast, often shipping to beta in weeks after a successful offline test. We are very passionate about applying machine learning technology, and have productized everything from classic supervised models, to cutting-edge unsupervised clustering algorithms, to novel applications of transformer neural networks. We test and measure the real customer … field (e.g. MSc); scientific thinking skills; track record shipping ML products; PhD or other experience in a research environment; deep experience in an applicable ML area, e.g. NLP, deep learning, Bayesian methods, reinforcement learning, clustering; strong stats or math background. Benefits: We are a well-treated bunch, with awesome benefits! If there's something important to you …
Employment Type: Permanent
Salary: GBP Annual

Senior Machine Learning Scientist

Dublin, Ireland
Hybrid / WFH Options
Intercom
service. Driven by our core values, we push boundaries, build with speed and intensity, and consistently deliver incredible value to our customers. What's the opportunity? Intercom's Machine Learning team is responsible for defining new ML features, researching appropriate algorithms and technologies, and rapidly getting first prototypes in our customers' hands. We are an extremely product-focussed team. … dedicated ML product engineers enable us to move to production fast, often shipping to beta in weeks after a successful offline test. We are very passionate about applying machine learning technology, and have productized everything from classic supervised models, to cutting-edge unsupervised clustering algorithms, to novel applications of transformer neural networks. We test and measure the real customer … Plan, measure & socialize learnings to inform iteration. Partner deeply with the rest of the team, and others, to build excellent ML products. What skills might I need? Broad applied machine learning knowledge; 3-5 years' applied ML experience; practical stats knowledge (experiment design, dealing with confounding, etc.); strong communication skills, both within engineering teams and across disciplines; comfort with ambiguity …
Employment Type: Permanent
Salary: EUR 125,000 - 150,000 Annual

AI Engineer

Vienna, Virginia, United States
Hybrid / WFH Options
ALTA IT Services
Title: A.I. Engineer. Location: Hybrid Work Model Reporting to Vienna, VA. Pay Rate: Open to Both C2C and W2 options. Position Type: Multiyear Contract. Responsibility: Build and enhance machine learning models through all phases of development, including design, training, validation, and implementation. Unlock insights by analyzing large-scale, complex numerical and textual data and identifying trends. Partner … Qualifications: Advanced degree in computer science, mathematics, physics, statistics, or related field. Strong experience with applying expertise in model design, training, validation, and monitoring. Excellent understanding of machine learning, statistical modeling, and algorithms as well as their benefits and drawbacks. Experience with cloud computing infrastructure. Experience with Computer Vision, image processing and video analytics. Experience with Natural Language … Processing/Natural Language Understanding. Experience with deep learning frameworks and infrastructure like TensorFlow or PyTorch. Experience with and/or willingness to learn advanced techniques in Agentic A.I. frameworks and Large Language Models (LLMs). Experience with and/or willingness to research, develop, implement, and fine-tune LLMs for specific domain knowledge and use cases. Ability to …
Employment Type: Permanent
Salary: USD Annual

Senior Research Scientist: Data Science and Machine Learning AIP

Chelmsford, Essex, United Kingdom
Hybrid / WFH Options
NLP PEOPLE
of interest to you. The Data and Decision Support Capability has teams working across AI/ML areas such as RF, EW, radar, sonar, distributed sensing-processing, data fusion, reinforcement learning, autonomy, image analysis and computer vision, generative AI, NLP, knowledge graphs and more. You will work with these colleagues in multi-disciplinary teams. Typical Responsibilities: Lead technical … and/or statistical signal processing to sequential data and decision-making post PhD. Experience in software development for proof of concept in Python. Experience with machine and deep learning frameworks: TensorFlow, PyTorch, scikit-learn, etc. Domains of Particular Interest: RF communications and CEMA; Electronic or Electromagnetic Warfare (EW); tracking and sensor data fusion; radar signal processing; acoustic data … and Project Management teams that design and implement defence solutions and digital transformation projects. Company: BAE Systems. Experience and Education: Senior (5+ years of experience). Tagged as: Industry, Machine Learning, NLP, United Kingdom
Employment Type: Permanent
Salary: GBP Annual

Lead, Vision-Language-Action VLA, Behaviour Learning - Hybrid

West London, London, United Kingdom
Hybrid / WFH Options
Skillsbay Limited
Role: Lead, Vision-Language-Action (VLA)/Behaviour Learning. About the Client: Our client is a pioneering robotics startup developing the world's most advanced, reliable, and commercially scalable humanoid robots. Their mission is to create safe, next-generation robots that integrate seamlessly into daily life and amplify human capacity. Their first robot, HMND 01, is designed for industrial automation … understand, and act in complex real-world environments. The role combines cutting-edge AI research with practical deployment in robotics. What You'll Do: Define and drive strategy for representation learning, behaviour cloning, and reinforcement learning (RL). Lead large-scale training of multi-modal LLM/VLM/VLA systems integrating inputs such as vision, audio, proprioception … optimise models for real-time deployment. Hire, mentor, and lead a high-calibre team of research scientists and engineers. What We're Looking For: 6+ years' experience building deep learning systems, including 2+ years in technical team leadership. Hands-on expertise with LLM/VLM architecture design, billion-parameter training, and fine-tuning. Proven track record applying RL …
Employment Type: Permanent

Lead, Vision-Language-Action VLA, Behaviour Learning - Hybrid

London, United Kingdom
Hybrid / WFH Options
Skillsbay Limited
Role: Lead, Vision-Language-Action (VLA)/Behaviour Learning. About the Client: Our client is a pioneering robotics startup developing the world's most advanced, reliable, and commercially scalable humanoid robots. Their mission is to create safe, next-generation robots that integrate seamlessly into daily life and amplify human capacity. Their first robot, HMND 01, is designed for industrial automation … understand, and act in complex real-world environments. The role combines cutting-edge AI research with practical deployment in robotics. What You'll Do: Define and drive strategy for representation learning, behaviour cloning, and reinforcement learning (RL). Lead large-scale training of multi-modal LLM/VLM/VLA systems integrating inputs such as vision, audio, proprioception … optimise models for real-time deployment. Hire, mentor, and lead a high-calibre team of research scientists and engineers. What We're Looking For: 6+ years' experience building deep learning systems, including 2+ years in technical team leadership. Hands-on expertise with LLM/VLM architecture design, billion-parameter training, and fine-tuning. Proven track record applying RL …
Employment Type: Permanent
Salary: GBP Annual

Lead, Vision-Language-Action VLA, Behaviour Learning - Hybrid

London, South East England, United Kingdom
Hybrid / WFH Options
Skillsbay Limited
Role: Lead, Vision-Language-Action (VLA)/Behaviour Learning. About the Client: Our client is a pioneering robotics startup developing the world's most advanced, reliable, and commercially scalable humanoid robots. Their mission is to create safe, next-generation robots that integrate seamlessly into daily life and amplify human capacity. Their first robot, HMND 01, is designed for industrial automation … understand, and act in complex real-world environments. The role combines cutting-edge AI research with practical deployment in robotics. What You'll Do: Define and drive strategy for representation learning, behaviour cloning, and reinforcement learning (RL). Lead large-scale training of multi-modal LLM/VLM/VLA systems integrating inputs such as vision, audio, proprioception … optimise models for real-time deployment. Hire, mentor, and lead a high-calibre team of research scientists and engineers. What We're Looking For: 6+ years' experience building deep learning systems, including 2+ years in technical team leadership. Hands-on expertise with LLM/VLM architecture design, billion-parameter training, and fine-tuning. Proven track record applying RL …

Lead, Vision-Language-Action VLA, Behaviour Learning - Hybrid

West London, South East England, United Kingdom
Hybrid / WFH Options
Skillsbay Limited
Role: Lead, Vision-Language-Action (VLA)/Behaviour Learning. About the Client: Our client is a pioneering robotics startup developing the world's most advanced, reliable, and commercially scalable humanoid robots. Their mission is to create safe, next-generation robots that integrate seamlessly into daily life and amplify human capacity. Their first robot, HMND 01, is designed for industrial automation … understand, and act in complex real-world environments. The role combines cutting-edge AI research with practical deployment in robotics. What You'll Do: Define and drive strategy for representation learning, behaviour cloning, and reinforcement learning (RL). Lead large-scale training of multi-modal LLM/VLM/VLA systems integrating inputs such as vision, audio, proprioception … optimise models for real-time deployment. Hire, mentor, and lead a high-calibre team of research scientists and engineers. What We're Looking For: 6+ years' experience building deep learning systems, including 2+ years in technical team leadership. Hands-on expertise with LLM/VLM architecture design, billion-parameter training, and fine-tuning. Proven track record applying RL …

Python Developer

Glasgow, Scotland, United Kingdom
Hybrid / WFH Options
Venesky Brown
programming, code reviews, system design and requirements analysis/refinement, etc. - Coaching and mentoring other team members, as appropriate. Essential Skills: - OCR, Object Detection and LLM analysis implementation - Machine Learning & AI Libraries including Transformers/Hugging Face for working with pre-trained LLMs, fine-tuning, and inference; PyTorch for deep learning model development and training; OpenCV for computer … Desirable Skills: - Custom model architecture design and implementation - Advanced fine-tuning techniques including LoRA, QLoRA, and parameter-efficient methods - Multi-modal AI systems combining text, image, and structured data - Reinforcement Learning from Human Feedback (RLHF) for model alignment - Apache Airflow/Dagster for ML workflow orchestration and ETL pipeline management - Model versioning and experiment tracking (MLflow, Weights & Biases …

Python Developer

Paisley, Central Scotland, United Kingdom
Hybrid / WFH Options
Venesky Brown
programming, code reviews, system design and requirements analysis/refinement, etc. - Coaching and mentoring other team members, as appropriate. Essential Skills: - OCR, Object Detection and LLM analysis implementation - Machine Learning & AI Libraries including Transformers/Hugging Face for working with pre-trained LLMs, fine-tuning, and inference; PyTorch for deep learning model development and training; OpenCV for computer … Desirable Skills: - Custom model architecture design and implementation - Advanced fine-tuning techniques including LoRA, QLoRA, and parameter-efficient methods - Multi-modal AI systems combining text, image, and structured data - Reinforcement Learning from Human Feedback (RLHF) for model alignment - Apache Airflow/Dagster for ML workflow orchestration and ETL pipeline management - Model versioning and experiment tracking (MLflow, Weights & Biases …

Python Developer

Milton, Central Scotland, United Kingdom
Hybrid / WFH Options
Venesky Brown
programming, code reviews, system design and requirements analysis/refinement, etc. - Coaching and mentoring other team members, as appropriate. Essential Skills: - OCR, Object Detection and LLM analysis implementation - Machine Learning & AI Libraries including Transformers/Hugging Face for working with pre-trained LLMs, fine-tuning, and inference; PyTorch for deep learning model development and training; OpenCV for computer … Desirable Skills: - Custom model architecture design and implementation - Advanced fine-tuning techniques including LoRA, QLoRA, and parameter-efficient methods - Multi-modal AI systems combining text, image, and structured data - Reinforcement Learning from Human Feedback (RLHF) for model alignment - Apache Airflow/Dagster for ML workflow orchestration and ETL pipeline management - Model versioning and experiment tracking (MLflow, Weights & Biases …

Lead AI Engineer New Remote, UK

United Kingdom
Hybrid / WFH Options
Prolific
FastAPI, Django). You have a strong command of system design, infrastructure, and CI/CD pipelines, and you're comfortable taking a feature from concept to deployment. Machine Learning Engineering & Research: A comprehensive background in machine learning, with deep experience in ML frameworks (e.g., PyTorch, TensorFlow, Hugging Face) and a strong grasp of the underlying theory behind … LangChain, LlamaIndex, DSPy, LangGraph). Experience in agentic system design and tool calling (ADK, A2A, MCP, etc.). Prior work on human-in-the-loop systems, data annotation platforms, or reinforcement learning from human feedback (RLHF). An interest in or experience with synthetic data, human-in-the-loop systems, or AI alignment. Experience building robust monitoring and observability for …
Employment Type: Permanent
Salary: GBP Annual

Staff Machine Learning Engineer

Ireland
Hybrid / WFH Options
Genesys
together. Location: We are open to this person working hybrid from our office in Galway or remote within Ireland. Description: The Conversational AI group are seeking a Staff Machine Learning Engineer to join the team and drive the innovation and evolution of our Conversational AI solutions with a focus on Generative AI systems at global scale (reaching over … problem-solving skills and ability to work in a fast-paced, collaborative environment. Independent judgment in developing novel approaches to challenging AI problems. Desirable Experience or an Interest in Learning: Experience with Large Language Models (LLMs) and their application in conversational systems. Hands-on experience with fine-tuning and prompt engineering. Knowledge of multi-agent systems/frameworks and … agentic AI architectures. Familiarity with voice AI technologies and multimodal conversational interfaces. Experience with MLOps and deploying AI systems at scale. Background in reinforcement learning for dialogue optimization. Understanding of enterprise AI governance and responsible AI practices. Interest in emerging areas, such as tool-using AI agents and autonomous decision-making systems. Experience with cloud platforms (particularly AWS …
Employment Type: Permanent
Salary: EUR 125,000 - 150,000 Annual

Senior Machine Learning Engineer with Security Clearance

Hampton, Virginia, United States
Hybrid / WFH Options
Iron EagleX, Inc
solutions that empower organizations and end users to operate smarter, faster, and more securely in dynamic environments. Responsibilities Job Description: Iron EagleX is seeking a highly skilled Senior Machine Learning Engineer to join our dynamic team in Crystal City, VA. The ideal candidate will specialize in AI-driven automation, prompt engineering, and machine learning model optimization to deliver … project needs. Job Duties Include (but are not limited to): Develop, optimize, and implement AI workflows with a strong focus on prompt engineering and automation. Design and fine-tune machine learning models to enhance performance and effectiveness. Research and apply best practices for large language model (LLM) utilization, including prompt orchestration and fine-tuning techniques. Develop and integrate AI agents … into data workflows, leveraging state-of-the-art frameworks. Apply advanced prompt engineering techniques such as few-shot learning, chain-of-thought reasoning, self-consistency, and retrieval-augmented generation. Utilize frameworks like LangChain and DSPy to design, test, and optimize prompts. Address token limitations, cost trade-offs, and model-specific constraints across various LLMs (ChatGPT, Claude, Llama, Mistral, etc. …
Employment Type: Permanent
Salary: USD Annual

Senior Data Scientist

London, United Kingdom
Hybrid / WFH Options
ECM Selection (Holdings) Limited
experimental, and it is understood that not all projects succeed; even failed projects contain valuable insights. You will be building upon cutting-edge ML techniques such as transformers and reinforcement learning to create novel multi-modal solutions. Examples include sensor fusion systems, physics-informed neural networks for simulations, and multi-purpose autonomous robots. Projects will be defence focused … surrounding area. Initially this is an 18-month contract with the expectation of extending this as more funding is released. Keywords: AI, ML, RF, EM, GNN, Transformer, Autoencoder, Reinforcement Learning, Multi-Modal AI, Sensor Fusion, Python, PyTorch, Radio Frequency, RF. Another top job from ECM, the high-tech recruitment experts. Even if this job's not quite right, do …
Employment Type: Permanent
Salary: £50000 - £60000/annum DoE + Benefits

AI Architect

London Area, United Kingdom
Hybrid / WFH Options
Omnis Partners
and build strong partnerships with major tech vendors. 🛠️ What We Need: Proven experience designing and deploying agentic AI systems in production. Strong understanding of AI/ML techniques including reinforcement learning, LLM agents, and simulations. Proficient in Python and Infrastructure as Code for cloud environments. Familiarity with AI orchestration platforms like LangChain, AutoGen, CrewAI, or similar. Experience supporting …

AI Architect

City of London, London, United Kingdom
Hybrid / WFH Options
Omnis Partners
and build strong partnerships with major tech vendors. 🛠️ What We Need: Proven experience designing and deploying agentic AI systems in production. Strong understanding of AI/ML techniques including reinforcement learning, LLM agents, and simulations. Proficient in Python and Infrastructure as Code for cloud environments. Familiarity with AI orchestration platforms like LangChain, AutoGen, CrewAI, or similar. Experience supporting …

LLM Researcher

London, South East, England, United Kingdom
Hybrid / WFH Options
MicroTECH Global Ltd
and regulatory requirements in fintech (SOC2, PCI-DSS, GDPR). Ability to thrive in a fast-moving startup environment. Desirables: Background in fintech, payments, or treasury systems. Experience with reinforcement learning from human feedback (RLHF).
Employment Type: Full-Time
Salary: Salary negotiable

Business Co-Founder (Stealth AI Startup)

Düsseldorf, Nordrhein-Westfalen, Germany
Hybrid / WFH Options
Stealth AI Germany
We are a stealth-mode AI startup founded by a team of PhDs with deep expertise in machine learning, NLP, and reinforcement learning. Backed by advisors from top global research institutions, we've built our MVP and are preparing to launch our first enterprise pilot. We're now looking for a Business Co-Founder to join the founding …
Employment Type: Permanent
Salary: EUR Annual

Reinforcement Learning
Work from Home
10th Percentile: £79,085
25th Percentile: £90,801
Median: £150,000
75th Percentile: £175,000