13 of 13 Reinforcement Learning Jobs in the South West

Software Engineer - Large Language Models

Hiring Organisation
Fastino Labs
Location
Swindon, UK
Employment Type
Full-time
overall performance metrics Architect data processing pipelines, implementing filtering, balancing, and captioning systems to ensure training data quality across diverse content categories Implement reinforcement learning techniques including Direct Preference Optimization and Generalized Reward Preference Optimization to align model outputs with human preferences and quality standards Build robust … Required - Great velocity for building and shipping agents/AI products. Optional - Advanced degree (Master's or PhD) in Computer Science, Artificial Intelligence, Machine Learning, or related technical discipline with concentrated study in deep learning and computer vision methodologies Optional - Demonstrated ability to do independent research in Academic ...

Software Engineer - Large Language Models

Hiring Organisation
Fastino Labs
Location
Bristol, UK
Employment Type
Full-time
overall performance metrics Architect data processing pipelines, implementing filtering, balancing, and captioning systems to ensure training data quality across diverse content categories Implement reinforcement learning techniques including Direct Preference Optimization and Generalized Reward Preference Optimization to align model outputs with human preferences and quality standards Build robust … Required - Great velocity for building and shipping agents/AI products. Optional - Advanced degree (Master's or PhD) in Computer Science, Artificial Intelligence, Machine Learning, or related technical discipline with concentrated study in deep learning and computer vision methodologies Optional - Demonstrated ability to do independent research in Academic ...

Software Engineer - Large Language Models

Hiring Organisation
Fastino Labs
Location
Gloucester, Gloucestershire, UK
Employment Type
Full-time
overall performance metrics Architect data processing pipelines, implementing filtering, balancing, and captioning systems to ensure training data quality across diverse content categories Implement reinforcement learning techniques including Direct Preference Optimization and Generalized Reward Preference Optimization to align model outputs with human preferences and quality standards Build robust … Required - Great velocity for building and shipping agents/AI products. Optional - Advanced degree (Master's or PhD) in Computer Science, Artificial Intelligence, Machine Learning, or related technical discipline with concentrated study in deep learning and computer vision methodologies Optional - Demonstrated ability to do independent research in Academic ...

Software Engineer - Large Language Models

Hiring Organisation
Fastino Labs
Location
Plymouth, Devon, UK
Employment Type
Full-time
overall performance metrics Architect data processing pipelines, implementing filtering, balancing, and captioning systems to ensure training data quality across diverse content categories Implement reinforcement learning techniques including Direct Preference Optimization and Generalized Reward Preference Optimization to align model outputs with human preferences and quality standards Build robust … Required - Great velocity for building and shipping agents/AI products. Optional - Advanced degree (Master's or PhD) in Computer Science, Artificial Intelligence, Machine Learning, or related technical discipline with concentrated study in deep learning and computer vision methodologies Optional - Demonstrated ability to do independent research in Academic ...

Software Engineer - Large Language Models

Hiring Organisation
Fastino Labs
Location
Bath, Somerset, UK
Employment Type
Full-time
overall performance metrics Architect data processing pipelines, implementing filtering, balancing, and captioning systems to ensure training data quality across diverse content categories Implement reinforcement learning techniques including Direct Preference Optimization and Generalized Reward Preference Optimization to align model outputs with human preferences and quality standards Build robust … Required - Great velocity for building and shipping agents/AI products. Optional - Advanced degree (Master's or PhD) in Computer Science, Artificial Intelligence, Machine Learning, or related technical discipline with concentrated study in deep learning and computer vision methodologies Optional - Demonstrated ability to do independent research in Academic ...

Software Engineer - Large Language Models

Hiring Organisation
Fastino Labs
Location
Bournemouth, Dorset, UK
Employment Type
Full-time
overall performance metrics Architect data processing pipelines, implementing filtering, balancing, and captioning systems to ensure training data quality across diverse content categories Implement reinforcement learning techniques including Direct Preference Optimization and Generalized Reward Preference Optimization to align model outputs with human preferences and quality standards Build robust … Required - Great velocity for building and shipping agents/AI products. Optional - Advanced degree (Master's or PhD) in Computer Science, Artificial Intelligence, Machine Learning, or related technical discipline with concentrated study in deep learning and computer vision methodologies Optional - Demonstrated ability to do independent research in Academic ...

Lead AI Engineer

Hiring Organisation
Akixi
Location
Swindon, UK
Employment Type
Full-time
similar conversational-AI platforms. Deep understanding of prompt engineering and fine-tuning of large language models. Strong grounding in ML concepts — supervised, unsupervised, and reinforcement learning. Familiarity with cloud AI/ML services (e.g. Azure Cognitive Services, AWS SageMaker, and/or GCP Vertex AI). Experience deploying ...

Lead AI Engineer

Hiring Organisation
Akixi
Location
Bristol, UK
Employment Type
Full-time
similar conversational-AI platforms. Deep understanding of prompt engineering and fine-tuning of large language models. Strong grounding in ML concepts — supervised, unsupervised, and reinforcement learning. Familiarity with cloud AI/ML services (e.g. Azure Cognitive Services, AWS SageMaker, and/or GCP Vertex AI). Experience deploying ...

Lead AI Engineer

Hiring Organisation
Akixi
Location
Cheltenham, Gloucestershire, UK
Employment Type
Full-time
similar conversational-AI platforms. Deep understanding of prompt engineering and fine-tuning of large language models. Strong grounding in ML concepts — supervised, unsupervised, and reinforcement learning. Familiarity with cloud AI/ML services (e.g. Azure Cognitive Services, AWS SageMaker, and/or GCP Vertex AI). Experience deploying ...

Lead AI Engineer

Hiring Organisation
Akixi
Location
Plymouth, Devon, UK
Employment Type
Full-time
similar conversational-AI platforms. Deep understanding of prompt engineering and fine-tuning of large language models. Strong grounding in ML concepts — supervised, unsupervised, and reinforcement learning. Familiarity with cloud AI/ML services (e.g. Azure Cognitive Services, AWS SageMaker, and/or GCP Vertex AI). Experience deploying ...

Lead AI Engineer

Hiring Organisation
Akixi
Location
Bath, Somerset, UK
Employment Type
Full-time
similar conversational-AI platforms. Deep understanding of prompt engineering and fine-tuning of large language models. Strong grounding in ML concepts — supervised, unsupervised, and reinforcement learning. Familiarity with cloud AI/ML services (e.g. Azure Cognitive Services, AWS SageMaker, and/or GCP Vertex AI). Experience deploying ...

Lead AI Engineer

Hiring Organisation
Akixi
Location
Bournemouth, Dorset, UK
Employment Type
Full-time
similar conversational-AI platforms. Deep understanding of prompt engineering and fine-tuning of large language models. Strong grounding in ML concepts — supervised, unsupervised, and reinforcement learning. Familiarity with cloud AI/ML services (e.g. Azure Cognitive Services, AWS SageMaker, and/or GCP Vertex AI). Experience deploying ...

Software Engineer (Applied AI)

Hiring Organisation
Euphoric
Location
Bournemouth, Dorset, UK
Employment Type
Full-time
iteration of our next-generation benefits platform features that leverage personalization, experimentation, and AI/ML methods (e.g. agents/LLMs, recommender systems, reinforcement learning) to enhance user experience in a meaningful business domain. Contribute across the tech stack: You'll work in React (JavaScript/TypeScript … against important business goals that help the entire team win Pragmatic Best Practices: An overarching desire to build efficient, scalable, and maintainable code, while learning the tradeoffs between technical debt and delivery speed What we look for: We're a great bunch but we have some "Euph" cultural ...