Reinforcement Learning Jobs in Watford

Software Engineer - Large Language Models

Hiring Organisation: Fastino Labs
Location: Watford, Hertfordshire, UK
Employment Type: Full-time

… overall performance metrics
Architect data processing pipelines, implementing filtering, balancing, and captioning systems to ensure training data quality across diverse content categories
Implement reinforcement learning techniques including Direct Preference Optimization and Generalized Reward Preference Optimization to align model outputs with human preferences and quality standards
Build robust …
Required - Great velocity for building and shipping agents/AI products.
Optional - Advanced degree (Master's or PhD) in Computer Science, Artificial Intelligence, Machine Learning, or related technical discipline with concentrated study in deep learning and computer vision methodologies
Optional - Demonstrated ability to do independent research in Academic ...
