Research Engineer, Machine Learning (Horizons)
London, UK
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
As a Research Engineer on the Reinforcement Learning Fundamentals team, you will collaborate with a diverse group of researchers and engineers to advance the capabilities and safety of large language models. The work spans fundamental research in reinforcement learning, improving reasoning abilities in areas such as code generation and mathematics, and exploring reinforcement learning for agentic and open-ended tasks.
Responsibilities:
- Develop and implement novel reinforcement learning techniques to improve the performance and safety of large language models.
- Create tools and environments for models to interact with, enabling them to perform complex, open-ended tasks.
- Design and run experiments to enhance models' reasoning capabilities, particularly in code generation and mathematics.
You may be a good fit if you:
- Have 5+ years of industry-related experience.
- Are proficient in Python and have experience with deep learning frameworks such as PyTorch or JAX.
- Have a strong software engineering background and are interested in working closely with researchers and engineers.
- Enjoy pair programming.
- Care about code quality, testing, and performance.
- Are passionate about AI's potential impact and committed to developing safe, beneficial systems.
- Have a background in machine learning, reinforcement learning, or high-performance computing.
Strong candidates may also have:
- Experience with virtualization and sandboxed code environments.
- Experience with Kubernetes.
- Contributions to open-source projects or relevant published research.
- Formal certifications or educational credentials.
- Prior experience with LLMs or machine learning research.
Deadline to apply: None. Applications will be reviewed on a rolling basis.
The expected salary range for this position is:
Education requirements: Bachelor’s degree in a related field or equivalent experience.
Location-based hybrid policy: Currently, all staff are expected to be in the office at least 25% of the time, with some roles requiring more.
Visa sponsorship: We sponsor visas! We will make every effort to assist with visa processes if we make an offer.
We encourage you to apply even if you do not meet every qualification. Diversity and representation are important to us, and we value different perspectives in our team.
We believe the highest-impact AI research is big science: large-scale efforts that have as much in common with empirical sciences like physics and biology as with traditional computer science. We value collaboration and communication, and we host frequent research discussions to make sure we are pursuing the most impactful work.
Our research continues directions our team worked on before Anthropic, including GPT-3, interpretability, multimodal neurons, scaling laws, AI & compute, safety, and human preferences.
Anthropic is headquartered in San Francisco, offering competitive compensation, benefits, equity donation matching, generous leave, flexible hours, and a collaborative office environment.
- Company: Alcides Fonseca
- Location: London, UK (Hybrid / WFH Options)
- Employment Type: Full-time
- Posted