Senior ML Engineer

Company Description

Voice-Swap is building the future of AI voice technology for the creative industries - with ethics, artist partnership, and cutting-edge engineering at the core. We work directly with musicians, voice-over artists, and media partners to develop ethically licensed, production-grade AI voice models with uncompromising speaker likeness and perceptual quality.

We are now looking for a Senior Machine Learning Engineer (Speech AI) to help us push high-fidelity speech synthesis and voice conversion systems to production scale. As an early-stage, fast-moving company, we value people who take ownership, move quickly, and are comfortable operating with both autonomy and responsibility.

Learn more at https://www.voice-swap.ai.

Role Description

This is a full-time remote role for a Senior Machine Learning Engineer at Voice-Swap.

You will:

Implement neural speech synthesis models, prioritising speaker likeness and naturalness
Write model inference API scripts for product deployment
Write scripts for data preprocessing and model evaluation
Work directly with clients on text-to-speech and/or voice conversion model projects
Script and support professional voiceover data collection sessions
Reimplement and adapt architectures from scientific papers into production-ready systems
Contribute to improving training efficiency and deployment performance

This role requires someone comfortable moving between research papers, GPU training runs, and production APIs.

Qualifications

Solid understanding of the fundamental concepts of Machine Learning and Deep Learning (Transformers, CNNs, RNNs)
Strong grounding in mathematics, audio signal processing, speech processing, or NLP
Experience with ML frameworks (PyTorch or TensorFlow)
Experience training and deploying models on cloud services (AWS, GCP, etc.)
Experience reimplementing architectures from scientific papers
Comfortable with Git & GitHub workflows
Strong software engineering discipline and attention to reproducibility

Bonus Skills

Experience in speech synthesis (text-to-speech and/or voice conversion)
Training and inference optimisation (e.g., quantisation techniques)
MS or PhD in Computer Science or Machine Learning, or 3+ years of relevant experience
Publications in top-tier speech / NLP / signal processing conferences (Interspeech, ICASSP, ASRU, SLT, EUSIPCO, ACL, etc.)
Music production or audio engineering experience

Who Thrives Here

You enjoy working in a startup environment where priorities can evolve quickly
You are proactive and don’t wait to be told what to do
You are comfortable owning problems from research to production
You care about audio quality and technical excellence
You’re collaborative, reliable, and enjoyable to work with

Note: With your CV, please include brief information about your proudest project (GitHub repo, arXiv paper link, or a short description).
