Job Description
Want to own the data infrastructure behind some of the most naturalistic voice models in production?
You'll be joining a well-funded speech AI startup that just closed their Series A, with strong enterprise traction and revenue that more than doubled last quarter. They're building ultra-realistic voice technology that handles natural laughter, breathing, seamless language switching, and accurate pronunciation across languages and accents. Their models power hundreds of millions of conversations monthly.
Before training a single model, they built their own corpus: full-duplex, studio-quality conversational speech annotated by PhD linguists. As their Machine Learning Engineer, you'll own the pipelines that turn that raw material into clean, training-ready data.
What you'll do
- Own end-to-end data pipelines from raw audio ingestion through to versioned, training-ready datasets
- Build quality systems that catch annotation errors and alignment issues before they reach a training run
- Maintain the training infrastructure that keeps GPUs fed: dataloaders, streaming datasets, and multi-modal batching
- Build and iterate on tooling across speech representations, including neural codecs, semantic tokens, and mel features
- Handle full- and half-duplex pipeline work, including two-channel alignment and overlap handling
What you'll bring
- Strong engineering fundamentals with experience building ML data pipelines at scale
- Hands-on experience with speech or audio data
- Solid understanding of speech representations and the tradeoffs between them
- Experience with multi-channel audio data, including diarisation and alignment
Nice to have
- Experience with multilingual data pipelines
- Large-scale training infrastructure experience (FSDP, DeepSpeed, Ray)
- Annotation tooling and human-in-the-loop systems
Remote-friendly. Competitive base plus stock.