Job Description
Looking to tackle novel speech challenges at scale?
You'll be joining a small but mighty speech AI company building proprietary speech tech from the ground up. The company has a strong customer base, so your research will directly impact production systems serving enterprise customers, and you'll see your work deployed at scale in real-world voice applications.
They're a well-funded startup with healthy revenue streams and immediate opportunities for high-impact research.
Your research
You'll be working on breakthrough speech research that pushes the boundaries of naturalness and real-time performance. The company has achieved ultra-low latency and is now advancing toward unified speech-to-speech architectures.
You'll develop emotional expression and natural speech generation, advance multilingual support across 30+ languages, and enhance voice cloning robustness.
Your focus
- Lead cutting-edge research in SOTA speech models (TTS, ASR, or speech-to-speech)
- Design, execute and iterate on experiments end-to-end
- Drive speech controllability and naturalness improvements
- Develop evaluation methodologies for speech quality assessment
What you'll bring
- Deep understanding of cutting-edge speech models with end-to-end pipeline experience
- Experience with large-scale model training
- Strong background in speech model development and optimisation
- Published work with demonstrable results in industry or academic settings
Nice to have
- Performance optimisation experience for latency and compute efficiency
- Experience with model fusion and unified architectures
This is a remote role, based in either the US or Europe. Competitive comp based on experience.