Want to build speech AI that actually sounds human?
You'll be joining a well-funded speech AI startup with strong customer traction. They're building ultra-realistic voice technology that handles natural laughter, breathing, seamless language switching, and accurate pronunciation across languages and accents.
As a Staff Research Engineer, you'll work hands-on to expand their foundation models and push the boundaries of what's possible in speech AI: exploring multilingual capabilities, long-context generation, full-duplex modeling for natural conversations with interruptions, and novel architectures that balance speed with control.
What you'll do
- Conduct research to advance their core speech models and extend product capabilities
- Develop and experiment with new model architectures and training approaches
- Work on large-scale model training and data systems
- Collaborate with the team to take research from concept to deployed systems
What you'll bring
- 3+ years of experience in speech synthesis, audio generation, or generative modeling
- Experience with audio generation using LLMs
- Solid background in modern language model architectures
- Proven ability to ship research into production systems
- Experience training large-scale models
Nice to have
- Published research in speech or generative modeling
- Experience with real-time speech systems or multimodal models
Ideally you'll be based in San Francisco, but remote candidates worldwide will also be considered. Compensation is up to $250K base, depending on experience, plus equity.
| Location: | San Francisco, CA |
|---|---|
| Job type: | Permanent |
| Emp type: | Full-time |
| Salary type: | Annual |
| Salary: | Negotiable |
| Job published: | 23/12/2025 |
| Job ID: | 34579 |