Job Description
Do you want to create AI that converses as naturally as humans do? A pioneering healthtech unicorn is building AI digital health agents designed to safely and empathetically assist patients.
As the Staff Research Scientist, you'll play a key part in making this a reality by building end-to-end foundational speech models capable of full-duplex communication. This isn't just about taking turns speaking; it's about creating AI that can listen and respond simultaneously, with the human-like conversation, emotion, and natural language that healthcare demands.
What you'll do
- Design and develop novel speech foundation models for healthcare conversations, working end-to-end from research through to production
- Post-train speech LLMs to enhance their conversational capabilities
- Tackle unique challenges including response time optimization, maintaining alignment between text and speech outputs, and operating in noisy environments
- Create innovative approaches to synthetic conversational data generation
- Have the opportunity to publish your groundbreaking research
What you'll bring
- PhD with 8+ years of experience in speech technologies or a related field
- Experience with Speech LLMs
- Experience training models on large datasets
- Strong publication record at top-tier conferences in speech/multimodal AI
- Ability to implement research papers from scratch
Bonus points for
- Experience pre-training foundation models with speech (HuBERT, Wav2Vec, or similar)
- Multimodal experience
- Experience with inference technologies (vLLM, CUDA)
You'll be based in the Bay Area and will receive highly competitive compensation (up to $350K base, depending on experience) with substantial equity.
If you're excited about creating the next generation of speech AI that will revolutionize healthcare communication, click apply!