Your search has found 2 jobs

Do you want to create AI that converses as naturally as humans do? A pioneering healthtech unicorn is building AI digital health agents designed to safely and empathetically assist patients.

As the Staff Research Scientist, you'll play a key part in making this a reality - building end-to-end foundational speech models capable of full-duplex communication. This isn't just about taking turns speaking; it's about creating AI that can listen and respond simultaneously, with the human-like conversation, emotion, and natural language that healthcare demands.

What you'll do

  • Design and develop novel speech foundation models for healthcare conversations, working end-to-end from research through to productionizing models
  • Work on post-training LLMs for speech to enhance their conversational capabilities
  • Tackle unique challenges including response time optimization, maintaining alignment between text and speech outputs, and operating in noisy environments
  • Create innovative approaches to synthetic conversational data generation
  • Have the opportunity to publish your groundbreaking research

What you'll bring

  • PhD with 8+ years in speech technologies or a related field
  • Experience with Speech LLMs
  • Experience training models on large datasets
  • Strong publication record at top-tier conferences in speech/multimodal AI
  • Ability to implement research papers from scratch

Bonus points for

  • Experience pre-training foundation models with speech (HuBERT, Wav2Vec, or similar)
  • Multimodal experience
  • Experience with inference technologies (vLLM, CUDA)

You'll be based in the Bay Area and will receive highly competitive comp (up to $350K base DOE) with substantial equity.

If you're excited about creating the next generation of speech AI that will revolutionize healthcare communication, click apply!

Location: Bay Area
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 16/04/2025
Job ID: 33086

The future of communication should be bias- and barrier-free. That's the vision behind this well-funded start-up pioneering real-time speech algorithms.
 
You'll join a research team working on first-of-its-kind technology that improves how we communicate in the real world by offering clear, natural-sounding conversations regardless of accent or environment. Their ground-breaking technology is already delivering impressive results, so it's no wonder they're growing 4x annually.
 
They are creating the whitespace in speech research, and you'll play a key role.
 
As a Senior ML Scientist, you'll work within a talented R&D team advancing core speech algorithms and audio AI models.
 
The role
 
- Contribute to cutting-edge R&D advancing core speech algorithms and Generative Audio models. Continually push boundaries to the next level.
- Tackle unsolved problems in Generative Speech and Audio such as preserving naturalness and performance in noisy environments.
- Take end-to-end ownership of models, from data collection to training on the cloud.
- Develop novel architectures that balance cutting-edge performance with real-time efficiency and low latency.
- Collaborate with top scientists in this field.
 
You'll have
 
- 4+ years of industry experience developing and implementing one or more of the following: TTS, Voice Conversion/Cloning, Speech Synthesis, Speech Translation, Accent Translation, Speech Restoration
- Proven background contributing to well-known research publications and/or products in these areas
- PhD or degree in Computer Science, ML, or a related field.
- Proven experience with PyTorch, TensorFlow and modern DL techniques such as GANs, VAEs, diffusion or flow models, etc.
- Familiar with cloud-based technologies and production environments
 
What you'll get in return
 
- Benefits include a competitive salary, share options, unlimited PTO, health coverage, and a VPO plan.
- Contributing to whitespace in speech technology research, you'll have control over the direction of your work with no friction whatsoever. You're the expert, after all.
 
If you're looking to make an impact, there are few better places to do it. Your work here has the power to improve communication, eliminate confusion, and create a more connected world.
 
If you want the freedom to shape the future of speech AI, apply now.

Location: Bay Area
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: USD $300,000.00
Job published: 26/02/2025
Job ID: 32723