Job title: Staff Research Scientist
Job type: Permanent
Emp type: Full-time
Industry: Generative AI
Functional Expertise: Gen-Speech/TTS Speech-to-Speech
Salary type: Annual
Salary: negotiable
Location: Bay Area
Job published: 16/04/2025
Job ID: 33086

Job Description

Do you want to create emotionally expressive AI that transforms healthcare conversations?

A pioneering healthtech unicorn is building AI digital health agents designed to safely and empathetically assist patients. Their immediate focus is developing conversational AI with genuine emotional intelligence, with a longer-term vision for full-duplex communication capabilities.

As the Staff Research Scientist, you'll play a key part in making this a reality - building foundational speech models that understand and respond with the human-like emotion and natural conversational flow that healthcare demands.

What you'll do

  • Design and develop emotionally expressive speech models for healthcare conversations, working end-to-end from research through to productionizing models
  • Build conversational AI systems that can interpret and respond with appropriate emotional intelligence
  • Work on post-training techniques to enhance speech models' conversational and emotional capabilities
  • Tackle unique challenges including response time optimization, maintaining emotional consistency, and operating in noisy healthcare environments
  • Have the opportunity to publish your groundbreaking research

What you'll bring

  • 5+ years of experience in speech technologies or a related field
  • Hands-on experience with speech-to-speech systems (highly preferred), or strong experience in Text-to-Speech, Speech LLMs, emotional/expressive speech synthesis, or similar
  • Experience training models on large speech datasets
  • Ability to implement research papers from scratch

Bonus points for

  • Experience pre-training foundation models with speech (HuBERT, Wav2Vec, or similar)
  • Multimodal experience
  • Experience with inference technologies (vLLM, CUDA)

You'll be based in the Bay Area or be willing to relocate. You'll receive highly competitive comp (up to $350K base DOE) with substantial equity.

If you're excited about creating the next generation of emotionally intelligent speech AI that will revolutionize healthcare communication, click apply!