
Ready to pioneer the speech intelligence behind the next generation of embodied AI?

Join a pioneering startup developing foundational technology for natural conversation in embodied agents. You'll advance the speech systems that power avatars with authentic behaviours, real-time expression, and conversational intelligence that handles interruptions and turn-taking just like humans.

This Lead Research Scientist role focuses on advancing real-time speech systems for interactive avatars. You'll develop full-duplex dialogue models and speech-to-speech architectures that enable natural conversational flow, interruption handling, and emotional expression.

Founded by ex-Googlers, the company is building proprietary behaviour models that learn from two-way interactions, so that speech timing, prosody, and contextual responses work in harmony with facial expressions and physical behaviours to drive authentic embodied intelligence.

Your focus:

  • Research & develop full-duplex speech systems with natural interruption handling
  • Develop expressive voice models with controllable prosody and timing
  • Build speech-to-speech architectures preserving identity and emotion
  • Create real-time audio generation systems for conversational avatars
  • Publish research while deploying systems in production
  • Collaborate across teams integrating speech with visual behaviour

Requirements:

  • PhD in Speech, Machine Learning, or related field
  • First-author publications at top conferences (Interspeech, ICASSP, NeurIPS, ICLR, etc.)
  • Expertise in text-to-speech, speech-to-speech models, or voice cloning
  • Large-scale training experience
  • Experience in prosody modelling or real-time audio generation

Nice to have:

  • Experience with full-duplex speech research
  • Speech-visual alignment expertise (lip sync, expressions)
  • Real-time audio deployment optimisation

Package:

  • Competitive salary: $200k-$300k base (based on experience)
  • Meaningful equity package
  • Comprehensive healthcare (90% covered)
  • Unlimited PTO
  • Fully remote work with regular team offsites
  • Life insurance and disability coverage

Location: Fully remote, worldwide, with a preference for Pacific Time alignment.

Ready to make AI conversations feel authentically human?

Contact Allys at Techire AI. All applicants will receive a response.

 
Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 02/07/2025
Job ID: 33482

Ready to architect the future of human-computer voice interaction?

Join a well-established conversational AI company as it transitions from traditional cascaded speech systems to cutting-edge end-to-end (E2E) speech-to-speech technology. You'll lead this transformation, building multimodal systems that will redefine how millions interact with AI.

The opportunity

You'll be developing technology that directly impacts real users at massive scale. The company processes millions of daily interactions across major enterprise clients, meaning your research will shape real-world conversational experiences.

You'll spearhead the development of proprietary full-duplex speech systems, creating truly natural AI conversations that go far beyond current capabilities.

Your impact

  • Design and build next-generation speech language models from the ground up
  • Drive breakthroughs in speech-to-speech modeling and full-duplex conversation systems
  • Tackle turn-taking, interruption handling, and simultaneous speech processing
  • Bridge cutting-edge research with enterprise-grade production systems
  • Lead a growing team focused on speech-to-speech breakthroughs

What you'll bring

  • Deep understanding of SOTA speech models and neural audio processing
  • Experience building speech language models/multimodal systems
  • Strong background in speech AI research and modern speech architectures

With the company's established market position and proven track record, you'll have the resources and real-world testing ground to make a transformative impact with your research.

The company has built everything in-house, giving you complete technical control and the freedom to explore any approach that delivers value.

Location

Remote (must be close to EU time zones)

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 28/06/2025
Job ID: 33350

Do you want to create emotionally expressive AI that transforms healthcare conversations?

A pioneering healthtech unicorn is building AI digital health agents designed to safely and empathetically assist patients. Their immediate focus is developing conversational AI with genuine emotional intelligence, with a longer-term vision for full-duplex communication capabilities.

As the Staff Research Scientist, you'll play a key part in making this a reality, building foundational speech models that understand and respond with the human-like emotion and natural conversational flow that healthcare demands.

What you'll do

  • Design and develop emotionally expressive speech models for healthcare conversations, working end-to-end from research through to productionizing models
  • Build conversational AI systems that can interpret and respond with appropriate emotional intelligence
  • Work on post-training techniques to enhance speech models' conversational and emotional capabilities
  • Tackle unique challenges including response time optimization, maintaining emotional consistency, and operating in noisy healthcare environments
  • Have the opportunity to publish your groundbreaking research

What you'll bring

  • 5+ years in speech technologies or related field
  • Hands-on experience with speech-to-speech systems (highly preferred), or strong experience in Text-to-Speech, Speech LLMs, emotional/expressive speech synthesis, or similar
  • Experience training models on large speech datasets
  • Ability to implement research papers from scratch

Bonus points for

  • Experience pre-training foundation models with speech (HuBERT, Wav2Vec, or similar)
  • Multimodal experience
  • Experience with inference technologies (vLLM, CUDA)

You'll be based in the Bay Area or willing to relocate. You'll receive highly competitive compensation (up to $350K base, depending on experience) with substantial equity.

If you're excited about creating the next generation of emotionally intelligent speech AI that will revolutionise healthcare communication, click apply!

Location: Bay Area
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 16/04/2025
Job ID: 33086