Job Description
Ready to pioneer the speech intelligence behind the next generation of embodied AI?
Join a startup developing foundational technology for natural conversation in embodied agents. You'll advance the speech systems that power avatars with authentic behaviours, real-time expression, and conversational intelligence that handles interruptions and turn-taking the way humans do.
This Lead Research Scientist role focuses on advancing real-time speech systems for interactive avatars. You'll develop full-duplex dialogue models and speech-to-speech architectures that enable natural conversational flow, interruption handling, and emotional expression.
Founded by ex-Googlers, the company is building proprietary behaviour models that learn from two-way interactions. In these systems, speech timing, prosody, and contextual responses work in harmony with facial expressions and physical behaviours to drive authentic embodied intelligence.
Your focus:
Research and develop full-duplex speech systems with natural interruption handling
Develop expressive voice models with controllable prosody and timing
Build speech-to-speech architectures preserving identity and emotion
Create real-time audio generation systems for conversational avatars
Publish research while deploying systems in production
Collaborate across teams integrating speech with visual behaviour
Requirements:
PhD in Speech, Machine Learning, or a related field
First-author publications at top conferences (Interspeech, ICASSP, NeurIPS, ICLR, etc.)
Expertise in text-to-speech, speech-to-speech models, or voice cloning
Experience with large-scale model training
Experience in prosody modelling or real-time audio generation
Nice to have:
Experience with full-duplex speech research
Speech-visual alignment expertise (lip sync, expressions)
Real-time audio deployment optimisation
Package:
Competitive base salary of $200k-$300k, depending on experience
Meaningful equity package
Comprehensive healthcare (90% covered)
Unlimited PTO
Fully remote work with regular team offsites
Life insurance and disability coverage
Location: Fully remote, open to candidates worldwide, with a preference for Pacific Time alignment.
Ready to make AI conversations feel authentically human?
Contact Allys at Techire AI. All applicants will receive a response.
Questionnaire
Do you have a PhD in speech, machine learning, or a related field? (Yes/No)
Do you have first-author publications at top speech conferences? (Yes/No)
Do you have experience with TTS, speech-to-speech, or voice cloning models? (Yes/No)