New Job Opening: Lead Research Scientist (Remote)

Job title:	Lead Research Scientist
Job type:	Permanent
Employment type:	Full-time
Industry:	Generative AI
Functional Expertise:	AI Avatars, Foundation Models, Gen-Speech/TTS, Multimodal AI, Speech-to-Speech, Voice Cloning
Salary type:	Annual
Salary:	negotiable
Location:	Remote
Job published:	02/07/2025
Job ID:	33482

Job Description

Ready to pioneer the speech intelligence behind the next generation of embodied AI?

Join a pioneering startup developing foundational technology for natural conversation in embodied agents. You'll advance the speech systems that power avatars with authentic behaviours, real-time expression, and conversational intelligence that handles interruptions and turn-taking just like humans.

This Lead Research Scientist role focuses on advancing real-time speech systems for interactive avatars. You'll develop full-duplex dialogue models and speech-to-speech architectures that enable natural conversational flow, interruption handling, and emotional expression.

Founded by ex-Googlers, the company is building proprietary behaviour models that learn from two-way interactions, creating systems where speech timing, prosody, and contextual responses work in harmony with facial expressions and physical behaviours to drive authentic embodied intelligence.

Your focus:

Research & develop full-duplex speech systems with natural interruption handling
Develop expressive voice models with controllable prosody and timing
Build speech-to-speech architectures preserving identity and emotion
Create real-time audio generation systems for conversational avatars
Publish research while deploying systems in production
Collaborate across teams integrating speech with visual behaviour

Requirements:

PhD in Speech, Machine Learning, or related field
First-author publications at top conferences (Interspeech, ICASSP, NeurIPS, ICLR, etc.)
Expertise in text-to-speech, speech-to-speech models, or voice cloning
Large-scale training experience
Experience in prosody modelling or real-time audio generation

Nice to have:

Experience with full-duplex speech research
Speech-visual alignment expertise (lip sync, expressions)
Real-time audio deployment optimisation

Package:

Competitive salary: $200k-$300k base, depending on experience
Meaningful equity package
Comprehensive healthcare (90% covered)
Unlimited PTO
Fully remote work with regular team offsites
Life insurance and disability coverage

Location: Fully remote and open to candidates worldwide, with a preference for Pacific Time alignment.

Ready to make AI conversations feel authentically human?

Contact Allys at Techire AI. All applicants will receive a response.

Questionnaire

Do you have a PhD in speech, machine learning, or a related field?

Do you have first-author publications at top speech conferences?

Do you have experience with TTS, speech-to-speech, or voice cloning models?
