New Job Opening: Research Scientist - Embodied AI in Remote

Job title:	Research Scientist - Embodied AI
Job type:	Permanent
Emp type:	Full-time
Industry:	AI Agents
Salary type:	Annual
Salary:	negotiable
Location:	Remote
Job published:	18/06/2025
Job ID:	33449

Job Description

Ready to create foundation models for Embodied AI?

Join a pioneering startup developing the foundation layer for the next big AI unlock: naturalness of conversation, from the visual element of speech to interruptions and turn-taking. This in turn will change the game for embodied agents, enabling natural behaviours, real-time expression, and conversational intelligence that goes far beyond current avatar technology.

This Research Scientist role focuses on advancing embodied AI through groundbreaking research on audio-to-video models. While existing solutions rely on looped animations with basic lip-sync, this company is building behaviour-driven models that enable authentic, real-time interactions capable of natural conversation flow, interruption handling, and emotional expression.

Founded 18 months ago, the company has an exceptional team in which 7 out of 12 members hold AI PhDs, and it is solving fundamental challenges in embodied intelligence. Its beta platform already demonstrates sophisticated real-time avatar systems with proprietary voice models and behaviour engines working in harmony.

The company is building foundational technology that learns from two-way video interactions, creating systems that understand and respond to both verbal and non-verbal cues. Its research sits at the intersection of computer vision, conversational AI, and real-time generation.

Your focus:

Conduct cutting-edge research in avatar modelling, behaviour generation, and style transfer
Develop sophisticated facial and body dynamics systems for expressive avatars
Create conversational AI systems that drive natural avatar behaviour through LLMs
Build real-time multimodal generation pipelines integrating visual, audio, and text
Contribute to behaviour model development for authentic interaction patterns
Collaborate with engineering to productionise research into real-time systems
Publish findings at top-tier conferences while deploying in real-world applications

Technical challenges: You'll work with cutting-edge techniques including diffusion models, flow matching, and Gaussian splatting. The focus is on dyadic conversational avatar development and natural behaviour modelling, emphasising authentic real-time interaction over static visual perfection.

Requirements:

PhD in Computer Vision, Machine Learning, or related field
Strong publication record at top conferences (CVPR, NeurIPS, ICCV, ECCV, ICML, ICLR, SIGGRAPH, etc.)
Recent avatar research publications within the past 2 years (essential)
Expertise in flow matching and diffusion models
Experience with one or more of: conversational avatars, behaviour modelling, or real-time multimodal generation
PyTorch proficiency and large-scale training experience

Nice to have:

Industry experience deploying ML models in real-time applications
Voice research publications
Background in interactive systems or conversational AI

Environment: You'll join a distributed team working primarily in Pacific Time zones, collaborating with specialists in avatar development, voice research, and behaviour modelling. The culture emphasises high ownership, velocity with purpose, and collaborative problem-solving in a fast-moving research environment.

Package:

Competitive salary, $200k-$300k base (based on experience)
Meaningful equity package
Comprehensive healthcare (90% covered)
Unlimited PTO
Fully remote work with regular team offsites
Life insurance and disability coverage

Location: Fully remote position, open globally, with a preference for Pacific Time alignment.

If you're excited about conducting pioneering research on the next challenge in embodied intelligence while shaping the future of human-AI interaction, this is an exceptional opportunity to work on genuinely transformative technology.

Ready to help create AI that feels present, not just functional?

Contact Marc Powell at Techire AI. All applicants will receive a response.

Questionnaire

Do you have recent papers on avatars, embodied AI, or multimodal generation? (Must be within the past 2-3 years and at top conferences: NeurIPS, ICML, CVPR, ICCV, ECCV, ICLR, etc.)
