Job title: Machine Learning Research Engineer
Job type: Permanent
Emp type: Full-time
Industry: Generative AI
Functional Expertise: Diffusion Model, Gen-Speech/TTS, Multimodal AI
Salary type: Annual
Salary: Negotiable
Location: Remote
Job published: 26/01/2026
Job ID: 34280

Job Description

Looking to push the boundaries of generative AI for real-time interaction?

You'll be joining a well-funded startup working on multimodal AI, where voice, vision, and language come together.

They're building generative models for natural conversational experiences that need to perform in real-time.

Your mission

You'll be building and optimising diffusion or flow-matching models that power their speech and audio generation.

This means developing production-ready architectures that can generate controllable, high-quality output at scale.

You'll own the full research-to-production pipeline, from architecture design and training through to deployment and optimisation.

Your work will directly impact how millions of AI characters sound and interact.

Your focus

  • Design and train large-scale diffusion or flow-matching models
  • Develop novel architectures and training techniques to improve controllability and quality
  • Build evaluation systems to measure generation quality and model behaviour
  • Work across the stack, from low-level performance optimisations to high-level model design

What you'll bring

  • Proven track record of building diffusion models or flow-matching systems
  • Experience training large models (3B+ parameters) with distributed systems

Nice to have

  • Experience with audio or speech generation
  • Publications or open-source contributions in diffusion models or generative AI

Remote in Europe, with competitive compensation and stock.