Your search has found 8 jobs

Want to work on one of the hardest unsolved problems in voice AI — making it actually sound like a human conversation?

Most voice AI falls apart the moment a conversation gets messy. Someone interrupts, emotions shift, the flow breaks — and the model can't keep up.

A small, ambitious SF startup is tackling exactly these problems, building speech models that handle natural conversation the way humans actually experience it. They have a working prototype and early commercial traction across several high-profile industry verticals.

The role

As a Senior Research Scientist, your focus is post-training — curating data, fine-tuning pre-trained speech models, and building the evaluation infrastructure that validates it all. You'll work on large-scale models with access to significant data resources.

What you'll do

  • Shape the data that goes into post-training — sourcing, cleaning and structuring it for large speech models

  • Run supervised fine-tuning (SFT) on pre-trained speech models

  • Build evaluation workflows — automated and human-in-the-loop

  • Drive measurable improvements in hallucination rates, instruction-following and generalisation

What you'll bring

  • PhD in ML or related field with a strong publications record

  • Hands-on experience training large speech models — ASR, TTS, or speech-to-speech

  • Solid post-training and SFT experience

The founding team includes an engineer from a billion-dollar AI company, where they co-created one of the first generative models in the field, alongside the co-creator of the first generative voice at one of the world's largest tech companies.

Base compensation is $400k–$500k, with generous equity.

Based in San Francisco, onsite. Relocation support is available for those in the US who are willing to make the move.

Location: San Francisco
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 30/04/2026
Job ID: 34047

Ready to architect the future of human-computer voice interaction?

Join an established conversational AI company as they transition from traditional cascaded speech systems to cutting-edge E2E speech-to-speech technology. You'll lead this transformation, building multimodal systems that will redefine how millions interact with AI.

The opportunity

You'll lead the development of speech technology that directly impacts real users at massive scale. The company processes millions of daily interactions across major enterprise clients, meaning your research will shape real-world conversational experiences.

You'll spearhead the development of full-duplex speech systems, creating truly natural AI conversations that go far beyond current capabilities.

Your impact

  • Design and build next-generation multimodal speech LLM architecture from the ground up
  • Drive breakthroughs in speech-to-speech modeling and full-duplex conversation systems
  • Tackle turn-taking, interruption handling, and simultaneous speech processing
  • Bridge cutting-edge research with enterprise-grade production systems
  • Lead a growing team focused on SOTA speech-to-speech breakthroughs and own the development end-to-end

What you'll bring

  • Deep understanding of SOTA speech models and neural audio processing
  • Experience building speech language models/multimodal systems
  • Strong background in speech AI research and modern speech architectures

This is all underpinned by access to a large corpus of real enterprise conversational data and serious GPU infrastructure.

The company has built everything in-house, giving you complete technical control and the freedom to explore any approach that delivers value.

With their established market position and proven track record, you'll have the resources and real-world testing ground to make transformative impact with your research. 

Location

Remote (must be within an EU timezone).

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 30/04/2026
Job ID: 33350

Want to build the speech and audio models that define how the next generation of voice AI actually sounds and listens?

A well-funded AI startup has developed new model architectures that make real-time conversational AI finally viable at scale. While most voice AI still suffers from delays and computational bottlenecks, they've solved the core efficiency problems that have held the field back.

The role

As their Senior Research Scientist, you'll build core speech foundation models that could define the next decade of voice interaction. You'll work on novel architectures that have immediate real-world impact for thousands of customers.

What you'll do

  • Design and implement SOTA speech foundation models

  • Develop efficient algorithms for speech processing and audio understanding

  • Create scalable systems that handle massive audio workloads

  • Build comprehensive evaluation methods to validate model performance

  • Collaborate with engineering teams to transition research into production

What you'll bring

  • Deep expertise in modern speech technologies (TTS, Speech LLMs, Voice Conversion/Cloning, Speech Translation, ASR, Audio Understanding)

  • Strong background in generative modelling for audio and speech

  • Publications at leading conferences

  • Track record of implementing research ideas from concept to production

You'll join a solid research team, including technical founders whose published work has fundamentally shifted how the field thinks about efficient, large-scale foundation models. They're well-funded and generating strong revenue. Comp is on par with top AI labs, with a base of $400k+ DOE plus a generous equity package.

The role is based in San Francisco, hybrid with 4 days a week in the office.

If you're excited about building the foundational models that will power the next generation of voice AI, we'd love to hear from you.

All applicants will receive a response.

Location: San Francisco
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 30/04/2026
Job ID: 33251

Ready to own the data pipeline powering the voice of the next generation of AI characters?

You'll be joining a well-funded startup building AI character technology, where speech is a core part of the product experience.

Think truly natural conversations, handling interruptions, personality shifts and more!

You'll own the datasets that power their speech systems — from raw, messy audio through to clean, versioned training corpora that directly drive TTS and ASR model performance.

Your focus

  • Own the full data lifecycle — defining specs, auditing and curating large-scale audio and text corpora
  • Build automated quality metrics and dashboards across SNR, VAD, WER, speaker verification and safety, validated against listening tests
  • Train and deploy lightweight classifiers for noise detection, diarisation, language ID, and content moderation

What you'll bring

  • Deep experience working with speech and audio data at scale — 1M+ hours
  • Strong ML engineering skills in Python and PyTorch, including training and fine-tuning models like Whisper or Wav2Vec
  • Practical knowledge of audio processing — torchaudio, librosa, spectrograms, DSP basics
  • A solid understanding of audio quality metrics — MOS, WER, PESQ/STOI, SNR, speaker verification
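For illustration only, two of the metrics above are easy to spot-check in plain Python. The sketch below (function names are our own, not part of the company's stack) computes word error rate via word-level edit distance and SNR in decibels; in production you'd reach for audited implementations rather than this.

```python
import math

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Single-row Levenshtein DP over words.
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(hyp) + 1):
            sub = prev if ref[i - 1] == hyp[j - 1] else prev + 1
            prev, d[j] = d[j], min(sub, d[j] + 1, d[j - 1] + 1)
    return d[-1] / max(len(ref), 1)

def snr_db(signal: list, noise: list) -> float:
    """Signal-to-noise ratio in dB from signal and noise sample arrays."""
    p_sig = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_sig / p_noise)
```

A WER of 0.0 is a perfect transcript; 20 dB SNR corresponds to a signal with 100× the noise power.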

Nice to have

  • Experience with Spark/Beam, Airflow, SQL or similar data engineering tools
  • Open-source contributions or publications in speech or audio ML
  • Background in denoising and enhancement, and how it affects downstream model quality

Remote, with a preference for European or overlapping timezones. Competitive compensation and equity.

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 27/03/2026
Job ID: 34412

Training builds capability. Post-training decides what it becomes.

This team are rethinking how large multimodal models learn after pre-training — developing post-training and reinforcement learning methods that help models reason, plan, and interact in real time.

Founded by the researchers behind several of the most influential modern AI architectures, this lab are pushing alignment and learning efficiency beyond standard RLHF. They’re scaling preference-based training (RLHF, DPO, hybrid feedback loops) to new model types and creating systems that learn from interaction rather than static data.

You’ll work at the intersection of post-training, RL, and model architecture — designing reward models, scalable evaluation frameworks, and training strategies that make large-scale learning measurable and reliable. It’s applied research with direct impact, supported by serious compute and a tight researcher-to-GPU ratio.

You’ll bring experience in large-scale post-training or reinforcement learning (RLHF, DPO, or SFT pipelines), a solid grasp of LLM or multimodal training systems, and the curiosity to explore new optimisation and alignment methods. A publication record at top venues (NeurIPS, ICLR, ICML, CVPR, ACL) is a plus, but impact matters more than titles.

The team are based in San Francisco, working mostly in person. $1 million+ total compensation: base salary circa $300K–$600K (negotiable) plus stock and bonus — exact package depends on experience.

If you want to work where post-training meets architecture — shaping how foundation models learn, reason, and adapt — this is that opportunity.

All applicants will receive a response.

Location: San Francisco
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 11/02/2026
Job ID: 34012

GPU Optimisation Engineer — Real-Time Inference

Want to push GPU performance to its limits — not in theory, but in production systems handling real-time speech and multimodal workloads?

This team is building low-latency AI systems where milliseconds actually matter. The target isn’t “faster than baseline.” It’s sub-50ms time-to-first-token at 100+ concurrent requests on a single H100 — while maintaining model quality.

They’re hiring a GPU Optimisation Engineer who understands GPUs at an architectural level. Someone who knows where performance is really lost: memory hierarchy, kernel launch overhead, occupancy limits, scheduling inefficiencies, KV cache behaviour, attention paths. The work sits close to the metal, inside inference execution — not general infra, not model research.

You’ll operate across the kernel and runtime layers, profiling large-scale speech and multimodal models end-to-end and removing bottlenecks wherever they appear.
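To make the memory-hierarchy point concrete: at batch size 1, autoregressive decode is typically memory-bandwidth-bound, since every weight must be streamed from HBM once per generated token. A back-of-envelope sketch (all figures are illustrative assumptions, not measurements of this team's models):

```python
def decode_step_floor_ms(param_bytes: float, hbm_bandwidth_gbs: float) -> float:
    """Lower bound on one autoregressive decode step when every weight
    byte is read from HBM exactly once (the memory-bound regime)."""
    return param_bytes / (hbm_bandwidth_gbs * 1e9) * 1e3

# Assumed figures: 7B parameters in FP16 (2 bytes each) and roughly
# 3,350 GB/s of HBM bandwidth on an H100-class part.
step_ms = decode_step_floor_ms(7e9 * 2, 3350)  # ~4.2 ms per token
```

Quantisation attacks param_bytes directly; kernel, scheduling, and batching work attacks the gap between this floor and measured latency.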

What you’ll work on

  • Profiling GPU bottlenecks across memory bandwidth, kernel fusion, quantisation, and scheduling

  • Writing and tuning custom CUDA / Triton kernels for performance-critical paths

  • Improving attention, decoding, and KV cache efficiency in inference runtimes

  • Modifying and extending vLLM-style systems to better suit real-time workloads

  • Optimising models to fit GPU memory constraints without degrading output quality

  • Benchmarking across NVIDIA GPUs (with exposure to AMD and other accelerators over time)

  • Partnering directly with research to turn new model ideas into fast, production-ready inference

This is hands-on optimisation work across the stack. No layers of bureaucracy. No “platform ownership” theatre. Just deep performance engineering applied to models that are actively evolving.

What tends to work well

  • Strong experience with CUDA and/or Triton

  • Deep understanding of GPU execution (memory hierarchy, scheduling, occupancy, concurrency)

  • Experience optimising inference latency and throughput for large generative models

  • Familiarity with attention kernels, decoding paths, or LLM-style runtimes

  • Comfort profiling with low-level GPU tooling

The company is revenue-generating, its models are used by global enterprises, and the SF R&D team is expanding following a recent raise. This is growth hiring, not backfill.

Package & location

  • Base salary: up to ~$300,000 (negotiable based on depth of experience)

  • Equity: Meaningful stock

  • Location: San Francisco preferred (relocation and visa sponsorship can be provided)

If you care about real-time constraints, GPU architecture, and squeezing every last millisecond out of large models, this is worth a conversation.

All applicants will receive a response.

Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 11/02/2026
Job ID: 34843

Looking to push the boundaries of generative AI for real-time interaction?

You'll be joining a well-funded startup working on multimodal AI where voice, vision, and language come together.

They're building generative models for natural conversational experiences that need to perform in real-time.

There are no resource limitations here: they have plenty of compute for you to run experiments at scale. You'll work alongside a well-known open-source leader, as well as a very strong speech R&D team from leading companies.

Your mission

You'll be building and optimising diffusion or flow-matching models that power their speech and audio generation. This means developing production-ready architectures that can generate controllable, high-quality output at scale.

You'll own the full research-to-production pipeline — from architecture design and training through deployment and optimisation.

Your work will directly impact how millions of AI characters sound and interact.

Your focus

  • Design and train large-scale diffusion or flow-matching models

  • Develop novel architectures and training techniques to improve controllability and quality

  • Build evaluation systems to measure generation quality and model behaviour

  • Work from low-level performance optimisations to high-level model design

What you'll bring

  • Proven track record building diffusion models or flow-matching systems (experience in other modalities counts)

  • Experience training large models (3B+ parameters) with distributed systems

  • Hands-on experience with streaming or distillation of diffusion models

Nice to have

  • Experience with audio or speech generation

  • Publications or open-source contributions in diffusion models or generative AI

Remote in Europe. Base salary is €140K–200K DOE (with some flex for the right person), plus generous stock.

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 26/01/2026
Job ID: 34280

Want to build speech AI that actually sounds human?

You'll be joining a well-funded speech AI startup with strong customer traction. They're building ultra-realistic voice technology that handles natural laughter, breathing, seamless language switching, and accurate pronunciation across languages and accents.

As their Senior Speech Scientist, you'll work hands-on to expand their foundation models and push the boundaries of what's possible in speech AI: exploring multilingual capabilities, long-context generation, full-duplex modeling for natural conversations with interruptions, and novel architectures that balance speed with control.

What you'll do

  • Conduct research to advance their core speech models and extend product capabilities
  • Develop and experiment with new model architectures and training approaches
  • Work on large-scale model training and data systems
  • Collaborate with the team to take research from concept to deployed systems

What you'll bring

  • 3+ years of experience in speech synthesis, audio generation, or generative modeling
  • Experience with audio generation using LLMs
  • Solid background in modern language model architectures
  • Proven ability to ship research into production systems
  • Experience training large-scale models

Nice to have

  • Published research in speech or generative modeling
  • Experience with real-time speech systems or multimodal models

Ideally based in SF, though remote worldwide can also be considered. Comp is up to $250K base DOE, plus equity.

Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 23/12/2025
Job ID: 34579