Looking to define ASR strategy for the next generation of social AI?

You'll be joining a well-funded social AI company building lifelike AI characters that interact naturally across voice, video, and text. Founded by a prominent tech entrepreneur, they're creating new media formats for AI-driven interaction where agents handle group conversations, interruptions, and multi-agent dynamics.

Your mission

You'll own the ASR function from day one - starting with evaluating and implementing existing solutions, then moving toward building proprietary models as the platform scales. This means hands-on work testing APIs and open-source models, followed by developing custom systems for multi-agent group conversations and social interactions.

You'll shape the technical direction, balance short-term delivery with long-term innovation, and drive individual research initiatives while collaborating on broader team objectives.

Your focus

  • Define and execute the ASR roadmap from evaluation through production deployment
  • Build and train models that handle natural conversation dynamics
  • Develop evaluation systems to measure accuracy, speed, and reliability
  • Define data requirements and create pipelines for ASR training
  • Work from low-level performance optimizations to high-level architecture decisions

What you'll bring

  • Proven track record building and deploying ASR systems at scale
  • Strong familiarity with SOTA ASR models and architectures (Whisper, Conformer, etc.)
  • Understanding of data quality assessment for speech systems

Nice to have

  • Experience leading technical initiatives or ML teams

Remote with competitive comp + stock.

Ready to define the future of social AI interactions? Apply today.

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 05/12/2025
Job ID: 34546

How do you make a large language model genuinely human-centred, capable of reasoning, empathy, and nuance rather than just pattern-matching?

This team is built to answer that question. They’re a small, focused group of researchers and engineers working on the post-training challenges that matter most: RLHF, RLAIF, continual learning, multilingual behaviour, and evaluation frameworks designed for natural, reliable interaction.

You’ll work alongside a team from NVIDIA, Meta, Microsoft, Apple, and Stanford, in an environment that combines academic rigour with production-level delivery. Backed by over $400 million in funding, they have the freedom, compute, and scale to run experiments that push beyond the limits of standard alignment research.

This is a role where your work moves directly into deployed products. The team’s models are live, meaning every insight you develop, every method you refine, and every experiment you run has immediate, measurable impact on how large-scale conversational systems behave.

What you’ll work on

  • Developing post-training methods that improve alignment, reasoning, and reliability

  • Advancing instruction-tuning, RLHF/RLAIF, and preference-learning pipelines for deployed systems

  • Designing evaluation frameworks that measure human-centred behaviour, not just accuracy

  • Exploring continual learning and multilingual generalisation for long-lived models

  • Publishing and collaborating on research that informs real-world deployment

Who this role suits

  • Researchers or recent PhDs with experience in LLM post-training, alignment, or optimisation

  • A track record of rigorous work — published papers, open-source projects, or deployed research

  • Curiosity about how large models learn and behave over time, and how to steer that behaviour safely

  • Someone who values autonomy, clarity of purpose, and research that turns into impact

You’ll find a culture driven by technical depth rather than hype — where thoughtful research is backed by meaningful compute and where the best ideas scale fast.

Location: South Bay (on-site, collaborative setup)
Compensation: $200,000–$250,000 base + equity + bonus

If you’re ready to work on post-training research that shapes how large language models behave, we’d love to hear from you.

All applicants will receive a response.

Location: Palo Alto
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 20/11/2025
Job ID: 33284

Are you the kind of engineer who enjoys building complex systems that help models learn, not by training them directly, but by shaping the worlds they inhabit?

This team builds large-scale environments and benchmarks that frontier AI labs use to test and steer their models. Their goal is to make reinforcement learning measurable, creating rich, hyperrealistic simulations where agents can reason, act, and be safely evaluated.

You’ll work at the intersection of software engineering, reinforcement learning, and experimental research, designing the frameworks and pipelines that let agentic AI systems act, learn, and improve through interaction, not static data.

You'll Bring

  • Strong Python and software fundamentals, and enjoyment of building ML infrastructure.

  • Experience in reinforcement learning: reward design, environment dynamics, and evaluation loops.

  • Familiarity with browser/API simulation tools (Playwright, Selenium) or distributed compute.

  • Experience with open-ended problem spaces and a desire to shape the tools driving safe AGI progress.

It’s a technically deep team of ML engineers and researchers from leading labs and tech companies, developing the simulation and evaluation backbone for next-generation agents.

Compensation: $200,000–$250,000 base + equity
Location: San Francisco (on-site, relocation supported)

All applicants will receive a response.

Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 18/11/2025
Job ID: 34513

Want to build the simulated worlds that test what frontier models are really capable of?

This is a chance to join a team advancing the science of post-training and scalable evaluation — building reinforcement learning environments that push reasoning, planning, and long-horizon behaviour to their limits.

Instead of static benchmarks, you’ll create dynamic simulations that measure real intelligence — not just accuracy. You’ll design new post-training algorithms (RLHF, DPO, GRPO and beyond), develop richer reward models that move past exact-match scoring, and build evaluation frameworks that define how next-generation AI is trained, aligned, and understood.

The work combines deep research with hands-on implementation — from writing papers to seeing your methods deployed in live systems. It’s ideal for researchers who care about bridging academic insight and practical impact, helping AI progress beyond metrics that no longer tell the whole story.

You’ll bring:

  • Research experience in post-training, reinforcement learning, or evaluation for LLMs.

  • Strong understanding of transformer models and experimental design.

  • Publication record at leading venues (NeurIPS, ICLR, ICML, ACL, EMNLP).

  • PhD or equivalent research experience in CS, ML, NLP, or RL.

Package: Up to $300K base (DOE) + meaningful equity + comprehensive benefits (401k, unlimited PTO, relocation and sponsorship available).
Location: On-site in New York (preferred).

If you want to shape how AI is trained, tested, and trusted — this is the place to do it.
All applicants will receive a response.

Location: New York, NY
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 05/11/2025
Job ID: 34313

Want to build speech AI that actually sounds human?

You'll be joining a well-funded speech AI startup with strong customer traction. They're building ultra-realistic voice technology that handles natural laughter, breathing, seamless language switching, and accurate pronunciation across languages and accents.

As their Speech Research Lead, you'll have the resources and real-world applications to work on frontier speech research: real-time two-way conversations with emotional awareness, novel architectures that balance speed with control, and advances in their multilingual capabilities.

What you'll do

  • Lead SOTA research advancing their core speech models and product capabilities

  • Oversee large-scale model training and data system development

  • Lead and grow the ML team during a critical scaling phase

What you'll bring

  • Extensive experience in speech synthesis or generative modeling across multiple modalities

  • Strong background in LLMs and modern language model architectures

  • Proven ability to take research from concept to deployed systems

  • Experience training large-scale models in production environments

Nice to have

  • Understanding of cross-lingual speech challenges and linguistic fundamentals

  • Published research in speech or generative modeling

Ideally based in San Francisco but open to remote internationally. Competitive compensation up to $400K base (depending on experience) plus substantial equity package.

 

Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 01/11/2025
Job ID: 34146

Want to push the boundaries of what reinforcement learning can achieve with frontier models?

In this role, you'll advance reinforcement learning methods for large-scale AI systems, applying RL techniques to enhance reasoning, planning, and decision-making in models that directly impact fields from biology to climate and materials science.

Your work will combine RL with large language models: experimenting with RLHF, PPO, and DPO, designing evaluation frameworks, and fine-tuning models at scale. The aim is to go beyond benchmarks and deliver models that researchers can use to accelerate discovery.

You will be a driving force in a team that is building towards a broader superintelligence platform: models that don’t just generate text or data, but drive breakthroughs across multiple domains. As part of this, you’ll collaborate with domain experts to ensure your research translates into real-world scientific progress.

You should bring:
  • Deep expertise in reinforcement learning (policy optimisation, value-based, or model-based methods).
  • Experience applying RL to large models (RLHF, PPO, DPO).
  • Hands-on experience with model training and fine-tuning at scale.
  • PhD in Computer Science, Machine Learning, Robotics, or related field, with contributions to top-tier conferences (NeurIPS, ICML, ICLR, AAAI).
  • Experience with distributed computing platforms (cloud or HPC clusters).
  • Track record of running rigorous experiments and improving models based on results.

If you have experience with multi-agent RL, hierarchical or offline RL, or domain-specific work with scientific datasets, you will be an ideal candidate for this position.

Package: $250k–$400k base + bonus + stock
Location: SF Bay Area, or potentially remote with travel to the office when needed.

If you want to see your RL research power the next generation of superintelligence, this is the role for you!

All applicants will receive a response.

Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 20/10/2025
Job ID: 33780

Head of Research – Post-Training & Reinforcement Learning

Ready to shape how the next generation of AI is trained, aligned, and supervised?

This role is about leading one of the most critical research agendas in AI today: advancing post-training and reinforcement learning methods that ensure increasingly capable models remain aligned, reliable, and safe. You’ll define the environments and frameworks where frontier models learn and set the direction for how society supervises AI as it surpasses human performance.

As Head of Research, you’ll guide a team of applied ML and research experts from FAIR, Meta Reality Labs, Airbnb, Amazon and beyond. You’ll stay hands-on with the research, designing experiments in RLHF, DPO, GRPO; developing reward models that move beyond exact-match signals; and building complex RL environments that stress-test reasoning, planning, and long-horizon behaviour. At the same time, you’ll shape the technical vision, ensuring the team’s work translates into production systems already used by leading AI labs.

You’ll also play a visible role in the broader ecosystem: publishing at top venues (NeurIPS, ICLR, ACL, EMNLP), releasing benchmarks and open-source tools, and influencing both technical standards and broader policies for AI alignment and evaluation.

You should bring:
  • Deep research experience in post-training or RL methods (RLHF, DPO, GRPO, reward modelling).
  • Strong background in training and evaluating large language models.
  • Proven publication record at top-tier venues (NeurIPS, ICLR, ICML, ACL, EMNLP).
  • Experience leading research teams and scoping high-impact projects.
  • Curiosity, creativity, and the ability to thrive in a fast-moving startup environment.

Package: $300k–$400k base + significant equity. Full benefits including health, dental, vision, 401k, unlimited PTO, and global offsites. Onsite in San Francisco preferred (relocation support available), with flexibility for exceptional candidates.

If you want to define how reinforcement learning environments and post-training frameworks shape the future of AGI, this is the role for you. 

All applicants will receive a response.

Location: San Francisco or NYC
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 02/10/2025
Job ID: 33880

Looking to tackle novel speech challenges at scale?

You'll be joining a small but mighty speech AI company building proprietary speech tech from the ground up. The company has a strong customer base, so your research will directly impact production systems serving enterprise customers, and you'll see your work deployed at scale in real-world voice applications.

They're a well-funded startup with healthy revenue streams and immediate opportunities for high-impact research.

Your research

You'll be working on breakthrough speech research that pushes the boundaries of naturalness and real-time performance. The company has achieved ultra-low latency and is now advancing toward unified speech-to-speech architectures.

You'll develop emotional expression and natural speech generation, advance multilingual support across 30+ languages, and enhance voice cloning robustness.

Your focus

  • Lead cutting-edge research in SOTA speech models (TTS, ASR, or speech-to-speech)
  • Design, execute and iterate on experiments end-to-end
  • Drive speech controllability and naturalness improvements
  • Develop evaluation methodologies for speech quality assessment

What you'll bring

  • Deep understanding of cutting-edge speech models with end-to-end pipeline experience
  • Experience with large-scale model training
  • Strong background in speech model development and optimisation
  • Published work with demonstrable results in industry or academic settings

Nice to have

  • Performance optimisation experience for latency and compute efficiency
  • Experience with model fusion and unified architectures

This is a remote role, based in either the US or Europe. Competitive comp based on experience.

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 23/09/2025
Job ID: 33913

Ready to lead the development of a foundation model that powers the next generation of Agentic AI?

This is a hands-on leadership role where you'll train models from scratch, make the key architectural decisions, and grow your own team. You'll build the foundation model that underpins AI systems that think, reason, and act in mission-critical environments.

You’ll join a well-backed company founded by an entrepreneur with a previous billion-dollar exit, already partnering with Fortune 100 and 500 clients. Their latest funding round is being channelled directly into compute and team buildout — a substantial commitment to creating truly differentiated AI technology, powered by the foundation model you’ll help design and train.

The challenge: build a reasoning-capable foundation model to power agents, which another team in the company will fine-tune and adapt for specific industry use cases and clients. Your team will work at scale, combining pre-training and post-training approaches while ensuring models meet regulated-industry requirements. This is about more than just building models: it's about creating the infrastructure, teams, and frameworks that define the future of AI in high-stakes settings.

Your focus will include leading pre-training and post-training of foundation models, architecting reasoning-capable LLMs, scaling a 15–20 person research and engineering team over 12 months, and integrating models into domain-specific applications.

You should bring:

  • Staff/Principal-level experience with hands-on model training.

  • Deep expertise in pre-training or post-training (ideally both).

  • Track record of driving impactful projects to completion.

  • Strong understanding of LLM architectures and large-scale training.

  • Experience leading complex technical initiatives across teams.

Nice to have: experience in regulated/safety-critical AI, reasoning or planning architectures, open-source model development, or prior technical leadership.

Package: $350k base (negotiable) + substantial stock, healthcare, 401k, 20 vacation days, and flexible working.

You must be based in the SF Bay Area. US citizens and green card holders only.

This is an opportunity to lead foundation model development with the resources, autonomy, and backing to deliver groundbreaking technology.

All applicants will receive a response.

Location: San Francisco Bay Area
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 22/09/2025
Job ID: 33946

Build the foundational models that will give AI agents true 3D understanding

Want to solve the fundamental challenge of how AI systems perceive and reason about 3D geometry?

This Lead Applied Scientist role puts you at the forefront of creating perception capabilities for the next generation of agentic AI systems. You'll lead a team building discriminative and generative models that are integrated into agentic workflows, solving complex challenges in agentic AI for industrial applications.

You'll be joining a well-funded startup developing AI agents for advanced design and manufacturing workflows. Your work will bridge the gap between the physical world and intelligent reasoning systems, creating models that understand CAD data, meshes, and point clouds at a level that enables autonomous decision-making.

This role offers the opportunity to lead a team hands-on, building 3D computer vision capabilities from the ground up. You'll be establishing an entirely new domain within the research team, with significant autonomy to define evaluation strategies, model objectives, and technical direction. Your models will form the perception backbone that enables agents to truly understand and manipulate the 3D world.

Your technical challenges:

  • Build models that understand diverse 3D data types (CAD, mesh, point cloud) and learn transferable representations across formats
  • Handle messy, lossy, or incomplete real-world data - moving beyond clean synthetic geometry to tackle industrial reality
  • Scale training across multiple 3D tasks: segmentation, classification, correspondence, and eventually generation
  • Create evaluation pipelines that meaningfully assess model performance and enable continuous production monitoring
  • Work toward a foundational 3D model supporting both discriminative and generative tasks, integrated into broader agentic AI architecture

Your expertise should include:

  • Deep specialisation in 3D computer vision (ideally including a PhD in Computer Vision)
  • Strong knowledge of modern 3D architectures (PointNet++, MeshCNN, 3D Gaussian Splatting, diffusion models, VLMs)
  • Proven ability to train large-scale deep learning models with PyTorch
  • Solid applied research skills - can implement novel architectures from papers and make them work in practice
  • Experience with multimodal or vision-language model development

Nice to have:

  • Background working with CAD data or industrial design workflows
  • Experience in domains such as robotics, autonomous driving, or AR/VR with a 3D perception focus
  • Familiarity with SLAM, pose estimation, or differentiable rendering

You'll join a research team that values ownership and rapid iteration, with the resources to pursue ambitious technical goals. The company provides abundant compute resources and the freedom to explore foundational approaches whilst ensuring practical impact.

Package includes:

  • Base salary: $300,000
  • Performance bonus up to 20%
  • Medical, dental, and vision coverage
  • 401k with up to 3% company match
  • 20+ vacation days

You'll need to be based in the SF Bay Area or Miami, working in a collaborative team environment that encourages innovation and technical excellence.

You must have valid right to work in the US without sponsorship (US Citizenship or Green Card).

If you're excited about creating the 3D perception capabilities that will power the next generation of intelligent agents, we'd love to hear from you.

All applicants will receive a response.

Location: United States
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 15/09/2025
Job ID: 33847