Machine Learning Researcher – World Models (Generative Video & Simulation)

Ready to build models that learn the structure and dynamics of the physical world?

This role focuses on developing world models: large-scale generative systems capable of simulating environments, actions, and interactions over extended time horizons. The goal is to move beyond short video generation toward models that can represent persistent environments and evolving dynamics.

As a Machine Learning Researcher, you’ll work on spatiotemporal generative models that learn how the world changes over time. These models aim to capture physical interactions, causality, and long-term dynamics, forming the foundation for intelligent systems that can reason about future outcomes and learn through simulation.

Your work will explore how generative models transition from producing short visual sequences to maintaining coherent simulations of environments where objects move, interact, and evolve consistently over time.

You’ll contribute across the full modelling lifecycle, from architecture design and training infrastructure through to evaluation and iteration. The role blends deep research with practical implementation, where experimental ideas are tested at scale and integrated into real systems.

This is a research-driven environment where engineers have significant ownership over model design, training strategies, and evaluation frameworks.

Your focus will include:

  • Designing and training spatiotemporal world models capable of learning long-horizon dynamics

  • Advancing video generation systems into persistent simulations that maintain coherence across time

  • Running large-scale training experiments on multi-billion parameter generative models

  • Improving temporal consistency, memory, and controllability in generative architectures

  • Developing evaluation methods for physical plausibility, causal consistency, and simulation stability

  • Working with large video datasets, including synthetic environments and real-world recordings

Hands-on experience with video generation, spatiotemporal modelling, or multimodal generative models is essential. This could include work with diffusion models, autoregressive approaches, transformers, or related architectures.

You should be comfortable implementing recent research, designing experiments, and iterating quickly on large training runs. Experience managing experiments on large GPU clusters and training large models at scale is highly valuable.

Strong coding ability in Python is required, with C++ or Rust considered beneficial.

You’ll have significant ownership over modelling decisions and the opportunity to shape how world models evolve within a small, technically ambitious AI research team.

Compensation: $200,000 – $350,000 base (negotiable depending on level) + equity + benefits

Location: San Francisco (On-site)

If you’re motivated by pushing generative models beyond video into world simulation and long-horizon reasoning, we’d like to speak with you.

All applicants will receive a response.

Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 20/04/2026
Job ID: 35305

Define how large-scale AI systems for scientific discovery are actually built, trained, and run in production.

This team is building autonomous AI scientists that run full research loops — ingesting large bodies of literature, forming hypotheses, designing experiments, and producing traceable outputs already used across biotech and pharma.

The challenge isn’t just model capability. It’s building the systems that allow these models to be trained, evaluated, and deployed reliably at scale.

You’ll sit at the intersection of model training and systems — owning the infrastructure, pipelines, and experimentation platforms that make long-horizon reasoning systems possible.

This is not research in isolation. It’s building the engine that research runs on.

You’ll work closely with the wider team, translating ambiguous scientific problems into systems that can be trained, iterated on, and deployed in real-world environments.

The company comes from one of the earliest groups working seriously on AI for science, including early language agents and AI-generated biological discoveries. They’re now pushing further with systems capable of reasoning across thousands of papers and large-scale analyses, and moving toward pre-training their own models end-to-end.

The platform is already operating at scale, with tens of thousands of users and millions of queries, and is actively used in scientific workflows today.


What you’ll work on

  • Building and scaling training pipelines for large-scale LLM systems
  • Developing experimentation platforms that enable fast, reliable iteration
  • Designing data pipelines and systems for observability and reproducibility
  • Improving how training runs are orchestrated, monitored, and debugged
  • Supporting model deployment and inference for complex reasoning systems
  • Working closely with researchers to translate ideas into production systems

What they’re looking for

  • Experience building and scaling ML systems in production
  • Strong background across model training, data pipelines, and deployment
  • Experience with large-scale training or distributed systems
  • Fluency in frameworks like PyTorch, JAX, or similar
  • Strong engineering fundamentals and systems thinking
  • Ability to navigate ambiguity and own problems end-to-end

The company

  • ~$70M raised, with another round planned
  • Platform already at meaningful scale (tens of thousands of users, hundreds of millions of lines of code written by the agent)
  • Strong commercial traction 
  • Small, high-calibre team working at the intersection of AI and science

📍 San Francisco (on-site or hybrid, remote considered case by case)
💰 $250K–$400K base + equity
Levels: Senior, Staff, Principal
Roles available: ML Engineer, ML Infra, Research Engineers & Research Scientists 

All applicants will receive a response.

Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 16/04/2026
Job ID: 35767

Are you looking to scale GPU infrastructure up to and beyond 10,000 GPUs?

You'll help push an already high-performing team past their current operating level, using your skills and experience to scale training workloads, improve cluster reliability and utilisation, and build systems that hold up under real pressure.

Your focus will be on distributed training and GPU infrastructure, making large-scale training actually usable for researchers, not just possible.

You'll be working across frontier model training, scientific workloads, and robotics environments, so you're dealing with high-throughput systems and real-world constraints, not just controlled experiments.

You'll join a team that owns compute end-to-end (infra, systems, and operations), working closely with researchers to make training at this scale reliable.

They've raised over $500M, have real customers, and are now integrating models directly into robotics environments and beyond.
Key experience
  • Experience scaling GPU infrastructure from 2,000 to 10,000+ GPUs
  • Experience with Ray, Slurm or similar
  • Experience supporting core model training

The culture is collaborative and hands-on:
  • Strong focus on knowledge sharing and upskilling
  • Cross-team collaboration with researchers
  • 6-week cycles to allow deep focus and meaningful impact
  • A team that works hard but also likes to keep it fun

Up to $350k base + bonus + equity, DOE
Remote across the US, or hybrid options available in SF

All applicants will receive a response. 
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 08/04/2026
Job ID: 35635

Want to build the interface layer for an AI scientist?

You’ll join a team building autonomous AI agents designed to accelerate scientific discovery. The goal is simple: science moves too slowly, and they’re building systems that can change that.

This isn’t a typical frontend role. The product is an integrated research environment where scientists interact directly with AI models, workflows, and generated insights. Your work defines how usable that system actually is.

You’ll sit within the Platform team, working closely with researchers and product to turn complex, often messy scientific workflows into clear, intuitive interfaces.

The challenge is translating depth into clarity without losing fidelity.

You’ll be building high-performance frontend systems where data density, responsiveness, and usability all matter. Real-time interactions, dynamic visualisations, and scalable UI patterns are core to the product.

Your focus will include:

  • Building performant React applications for data-heavy workflows
  • Designing interfaces for real-time AI interactions and streaming data
  • Creating modular, scalable design systems used across the platform
  • Translating scientific and model outputs into usable visual interfaces

You’ll need strong frontend fundamentals, but more importantly, the ability to think in systems: understanding how users navigate complexity, how interfaces guide decision-making, and how performance affects usability at scale.

There’s a strong emphasis on performance engineering. You’ll be profiling rendering behaviour, optimising asset loading, and ensuring smooth interaction across browsers and devices.

The product itself sits at the intersection of AI, biology, and research tooling. If you’ve worked on complex internal tools, data platforms, or visualisation-heavy applications, this will feel familiar, just at a deeper technical level.

You’ll likely have experience building production frontend systems with React (or similar), working with TypeScript, and handling real-time data flows such as WebSockets or GraphQL subscriptions. Experience with visualisation libraries like D3, Deck.gl or Three.js is highly relevant here.

The environment is highly collaborative. You’ll work closely with researchers to anticipate how the product should evolve, not just respond to specs.

This is an onsite role based in San Francisco, working with a team focused on building something that genuinely pushes forward how science gets done.

Salary: $175,000 – $240,000 + equity
Location: San Francisco, onsite

If you’re interested in shaping how scientists interact with AI systems, apply today.

Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 01/04/2026
Job ID: 35602

Want to build systems that actually hold up under long-running AI workloads?

Most agentic systems for science don’t fail at the model layer. They fail because the infrastructure can’t support long-horizon execution.

You’ll join a team building autonomous AI agents that run full research cycles: ingesting thousands of papers, forming hypotheses, running experiments, and producing traceable outputs used by real scientific teams.

The challenge is making that work in production.

You’ll own the systems behind it: APIs, data pipelines, and platform architecture designed for long-running workloads, large-scale ingestion, and iterative experimentation loops. This is full-stack in scope but backend in depth, where system design decisions directly shape what the platform can do.

You’ll be working across:

  • Backend services in Python or Node, building scalable APIs (FastAPI/REST)
  • Data pipelines supporting agent execution and scientific workflows
  • Cloud infrastructure (AWS/GCP), containerisation (Docker, Kubernetes)
  • CI/CD, observability, and reliability for systems under continuous load

This isn’t a generalist full-stack role. You’ll need to understand how systems behave under heavy data and compute demands, and be comfortable making architectural trade-offs across distributed systems.

The team is small, high-calibre, and already running real workloads with revenue traction. Backed by $70M+, they’re building infrastructure that defines how AI is applied to scientific discovery.

Salary: $200,000–$350,000 + equity
Location: San Francisco (onsite)

Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 30/03/2026
Job ID: 35569

Senior Applied Researcher

Want to build vision-language models that understand complex, real-world environments?

You’ll join a small, highly technical team working on foundational problems in multimodal AI, focused on training models that can interpret, reason, and act on large-scale first-person video data.

You’ll work directly with the Chief Science Officer, shaping how models are designed, trained, and evaluated. The work sits at the intersection of VLMs, long-context reasoning, and real-world deployment.

The focus is on building systems that move beyond static perception, towards temporal understanding, activity recognition, and higher-level reasoning across dynamic environments.

Your work will centre on:

  • Designing and training VLMs on large-scale video datasets
  • Developing post-training approaches including SFT, RLHF, and parameter-efficient tuning
  • Building scalable training and evaluation pipelines
  • Exploring long-context and temporal modelling
  • Designing efficient systems across edge and server-side inference
  • Defining benchmarks for spatial and behavioural understanding

You’ll bring strong experience training deep learning models, ideally transformer-based, along with hands-on work in vision, language, or multimodal systems.

Experience with large datasets, model optimisation, or deploying models into production environments will be valuable. Exposure to video data or long-context modelling is particularly relevant.

This is a team that values speed, ownership, and first-principles thinking. You’ll be working on open-ended problems with real-world impact, with the freedom to explore and define approaches.

Compensation: Highly competitive salary + equity
Location: San Francisco, onsite

If you’re interested in building multimodal systems that operate in real-world settings, and want to join a well-funded, highly skilled research team, please apply now!

All applicants will receive a response.

Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 30/03/2026
Job ID: 35437

Ready to own the data pipeline powering the voice of the next generation of AI characters?

You'll be joining a well-funded startup building AI character technology, where speech is a core part of the product experience.

Think super natural conversations, handling interruptions, personality shifts and more!

You'll own the datasets that power their speech systems — from raw, messy audio through to clean, versioned training corpora that directly drive TTS and ASR model performance.

Your focus

  • Own the full data lifecycle — defining specs, auditing and curating large-scale audio and text corpora
  • Build automated quality metrics and dashboards across SNR, VAD, WER, speaker verification and safety, validated against listening tests
  • Train and deploy lightweight classifiers for noise detection, diarisation, language ID, and content moderation

What you'll bring

  • Deep experience working with speech and audio data at scale — 1M+ hours
  • Strong ML engineering skills in Python and PyTorch, including training and fine-tuning models like Whisper or Wav2Vec
  • Practical knowledge of audio processing — torchaudio, librosa, spectrograms, DSP basics
  • A solid understanding of audio quality metrics — MOS, WER, PESQ/STOI, SNR, speaker verification

Nice to have

  • Experience with Spark/Beam, Airflow, SQL or similar data engineering tools
  • Open-source contributions or publications in speech or audio ML
  • Background in denoising and enhancement, and how it affects downstream model quality

Remote, with a preference for European or overlapping timezones. Competitive compensation and equity.

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 27/03/2026
Job ID: 34412

Want to build the systems that make AI agents actually work in production?

Most agents fail outside controlled environments, not because the models are weak, but because the systems around them can’t represent how real work happens.

This team is building that missing layer...

Their platform sits inside enterprise workflows, capturing how tasks are executed across tools, then structuring that data so models and agents can actually use it. Real operational context, not synthetic benchmarks.

As a Full Stack Engineer, you’ll focus on the backend and product systems that make this usable in production.

You’ll design workflow data models, build high-throughput pipelines, and ship full-stack features used by real customers. This sits across distributed systems, data engineering, and LLM integrations.

Tech stack includes TypeScript (NestJS, React, Vite, TanStack), PostgreSQL, and AWS/GCP, with OpenAI and Anthropic models integrated into core systems.

You’ll join a small, highly technical, Accel-backed team that’s already post-revenue and scaling with enterprise customers. This isn’t speculative infrastructure, it’s being used.

Experience with Python pipelines, Terraform monorepos, or Rust/Swift is useful, but not essential.

What matters is your ability to build systems that hold up in real-world complexity.

📍 San Francisco (on-site)
💰 $160K–$280K base + equity + additional comp

If you’re interested in building the layer that makes AI agents usable, this is where that work is happening.

All applicants will receive a response.

Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 19/03/2026
Job ID: 35404

Rip up the playbook and step into uncharted territory.

If you've been building long-horizon multi-agent systems and pushing the boundaries of AI research, this is a role where curiosity and ambition meet real execution: exploring truly novel problems at the frontier of what's currently possible.

You will work on systems designed to outperform the current state of the art, tackling problems that don't yet have standardised solutions, spanning RL, long-horizon reasoning, LLM post-training for non-myopic objectives, and environment and feedback design.

Whether you're an early-career PhD or highly experienced, what matters most is your ability to turn novel ideas into working systems, applying your knowledge across reasoning, RL, and memory to make real-world impact.

This is a small, ambitious team operating where few others are, building and executing quickly in areas such as computational science R&D. This is your opportunity to shape the systems that generate and validate new discoveries, in an environment primed for success.

Skills & experience
  • PhD and/or publications at top conferences across long-horizon reasoning, RL, or similar
  • Post-training experience (RLHF, DPO, reward modelling)
  • Experience working on open-ended research 
Location: San Francisco
Salary: $400k base + 0.5–1%+ equity, negotiable DOE

All applicants will receive a response. 
Location: San Francisco, CA
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 12/03/2026
Job ID: 35041

Want to own the inference layer behind millions of real-world voice AI interactions every day?

You’ll join a profitable, founder-led enterprise conversational AI company powering billions of interactions annually across 30+ languages. Their systems sit behind major global brands and handle millions of customer conversations daily.

They’re now moving toward end-to-end multimodal and speech-to-speech architectures. You’ll own the inference stack powering both their multimodal speech-text LLM and their text reasoning LLM.

This goes well beyond tuning configs.

You will:

• Optimise production inference across A10, A100 and H100 GPUs
• Own scheduler design, KV cache allocation and batching logic
• Build serving systems tailored to multimodal audio-text workloads
• Support agentic, multi-step reasoning under real latency constraints
• Profile kernel-level bottlenecks and fix them properly

You’ve modified inference framework internals before, not just used them. You’re comfortable in Python and C++, and you’re happy diving into CUDA graphs, memory bandwidth limits or custom kernels when required.

This platform processes over 2 million interactions per day. Latency, throughput and cost are production realities, not lab metrics.

Package: €150,000 base + bonus + stock options
Location: Remote within Europe

If you want full ownership of inference performance at real enterprise scale, let’s talk.

All applicants will receive a response.

Location: Remote
Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 03/03/2026
Job ID: 35239