
Interested in working on the models behind superintelligence?

This role focuses on developing AI systems designed to accelerate scientific discovery, from generating hypotheses and new datasets through to powering robotic labs capable of running real-world experiments autonomously.

You’ll join a frontier research team building large-scale models that reason, plan, and act across complex environments, not just generate outputs.

The work sits across agentic systems, reasoning, continual learning, inference-time scaling, reinforcement learning, and large-scale evaluation. The team is already training models beyond 100B parameters and continuing to push architecture and efficiency improvements across next-generation systems.

You’ll work closely with researchers and engineers tackling some of the hardest problems in modern AI, including scalable reasoning, long-horizon planning, model behaviour, evaluation, and efficient deployment of large frontier systems.

The ideal candidate has worked hands-on with large-scale training environments and modern deep learning systems.

You’ll bring:
- Experience working on large models (ideally 30B+ parameters)
- Exposure to MoE or large-scale distributed training
- Background in LLM post-training, RL, or agent systems
- Experience in a frontier lab or similarly ambitious environment

Publications are valued, but practical experience building and scaling systems carries equal weight.

This is the type of environment many people at top labs leave to build themselves. The difference here is you get the backing, compute, and team already in place.

Salary: $350k - $450k Base DOE + Sizeable Stock Options and Bonuses
Location: SF hybrid, US remote, or London

If you’re thinking seriously about where frontier model development is heading next, this is worth exploring.

All applicants will receive a response.

Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 30/04/2026
Job ID: 35932

Are you looking to scale GPU infrastructure up to and beyond 10,000 GPUs?

You'll help push an already high-performing team past its current operating level: scaling training workloads, improving cluster reliability and utilization, and building systems that hold up under real pressure.

Your focus will be on distributed training and GPU infrastructure, making large-scale training actually usable for researchers—not just possible.

You'll be working across frontier model training, scientific workloads, and robotics environments, so you're dealing with high-throughput systems and real-world constraints, not just controlled experiments.

You'll join a team that owns compute end-to-end—infra, systems, and operations—working closely with researchers to make training at this scale reliable.

They've raised over $500M, have real customers, and are now integrating models directly into robotics environments and beyond.

Key experience

  • Experience scaling GPU infrastructure from 2,000 to 10,000+ GPUs
  • Experience with Ray, Slurm or similar
  • Experience supporting core model training

The culture is collaborative and hands-on:

  • Strong focus on knowledge sharing and upskilling
  • Cross-team collaboration with researchers
  • 6-week cycles to allow deep focus and meaningful impact
  • A team that works hard but also likes to keep it fun

Up to $350k base + bonus + equity DOE.

Remote across the US or hybrid options available in SF.

All applicants will receive a response. 

Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 08/04/2026
Job ID: 35635

Want to push the boundaries of what reinforcement learning can achieve with frontier models?

In this role you will advance reinforcement learning methods for large-scale AI systems, applying RL techniques to enhance reasoning, planning, and decision-making in models that directly impact fields from biology to climate and materials science.

Your work will combine RL with large language models: experimenting with RLHF, PPO, and DPO, designing evaluation frameworks, and fine-tuning models at scale. The aim is to go beyond benchmarks and deliver models that researchers can use to accelerate discovery.

You will be a driving force in a team that is building towards a broader superintelligence platform: models that don’t just generate text or data, but drive breakthroughs across multiple domains. As part of this, you’ll collaborate with domain experts to ensure your research translates into real-world scientific progress.

You should bring:

  • Deep expertise in reinforcement learning (policy optimisation, value-based, or model-based methods).

  • Experience applying RL to large models (RLHF, PPO, DPO).

  • Hands-on experience with model training and fine-tuning at scale.

  • PhD in Computer Science, Machine Learning, Robotics, or related field, with contributions to top-tier conferences (NeurIPS, ICML, ICLR, AAAI).

  • Experience with distributed computing platforms (cloud or HPC clusters).

  • Track record of running rigorous experiments and improving models based on results.


If you have experience with multi-agent RL, hierarchical/offline RL, or domain-specific work with scientific datasets, you will be an ideal candidate for this position.

Package: $250k - $400k base + bonus + stock 
Location: SF Bay Area, with potential for remote work and travel to the office when needed.

If you want to see your RL research power the next generation of superintelligence, this is the role for you!

All applicants will receive a response.

Job type: Permanent
Emp type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 20/10/2025
Job ID: 33780