Job Description
Rip up the playbook and step into uncharted territory.
If you've been building long-horizon multi-agent systems and pushing the boundaries of AI research, this is a role where curiosity and ambition meet real execution: you will explore truly novel problems at the frontier of what's currently possible.
You will work on systems designed to outperform the current state of the art, tackling problems that don't yet have standardised solutions across RL, long-horizon reasoning, LLM post-training for non-myopic objectives, and environment and feedback design.
Whether you're an early-career PhD or highly experienced, what matters most is your ability to turn novel ideas into working systems and apply your knowledge across reasoning, RL and memory to make real-world impact.
This is a small, ambitious team operating where few others are, building and executing quickly in areas such as computational R&D science. This is your opportunity to shape the systems that generate and validate new discoveries in an environment primed for success.
Skills & experience
- PhD and/or publications at top conferences across long-horizon reasoning, RL, or similar
- Post-training experience (RLHF, DPO, reward modelling)
- Experience working on open-ended research
Location- San Francisco
Salary- $400k base, 0.5–1%+ equity, negotiable DOE
All applicants will receive a response.