Job title: Research Scientist
Job type: Permanent
Employment type: Full-time
Industry: AI Agents
Salary type: Annual
Salary: negotiable
Location: San Francisco
Job published: 22/08/2025
Job ID: 33119

Job Description

Want to build simulated RL environments that push frontier models to their limits?

This role is about advancing the science of post-training, reinforcement learning, and scalable evaluation. Instead of static benchmarks, you’ll create dynamic simulations that probe reasoning, planning, and long-horizon behaviour — work that defines how the next generation of AI will be trained and supervised.

You’ll design new post-training algorithms (RLHF, DPO, GRPO and beyond), develop reward models that move beyond exact-match signals, and publish your findings while seeing them deployed in production systems. The work spans both core research and practical implementation, giving you the chance to shape frameworks already being adopted by industry leaders.

We’re looking for:

  • Research experience in post-training or RL methods with LLMs.

  • Strong background in transformers and evaluation frameworks.

  • Publication record at top venues (NeurIPS, ICLR, ICML, ACL, EMNLP).

  • PhD in CS/ML/NLP/RL or equivalent research experience.

Package: Up to $300k base (DOE) + meaningful equity, with comprehensive benefits, 401(k), unlimited PTO, relocation support and sponsorship available. San Francisco is the preferred location, with NYC also considered.

Ready to help define how AI learns and is evaluated in simulated environments?
All applicants will receive a response.