Job title: Research Scientist
Job type: Permanent
Emp type: Full-time
Industry: AI Agents
Salary type: Annual
Salary: negotiable
Location: San Francisco
Job published: 05/01/2026
Job ID: 33119

Job Description

Interested in advancing how agents and LLMs learn from feedback in realistic environments?

You’ll be joining a research-driven AI company building reinforcement learning simulation environments for agents and large language models, with a focus on post-training, evaluation, and scalable supervision. 

Their tools are already used in production by leading AI labs and enterprises, and the company is growing fast to meet demand.

As a Research Scientist, you’ll work hands-on with fundamental problems spanning LLM post-training, RL environments, and agentic evaluation. Your work will shape core methods and benchmarks, and you’ll see your research deployed into production systems. The team actively publishes and collaborates with external research labs, with recent work appearing at ACL and NeurIPS.

What you’ll do

  • Conduct research on LLM post-training methods (RLHF, RLAIF, RLVR)

  • Design and build realistic RL simulation environments for agents

  • Develop agentic evaluation and supervision frameworks

  • Create and maintain benchmarks for emerging AI capabilities

  • Collaborate with engineers to take research from idea to deployed systems

What you’ll bring

  • Applied research experience in reinforcement learning, LLM post-training, or agent-based systems

  • Strong understanding of transformer architectures and LLM fine-tuning

  • Ability to translate research ideas into working, production-ready systems

Nice to have

  • Publications at top-tier venues (NeurIPS, ICML, ACL, EMNLP)

  • Experience working on evaluation, safety, or oversight for advanced AI systems

  • Prior work on large-scale training or simulation environments

SF-based. Compensation up to $300k base (flexible, DOE) plus equity, unlimited PTO, and benefits.

Interested in working on the foundations of AI training, evaluation, and safety while publishing high-quality research that ships?

All applications will receive a response.