Job Description
Lead AI research that shapes how AI evaluates AI
Want to pioneer AI supervision systems that ensure powerful models can be safely trusted? This leadership role offers a rare chance to solve one of AI's most critical challenges: making AI systems that can effectively evaluate other AI.
You'll join a well-funded startup backed by top Silicon Valley VCs and led by a team with elite ML research backgrounds from FAIR, Meta Reality Labs, and quant finance. Their technology is already trusted by OpenAI, HP, and Fortune 100 companies across education, finance, and healthcare.
As Head of AI, you'll lead a team building cutting-edge evaluation systems that leverage memory, long-context reasoning, and multimodal capabilities. Your research won't just advance the field – it will be immediately implemented in products that solve real enterprise AI safety challenges.
The team has published at top ML conferences (NeurIPS, EMNLP, ACL) and developed models and benchmarks used by leading AI companies worldwide. Their evaluators already outperform OpenAI's by 18% in hallucination detection.
Your focus:
Lead a research and ML engineering team developing state-of-the-art AI supervision systems
Solve open research problems in evaluation, explainability, and robustness
Set research vision alongside the CTO and establish rigorous research processes
Guide development of novel benchmarks for SOTA AI systems
Represent the company through publications, speaking, and industry relationships
Build and grow a world-class technical team
You should have:
PhD in Computer Science, Mathematics, Statistics, Linguistics, or a related field
Strong publication record at top AI conferences (NeurIPS, ICML, EMNLP, ACL)
Experience conducting empirical NLP research in academic or industry settings
Deep knowledge of transformer architectures and evaluation frameworks
Experience training language models in applied or research contexts
Ability to communicate complex technical concepts across different audiences
The package:
Competitive salary ($300K-$350K base)
Meaningful equity in a well-funded startup
Performance bonus
Full health, dental and vision coverage
401(k) plan
Unlimited PTO
Regular global team off-site days
Location preference for NYC or SF, with flexibility for exceptional candidates.
If you're passionate about ensuring advanced AI systems can be effectively supervised and evaluated, this role offers the chance to make a significant impact on the future of AI safety and deployment.
Questionnaire
Do you have experience leading research teams? (Yes/No)
Have you published at top scientific AI conferences? (Yes/No)
Are you willing to work out of the San Francisco Bay Area? (Yes/No)
Have you worked on topics such as AI alignment, evals, LLMs as judges, or benchmarking? (Yes/No)