Job Description
Want to own how an AI product actually thinks at scale?
You’ll join a team building one of the largest conversational AI platforms globally, already used by 50M+ people and growing fast. This isn’t an API wrapper or a thin product layer. AI is the product.
You’ll take ownership of the core model behaviour, shaping how the system responds, adapts, and improves across millions of real conversations. That means working where model design meets product reality, where latency, cost, safety, and user experience all collide.
You’ll lead from the front: still hands-on, still in the code, but responsible for the direction.
The work sits across post-training, inference, and system design. You’ll be making decisions that directly affect how users experience the product every day.
Your focus will include:
- Owning LLM behaviour across a high-scale conversational system
- Fine-tuning and adapting open-source models such as Llama, Mistral, and Qwen
- Improving response quality, alignment, and conversational memory
- Designing evaluation pipelines that reflect real user interactions, not just offline benchmarks
- Optimising inference for latency, cost, and reliability at scale
You’ll also lead a small team, setting direction while staying close to implementation. This is not a step away from the work.
There’s real technical ownership here. You’ll define trade-offs across:
- RAG versus fine-tuning approaches
- Model selection and architecture decisions
- Scaling strategies across compute, latency, and cost
You’ll likely have experience building and deploying LLM systems in production, not just experimenting. You understand how models behave in messy, real-world environments and how to improve them iteratively.
Background-wise, you might come from conversational AI, assistants, or agent-based systems. You’ve probably worked with post-training methods like LoRA, QLoRA, SFT, RLHF, or DPO, and you’re comfortable with modern tooling across PyTorch, Hugging Face, and inference stacks.
Why this role?
You’ll be working on a product with real usage at global scale. The feedback loop is immediate. Changes you make will impact millions of interactions.
The team moves quickly. Ideas are tested and shipped in days, not quarters. There’s minimal process overhead and a strong bias toward building.
You’ll also be operating in a product space that brings real complexity, including content moderation and safety challenges. It’s not a clean lab environment; it’s production AI with all the edge cases that come with it.
Package
Salary: ~$200,000 base + ~$80,000 equity
Location: Fully remote (global)
Type: Full-time (B2B contract or direct employment)
If you’re looking to own LLM systems at scale, technically and directionally, this is worth exploring.
All applicants will receive a response.