Job Description
Looking to solve complex ASR challenges at scale?
You'll be joining an established conversational AI company whose in-house speech models process billions of interactions annually, where your ASR expertise will directly impact real-world customer experiences.
You'll be tackling demanding ASR problems in production: streaming speech recognition in noisy conditions, robust accent handling, and sustained high performance at scale.
Working across key production systems, you'll enhance speech capabilities and ship new features that improve recognition in challenging acoustic environments.
Your focus
- Maintain and iteratively improve existing ASR technology while introducing cutting-edge enhancements
- Work end-to-end across speech processing components: speech enhancement, VAD, diarisation, and ASR (AM/LM modelling, ASR biasing)
- Build streaming ASR systems optimised for challenging acoustic environments
- Implement emotion detection and acoustic condition classification capabilities
- Run extensive experiments to advance voice activity detection and speech processing performance
What you'll bring
- Strong background in ASR model development and deployment
- Hands-on experience with state-of-the-art speech toolkits and models (Kaldi, k2, NVIDIA NeMo, Parakeet)
- Proven streaming ASR experience in production environments
This is a fully remote role. You must be based in, or close to, an EU time zone.
Ready to make your mark on speech tech that millions rely on daily? Apply today.