Job title: Lead/Staff DL Scientist - Speech Synthesis
Job type: Permanent
Emp type: Full-time
Industry: Speech technology
Skills: Machine Learning, Deep Learning, Speech, TTS, Voice conversion, Python, SOTA Deep Learning, PyTorch, GANs, Diffusion
Salary type: Annual
Salary: negotiable
Location: Remote, global
Job published: 21/02/2024
Job ID: 32242

Job Description

Are you an adept deep learning scientist looking to work on advanced speech AI technology?

This company is actively looking for an experienced Research Scientist for a Lead Researcher position focussed on text-to-speech (TTS). You’ll pioneer deep learning and machine learning models and play a role in development within the Science team.

Key Responsibilities:

  • ML Model Refinement: Collaborate on honing architectures, fine-tuning parameters, and training text-to-speech and speech-to-speech machine learning models.
  • Experimentation and Refinement: Conduct research experiments to iterate and refine machine learning models, aiming for heightened performance.
  • Collaboration: Work closely with peers in scientific and engineering roles, contributing to the development and seamless integration of machine learning models. Collaboration and communication are essential to leading this globally distributed team.
  • Cloud-Based Streamlining: Ensure the effective execution of models and overall inference pipelines on cloud platforms.

Required Qualifications:

  • Professional Background: Over 5 years in deep learning, including 2+ recent years in text-to-speech and voice conversion.
  • Team Leadership: 2+ years leading teams.
  • Technical Competence: Hands-on involvement with Python development and debugging.
  • Expertise in Deep Learning: Proficiency in various deep learning methodologies.
  • State-of-the-art: Experience with various SOTA deep learning techniques for TTS, including diffusion models, GANs, VAEs, vocoders, encoders, decoders, and language modelling for TTS.
  • Educational Foundation: PhD preferred, Master's considered (in areas such as Computer Science, Deep Learning, Machine Learning, Speech/Language, Dialogue Systems).

This is a remote position. Applications from across the globe are welcome, and adjustments are always made to accommodate those in different time zones. Salary will depend on location and experience: the estimated salary for the United States is $200k, rising to up to $220k for tech hubs such as San Francisco, Seattle, New York, Los Angeles or Austin.

If you are enthusiastic about breaking new ground in speech AI, possess substantial expertise in deep learning, and have a history of driving innovation, we encourage you to apply.