Job Details

AI Voice Synthesis Developer

Cambridge, UK - Market Rates
Permanent
Posted by IF Recruitment Ltd
Applicants must be eligible to work in the specified location

We're seeking an exceptional AI Voice Synthesis Developer to join an innovative start-up. The ideal candidate will combine deep technical expertise in text-to-speech (TTS) systems with a passion for creating efficient, production-ready solutions that push the boundaries of what's possible in voice synthesis.

Key Responsibilities

  • Design and implement low-latency TTS systems optimised for minimal computing resources
  • Develop and optimise AI models for real-time voice synthesis
  • Create efficient architectures that balance quality, speed, and resource utilisation
  • Collaborate with team members to integrate voice synthesis capabilities into our products
  • Research and implement state-of-the-art techniques in speech synthesis
  • Contribute to technical architecture decisions and product strategy

Skills Required

  • Strong programming skills with demonstrated experience in AI/ML frameworks (PyTorch, TensorFlow)
  • Expertise in speech processing, digital signal processing (DSP), and audio engineering
  • Advanced Python programming
  • Experience with Azure
  • Proficiency in real-time audio processing to a target latency
  • Experience optimising models for edge deployment
  • Knowledge of audio compression techniques and formats
  • Familiarity with audio quality metrics
  • Experience with audio processing libraries
  • Proficiency in version control (Git) and CI/CD pipelines
  • Previous work on TTS systems (commercial or lab)
  • Background in voice conversion or voice cloning technologies

AI/ML Platform Experience

  • Experience with Groq for high-performance inference
  • Familiarity with Deepgram's API and speech-to-text capabilities
  • Knowledge of large language model deployment and optimisation

Speech Technology Expertise

  • Deep understanding of modern TTS architectures:
    • Non-autoregressive models (FastSpeech 2, Glow-TTS)
    • Autoregressive models (Tacotron 2, YourTTS)
    • Flow-based models (Flow-TTS, WaveFlow)
  • Experience with vocoders:
    • HiFi-GAN
    • WaveNet
    • UnivNet
    • BigVGAN

Location: Cambridge, UK
Start date: ASAP
Salary: Market Rates
Agency: IF Recruitment Ltd
Contact: Melanie Bosley
Telephone: 02033624159
Reference: JS/2410AIDEV/MB
Posted: 24/10/2024 15:39:40