https://www.infoq.com/news/2023/01/microsoft-text-to-speech-valle/
Microsoft has introduced VALL-E, a language-model approach to text-to-speech (TTS) synthesis that uses audio codec codes as intermediate representations and can replicate a person's voice from just three seconds of recorded audio.