Microsoft creates an AI voice generator so dangerous that it cannot be released

nativetechdoctor, July 18, 2024

Microsoft has developed an advanced text-to-speech AI model known as VALL-E 2, which can closely mimic human speech with exceptional accuracy. The technology has reached a level of sophistication where it can reproduce the voice of the original speaker with high naturalness and precision. However, due to concerns about potential fraudulent use and impersonation, the company has opted not to make this AI model available to the public.

VALL-E 2 represents a significant achievement in text-to-speech synthesis, producing human-like voice quality. Microsoft’s internal benchmarks indicate that its output can match or even surpass human speech in certain cases. The company’s researchers conducted experiments on the LibriSpeech and VCTK datasets, demonstrating that VALL-E 2 outperforms previous zero-shot TTS systems in robustness, naturalness, and voice similarity. According to the researchers, it is the first system to reach human parity on these benchmarks.

While Microsoft has emphasized that VALL-E 2 is solely a research project with no current plans for public release, the company has outlined potential applications in various industries such as education, journalism, self-authored content, accessibility features, voice response systems, translation, and chatbots.
