OpenAI recently announced an upgrade to ChatGPT, based on the new DALL-E, that adds support for image and voice input.
Google has built Bard into many of its applications, such as Gmail, YouTube, Google Maps, and Flights, which gives it a big advantage over ChatGPT. In response, OpenAI says the free version of ChatGPT will soon accept voice and image input.
This means users can talk to ChatGPT in a more natural way instead of typing on iPhone and Android, and can even use images to get better answers. Most importantly, users won’t have to pay for ChatGPT Plus to receive the update, although premium accounts will be the first to try it out.
Plus and Enterprise account users will receive this update over the next two weeks, followed by other user groups, including developers. Accepting images as input is an example of how multimodal AI models work. It’s similar to how Google uses AI in Google Lens.
Meanwhile, the voice feature will only be available in the ChatGPT app for iPhone and Android. Users simply need to enable it in the app’s settings once the feature has rolled out to them. OpenAI says the feature uses a new text-to-speech model that can generate human-like audio from text and just a few seconds of sample speech.
This technology can create realistic synthetic voices from just a few seconds of real speech, opening the door to many creative and accessibility-focused applications. However, it also poses new risks, such as impersonation of famous people or fraud. OpenAI also said it is working with Spotify to test a voice translation feature for podcasts, which lets creators translate their content into other languages in their own voice.