The feature, called Looking to Listen, uses both audio and visual cues to improve the sound quality of your videos.
According to Engadget, you may soon notice a marked improvement in the sound quality of YouTube Stories thanks to the new feature. Several years ago, Google introduced an AI technology called Looking to Listen that can isolate a speaker's voice from a crowd. The company is now bringing it to YouTube Stories on iOS devices.
Google trained Looking to Listen on a large collection of online videos, learning the correlation between a speaker's voice and visual signals such as mouth movements and facial expressions. To make sure the feature works for everyone and is not biased, Google ran a series of tests measuring its effectiveness across different auditory and visual attributes. These attributes include the speaker's age, skin tone, language, voice pitch, facial visibility, head position, facial hair, presence of glasses, and level of background noise. Google found that the technology's speech-enhancement quality remained fairly consistent regardless of the speaker's language.
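The sliced evaluation Google describes can be sketched as follows: given per-clip quality scores tagged with a speaker attribute (language, here), average the scores within each group and check that the spread between groups stays small. All names and numbers below are synthetic placeholders for illustration, not Google's actual data or metric.

```python
from collections import defaultdict

# Synthetic per-clip results: (attribute_value, quality_score).
# In a real evaluation the score would be an objective speech-quality
# metric (e.g. SNR improvement) computed on each test clip.
results = [
    ("english", 7.1), ("english", 6.8),
    ("spanish", 7.0), ("spanish", 6.9),
    ("mandarin", 6.7), ("mandarin", 7.2),
]

def mean_by_group(pairs):
    """Average the metric within each attribute group."""
    sums = defaultdict(lambda: [0.0, 0])
    for group, score in pairs:
        sums[group][0] += score
        sums[group][1] += 1
    return {g: total / n for g, (total, n) in sums.items()}

by_language = mean_by_group(results)
# A small spread across groups suggests the system behaves
# consistently regardless of the attribute being sliced on.
spread = max(by_language.values()) - min(by_language.values())
```

The same grouping can be repeated for each attribute (age, voice pitch, background-noise level, and so on) to build a full fairness report.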
The company also explains how it has improved this type of technology over the years. The engineers made sure all processing happens on the device, so no data needs to be sent to a remote server. They also used a technique to quickly extract face thumbnail images from the video, which lets Looking to Listen work while the video is still being recorded. In addition, Google significantly reduced processing time: the system needs only a few seconds to process a roughly 15-second video.
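Speech enhancement of this kind typically works by estimating a time-frequency mask over the noisy audio's spectrogram; in Looking to Listen a neural network predicts that mask from the audio together with embeddings of the face thumbnails. The numpy sketch below illustrates only the masking step, using an "oracle" mask computed from ground truth in place of a trained model; every function name here is illustrative, not Google's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
voice = np.sin(2 * np.pi * 220 * t)        # stand-in for the speaker's voice
noise = 0.8 * rng.standard_normal(len(t))  # stand-in for background chatter

def spec(x, frame=256, hop=128):
    """Naive windowed magnitude-phase spectrogram (stand-in for a real STFT)."""
    win = np.hanning(frame)
    frames = np.stack([x[i:i + frame] * win
                       for i in range(0, len(x) - frame, hop)])
    return np.fft.rfft(frames, axis=1)

V, N = spec(voice), spec(noise)
M = spec(voice + noise)

# Oracle soft mask in [0, 1]: in the real system a neural network predicts
# this from the mixture audio plus face-thumbnail embeddings; here we cheat
# by using the clean sources directly.
mask = np.abs(V) / (np.abs(V) + np.abs(N) + 1e-8)
enhanced = np.abs(M) * mask  # suppress bins dominated by noise

def err(est):
    """Distance of a magnitude estimate from the clean voice spectrogram."""
    return np.linalg.norm(est - np.abs(V))
```

After masking, `err(enhanced)` is much smaller than `err(np.abs(M))`: the mask keeps the bins where the voice dominates and attenuates the rest, which is the same intuition behind the on-device model, just without the learned visual conditioning.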
To enable this feature, users just need to turn on "Enhance speech" in the volume controls on iOS.