Meta’s New AI Can Translate Nearly 100 Languages

by nativetechdoctor

Facebook’s parent company, Meta Platforms, has just launched an AI model called SeamlessM4T. It can translate and transcribe written and spoken language in nearly 100 languages, combining capabilities that were previously available only in separate models.

In a blog post, the company said that SeamlessM4T can also perform full speech-to-speech translation in 35 languages.

CEO Mark Zuckerberg said he envisions such tools facilitating interaction between users around the globe in the metaverse, a collection of interconnected virtual worlds on which he is betting the company's future.

The blog post says that Meta is working on making this model available to the public for non-commercial use.

The world’s largest social media company has released a series of mostly free AI models this year, including the large language model Llama. This poses a serious challenge to the proprietary models sold by Microsoft-backed OpenAI and Alphabet’s Google.

SeamlessM4T builds on a previous artificial intelligence (AI) project by Meta. In July 2022, the company launched the “No Language Left Behind” project, which uses AI for text-to-text translation across 200 languages, with a focus on improving translations for less widely used languages, according to CNET.

Like many big tech companies, Meta has increased its focus on developing and launching AI-powered tools and services this year. Microsoft, for example, launched a new AI-powered Bing search feature in February, using the same OpenAI technology that powers ChatGPT…

Mr. Zuckerberg has said an open AI ecosystem benefits Meta because the company can effectively draw on the community to create consumer-facing tools for its social platforms rather than charging for access to the models, according to Reuters.

However, Meta faces legal questions around the issue of training data (the initial data they need to create AI models).

In July, comedian Sarah Silverman and two other authors filed a lawsuit against Meta and OpenAI for copyright infringement, accusing the two companies of using their books as training data without permission, according to Reuters.

For the SeamlessM4T model, Meta researchers say they collected audio training data from 4 million hours of raw audio drawn from a publicly available repository of web data, without specifying which repository. A spokesperson for Meta did not respond to questions about the source of the audio data. According to the research paper, the text data comes from datasets created last year that pull content from Wikipedia and affiliated websites.
