
  • 18 Jun, 2024

Meta's AI can translate many low-resource languages

A paper published in the journal Nature describes the technology behind a Meta artificial intelligence model that can translate 200 different languages. The model expands the number of languages that can be handled by machine translation.

Neural machine translation models use artificial neural networks to translate languages. These models usually require large amounts of training data available online, which can be scarce or missing for some languages, the so-called "low-resource languages". Increasing the number of languages a model translates can also degrade its translation quality.

Marta Costa-jussà and the No Language Left Behind (NLLB) team have developed an approach that allows neural machine translation models to learn to translate low-resource languages by building on their existing ability to translate high-resource languages. As a result, the researchers developed an online translation tool called NLLB-200, which covers 200 languages, includes three times as many low-resource languages as high-resource languages, and performs 44% better than existing systems.

Because the researchers had only 1,000-2,000 samples for many of the low-resource languages, they used a language identification system to find more examples of those languages and so increase the amount of training data for NLLB-200. The team also mined bilingual text data from online archives, which helped improve the quality of the translations provided by NLLB-200.
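The idea of using language identification to gather more training text can be illustrated with a toy sketch. This is not the NLLB team's system: it assumes a very simple character-trigram classifier, and the names `SEED_SENTENCES`, `build_profiles`, `identify` and `mine` are hypothetical, introduced only for this illustration. English and Spanish stand in for real low-resource languages.

```python
from collections import Counter

# Hypothetical seed data: a few labelled sentences per language.
SEED_SENTENCES = {
    "eng": ["the cat sat on the mat", "this is the house that we like"],
    "spa": ["el gato está en la casa", "esta es la casa que nos gusta"],
}


def char_trigrams(text):
    """Count character trigrams of a lowercased, space-padded sentence."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def build_profiles(labelled):
    """Turn labelled sentences into per-language trigram frequency tables."""
    profiles = {}
    for lang, sentences in labelled.items():
        counts = Counter()
        for sentence in sentences:
            counts.update(char_trigrams(sentence))
        total = sum(counts.values())
        profiles[lang] = {gram: n / total for gram, n in counts.items()}
    return profiles


def identify(text, profiles):
    """Return the language whose trigram profile best overlaps the text."""
    grams = char_trigrams(text)
    scores = {
        lang: sum(profile.get(gram, 0.0) * n for gram, n in grams.items())
        for lang, profile in profiles.items()
    }
    return max(scores, key=scores.get)


def mine(unlabelled, profiles, target):
    """Keep only the unlabelled sentences identified as the target language."""
    return [s for s in unlabelled if identify(s, profiles) == target]


if __name__ == "__main__":
    profiles = build_profiles(SEED_SENTENCES)
    crawl = ["the dog is on the mat", "la casa es grande", "el gato nos gusta"]
    print(mine(crawl, profiles, "spa"))
```

A real pipeline would train the identifier on far more data and on many more languages, but the flow is the same: learn what each language looks like from a small labelled set, then use that to pull matching sentences out of unlabelled text.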

The authors note that this tool could help people who speak rarely translated languages to access the internet and other technologies. They also highlight education as an especially important application, because the model could help speakers of low-resource languages to access more books and research articles. However, Costa-jussà and her co-authors acknowledge that translation errors can still occur.