Azure AI empowers organizations to serve users in more than 100 languages

Microsoft announced today that 12 new languages and dialects have been added to Translator. These additions mean that the service can now translate between more than 100 languages and dialects, making information in text and documents accessible to 5.66 billion people worldwide.

“One hundred languages is a good milestone for us to achieve our ambition for everyone to be able to communicate regardless of the language they speak,” said Xuedong Huang, Microsoft technical fellow and Azure AI chief technology officer.

Translator today covers the world’s most spoken languages including English, Chinese, Hindi, Arabic and Spanish. In recent years, advances in AI technology have allowed the company to grow its language library with low-resource and endangered languages, such as Inuktitut, a dialect of Inuktut that is spoken by about 40,000 Inuit in Canada.

The new languages and dialects taking Translator over the 100-language milestone are Bashkir, Dhivehi, Georgian, Kyrgyz, Macedonian, Mongolian (Cyrillic), Mongolian (Traditional), Tatar, Tibetan, Turkmen, Uyghur and Uzbek (Latin), which collectively are natively spoken by 84.6 million people.

Removing language barriers

Thousands of organizations have turned to Translator to communicate with their members, employees and clients around the world. The Volkswagen Group, for example, is using the machine translation technology to serve its customers in more than 60 languages. The workload involves translating more than 1 billion words each year. The company started with standard Translator models and is using the custom feature in Translator to fine-tune these models with industry-specific terms.

The ability of organizations to fine-tune pre-trained AI models to their specific needs was core to Microsoft’s vision when it launched Azure Cognitive Services in 2015, according to Huang.

In addition to language, Azure Cognitive Services include AI models for speech, vision and decision-making tasks. These models enable organizations to leverage capabilities, such as a Computer Vision technology known as Optical Character Recognition (OCR). This service extracts text entered on a form in any of the more than 100 languages covered by Translator and uses the text to populate a database.

“Not only do we celebrate what we have done on translation – reach 100 languages – but also for speech and OCR as well,” Huang said. “We want to remove language barriers.”

Mongolian (Cyrillic) and Mongolian (Traditional) are among the dozen languages and dialects now available on Translator. In this image, a woman races a horse in the traditional Naadam festival in Bayankhongor province of Mongolia. Photo courtesy of Getty Images.

Multilingual model

The frontier of machine translation technology at Microsoft is a multilingual AI model called Z-code, according to Huang. The model combines several languages from a language family, such as the Indian languages of Hindi, Marathi and Gujarati. In this way, the individual language models learn from each other, which reduces the amount of data required to achieve high-quality translations. For example, the quality of translations to and from Romanian improved when the translation model was trained together with related French, Portuguese, Spanish and Italian data.

“We can leverage the commonality and use that shared transfer learning capability to improve the whole language family,” Huang said.

The reduced data requirements also enable the Translator team to build models for languages that have limited resources or are endangered due to dwindling populations of native speakers. Several of the languages carrying Translator over the 100-language milestone are low-resource or endangered.

Z-code, Huang added, is part of a larger initiative to combine AI models for text, vision, audio and language in order to enable AI systems that can speak, see, hear and understand, and thus more efficiently augment human capabilities. The continual rollout of new languages built with multilingual model training technology is evidence that this so-called XYZ-code vision is coming into focus, he said.

“This is bringing people closer together,” Huang said. “This is the capability already in production because of our XYZ-code vision.”

John Roach writes about Microsoft research and innovation.