Galaxy AI has already helped millions of users around the world connect and communicate. On-device AI features based on large language models (LLMs), including Live Translate, Interpreter, Note Assist and Browsing Assist, support 16 languages, with four more coming by the end of the year.
Building language features for Galaxy AI took significant time and effort, as each language carries its own structure and culture. Samsung researchers from around the world (in Brazil, China, India, Indonesia, Japan, Jordan, Poland and Vietnam) shared the challenges and triumphs behind the development of Galaxy AI. Samsung Newsroom compiled a recap of their stories below.
Developing a Translation Model
Galaxy AI features such as Live Translate perform three core processes: automatic speech recognition (ASR), neural machine translation (NMT) and text-to-speech (TTS).
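To make the flow concrete, here is a minimal sketch of how these three stages could be chained in Python. The stub functions are illustrative placeholders only, not Galaxy AI or Samsung APIs.

```python
# Minimal sketch of a speech-to-speech translation flow (ASR -> NMT -> TTS).
# All functions here are illustrative stubs, not Galaxy AI or Samsung APIs.

def recognize_speech(audio: bytes, lang: str) -> str:
    """ASR stub: turn spoken audio into text in the source language."""
    return "hello"  # placeholder transcript

def translate_text(text: str, src: str, tgt: str) -> str:
    """NMT stub: translate source-language text into the target language."""
    return "xin chào" if (src, tgt) == ("en", "vi") else text  # placeholder

def synthesize_speech(text: str, lang: str) -> bytes:
    """TTS stub: render target-language text as audio."""
    return text.encode("utf-8")  # placeholder "audio"

def live_translate(audio: bytes, src: str, tgt: str) -> bytes:
    text = recognize_speech(audio, lang=src)              # 1. speech -> text
    translated = translate_text(text, src=src, tgt=tgt)   # 2. text -> text
    return synthesize_speech(translated, lang=tgt)        # 3. text -> speech

print(live_translate(b"...", src="en", tgt="vi"))
```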
Samsung R&D Institute Vietnam (SRV) faced obstacles with ASR models because Vietnamese has six distinct tones. Tonal languages can be difficult for AI to recognize because of the complexity tones add to linguistic nuances. SRV responded to the challenge with a model that distinguishes between shorter audio frames of around 20 milliseconds.
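As a rough illustration of working at that granularity, the sketch below splits a waveform into overlapping 20-millisecond windows. The sample rate, overlap and code are assumptions for illustration, not details of SRV's model.

```python
# Illustrative framing of audio into short (~20 ms) windows for ASR feature extraction.
import numpy as np

def frame_audio(samples: np.ndarray, sample_rate: int = 16000,
                frame_ms: int = 20, hop_ms: int = 10) -> np.ndarray:
    """Split a mono waveform into overlapping frames of frame_ms milliseconds."""
    frame_len = int(sample_rate * frame_ms / 1000)   # e.g. 320 samples at 16 kHz
    hop_len = int(sample_rate * hop_ms / 1000)       # 50% overlap between frames
    n_frames = 1 + max(0, (len(samples) - frame_len) // hop_len)
    return np.stack([samples[i * hop_len : i * hop_len + frame_len]
                     for i in range(n_frames)])

# Example: one second of audio at 16 kHz yields 99 overlapping 20 ms frames.
frames = frame_audio(np.zeros(16000))
print(frames.shape)  # (99, 320)
```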
Samsung R&D Institute Poland (SRPOL) faced the mammoth hurdle of training NMT models for a continent as diverse as Europe. Leveraging its rich pool of experience from projects spanning more than 30 languages across four time zones, SRPOL was able to navigate the untranslatability of certain phrases and handle idiomatic expressions that lack direct equivalents in other languages.
Samsung R&D Institute Jordan (SRJO) adapted Arabic, a language spoken across more than 20 countries in about 30 dialects, for Galaxy AI. Creating a TTS model was no small endeavor since diacritics, the marks that guide pronunciation, are typically omitted in writing because native Arabic speakers infer them from context. Using a sophisticated prediction model for missing diacritics, SRJO was able to deliver a language model that understands dialects and can answer in standard Arabic.
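As a toy illustration of the underlying idea, the snippet below restores diacritics with a simple lookup. A real diacritization system would be a trained sequence model; the single dictionary entry here is purely hypothetical.

```python
# Toy illustration of diacritic restoration: map an undiacritized Arabic word to one
# likely fully diacritized reading. A production system would be a trained sequence
# model; this single-entry lookup is purely hypothetical.
RESTORATIONS = {
    "كتب": "كَتَبَ",  # kataba, "he wrote" (one of several possible readings)
}

def add_diacritics(word: str) -> str:
    # Fall back to the bare form when the word is unknown.
    return RESTORATIONS.get(word, word)

print(add_diacritics("كتب"))  # prints the diacritized form
```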
The Importance of Data
Throughout the process of training Galaxy AI in each language, an overarching theme was the importance of open collaboration with local institutions. The quality of data used directly affects the accuracy of ASR, NMT and TTS. So Samsung worked with various partners to obtain and review data that reflected each region’s jargon, dialects and other variations.
Samsung R&D Institute India-Bangalore (SRI-B) collaborated with the Vellore Institute of Technology to secure almost a million lines of segmented and curated audio data covering conversational speech, words and commands. The students got hands-on experience with a real-life project as well as mentorship from Samsung experts; the rich store of data helped SRI-B train Galaxy AI in Hindi, covering more than 20 regional dialects and their respective tonal inflections, punctuation and colloquialisms.
Local linguistic insights were imperative for the Latin American Spanish model because the diversity within the language is mirrored by the diversity of its user base. For example, the word for swimming pool could be alberca (Mexico), piscina (Colombia, Bolivia, Venezuela) or pileta (Argentina, Paraguay, Uruguay), depending on the region. Samsung R&D Institute Brazil (SRBR) worked with the science and technology institutes SiDi and Sidia to collect and manage massive amounts of data as well as to refine and improve the audio and text sources for Galaxy AI’s Latin American Spanish model.
Samsung R&D Institute China-Beijing (SRC-B) and Samsung R&D Institute China-Guangzhou (SRC-G) partnered with the Chinese companies Baidu and Meitu to leverage their expertise in developing large language models such as ERNIE Bot and MiracleVision, respectively. As a result, Galaxy AI supports the two main spoken forms of Chinese: Mandarin and Cantonese.
In addition to external cooperation, due diligence and internal resources were essential.
Bahasa Indonesia is a language notorious for its extensive use of contextual and implicit meanings that rely on social and situational cues. Samsung R&D Institute Indonesia (SRIN) researchers went out into the field to record conversations in coffee shops and working environments to capture authentic ambient noises that could distort input. This helped the model learn to recognize the necessary information from verbal input, ultimately improving the accuracy of speech recognition.
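One generic way such field recordings can be used is noise augmentation, where ambient noise is mixed into clean speech at a chosen signal-to-noise ratio before training. The sketch below shows that general technique; it is an assumption for illustration, not SRIN's actual pipeline.

```python
# Illustrative noise augmentation: overlay recorded ambient noise onto clean speech
# at a target signal-to-noise ratio (SNR). Generic technique, not SRIN's pipeline.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Overlay ambient noise onto clean speech at the requested SNR (in dB)."""
    noise = np.resize(noise, speech.shape)               # tile/trim noise to match length
    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

# Example: mix a synthetic "speech" signal with random "cafe noise" at 10 dB SNR.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of a 440 Hz tone
noise = rng.normal(size=8000)                                 # shorter noise clip
noisy = mix_at_snr(speech, noise, snr_db=10.0)
print(noisy.shape)  # (16000,)
```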
Japanese has many homonyms because the language uses a limited set of sounds, so many words can only be identified from context. Samsung R&D Institute Japan (SRJ) used Samsung Gauss, the company’s internal LLM, to structure contextual sentences with words or phrases relevant to each scenario, helping the AI model differentiate between homonyms.
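A hedged sketch of that approach: for each reading of a homophonous word, ask an LLM for short sentences that use the word in context. The readings listed and the generate() call below are illustrative assumptions, not Samsung Gauss internals.

```python
# Sketch: build prompts asking an LLM for context sentences per homonym reading.
# The example readings and the generate() call are hypothetical placeholders.
HOMONYM_READINGS = {
    "こうえん": ["公園 (park)", "講演 (lecture)", "公演 (performance)"],
}

def build_prompts(reading: str) -> list[str]:
    return [
        f"Write a short everyday sentence that naturally uses the word {word}."
        for word in HOMONYM_READINGS[reading]
    ]

for prompt in build_prompts("こうえん"):
    print(prompt)
    # text = generate(prompt)  # hypothetical LLM call; the resulting sentences would
    #                          # give the model disambiguating context for each reading
```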
Samsung’s Global Research Network
The professionals across various Samsung R&D Institutes made full use of Samsung’s global research network.
Before tackling Hindi, SRI-B collaborated with teams around the world to develop AI language models for British, Indian and Australian English as well as Thai, Vietnamese and Indonesian. Engineers from other Samsung research centers visited Bangalore, India, to bring Vietnamese, Thai and Indonesian to Galaxy AI.
SRPOL had extensive experience developing ASR, NMT and TTS models for a multitude of languages. A key player in Galaxy AI’s language expansion, SRPOL collaborated across continents to support SRJO with Arabic dialects and SRBR with Brazilian Portuguese and Latin American Spanish.
Samsung developers at each of these locations learned to collaborate across borders and time zones. Developers from SRIN even observed the local fasting customs in India when meeting their SRI-B colleagues. Many reflected on their work with pride and gratitude, realizing the lasting implications this project has for language, culture, heritage and identity.
Ongoing Efforts as the Journey Continues
Samsung recently introduced Galaxy AI to its latest foldables and wearables. Since its release earlier this year, Galaxy AI has already been used on more than 100 million devices. “We’re expecting to reach 200 million devices by the end of 2024,” said Won-joon Choi, EVP and Head of the Mobile R&D Office, Mobile eXperience Business at Samsung Electronics, at a recent panel discussion.
Amid this mission to democratize AI, it is important to look back and celebrate the accomplishments and progress that have delivered this safe, inclusive technology to benefit humanity and improve lives. By building up the Galaxy AI ecosystem with even more features, languages and regional variations, Samsung is facilitating cross-cultural exchanges in unprecedented ways to realize its vision of AI for All.
Source: Samsung Mobile Blog