
Recording for the Multilingual Cloud – Digital Resources for Languages of Bangladesh. Image via EBLICT Project, BCC. Used with permission.
Many endangered languages in Bangladesh are spoken by only a handful of people. For instance, the Kharia language is spoken by just five people. Once these speakers are gone, the language may disappear entirely. To help preserve such languages, the Bangladesh government’s Information and Communication Technology Division (ICTD) under the Ministry of Posts, Telecommunications, and Information Technology has launched a project to digitize ethnic languages. This initiative is part of ICTD's Enhancement of Bangla Language in ICT through Research & Development (EBLICT) project implemented by the Bangladesh Computer Council.
In July 2025, a website titled Multilingual Cloud was launched under this initiative to digitally preserve 42 languages. Designed as an Indigenous language repository under the bangla.gov.bd portal, the site offers a rich collection of words, phrases, and International Phonetic Alphabet (IPA) transcriptions from diverse languages of Bangladesh, aiming to promote understanding and celebrate the nation’s cultural and ethnic diversity.
Speaking about the website, Mamun Or Rashid, a consultant for the EBLICT Project and a faculty member at Jahangirnagar University, told Global Voices in a phone interview:
To preserve Bangladesh's endangered languages, this platform has systematically collected and safeguarded both their written and unwritten forms.
On August 9, the International Day of the World's Indigenous Peoples, he also shared his reflections in a Facebook post:
‘আজকে ওয়ার্ল্ড'স ইনডেজেনাস পিপল ডে। আমি যতগুলো কাজ করতে পেরেছি তার মধ্যে যেগুলো সবচেয়ে ইমপেক্টফুল তারমধ্যে একটা হলো, এই দেশের ৪২টি ভাষা ডিজিটাইজ করে সংরক্ষণের কাজ করা। বাংলাদেশের বিপন্ন ভাষাগুলো টিকিয়ে রাখার জন্য আমরা মাল্টিলিংঙ্গুয়াল ক্লাউড (multiling.cloud) তৈরি করেছি। [..]
প্ল্যাটফর্মটিতে ৭,১৭৭টি টপিক রয়েছে, যা প্রতিটি ভাষার সংরক্ষণের নমুনা হিসেবে কাজে লাগছে। এছাড়া, এখানে IPA-এর মাধ্যমে ৯৭,৭৮২টি বাক্যের সঠিক উচ্চারণ সংরক্ষিত হয়েছে এবং ২১৪ জন নেটিভ স্পিকারের কাছ থেকে সংগৃহীত ১২,৬৪৬ মিনিটের অডিও রেকর্ড করা হয়েছে। এর মাধ্যমে বিপন্ন ও স্বল্পপরিচিত ভাষাগুলো শুধু ভাষাগত নয়, বরং সাংস্কৃতিক বৈশিষ্ট্যও সংরক্ষিত হয়েছে। সবাইকে শুভেচ্ছা।’
Today is World Indigenous Peoples’ Day. Out of all the work I’ve done, one of the most impactful is the preservation of 42 languages of this country by digitizing them. We created the Multilingual Cloud (multiling.cloud) portal to preserve Bangladesh's endangered languages. [..]
The platform includes 7,177 topics that serve as preservation samples for each language. In addition, it has documented the correct pronunciation of 97,782 sentences through the International Phonetic Alphabet (IPA) and recorded 12,646 minutes of audio from 214 native speakers. This effort has preserved not only the linguistic but also the cultural characteristics of endangered and lesser-known languages. Best wishes to all.
One of the languages featured on the website is Khasi, an Austroasiatic language spoken by the Khasi people in the northeastern Sylhet region of Bangladesh. The language does not have its own alphabet or script.
Since the early 1800s, Khasi has been written using the Roman alphabet, with the first book in the language published by missionary William Carey in 1814. Khasi has a rich oral tradition, along with a history of stories, grammar books, and religious texts. In Khasi villages, many schools teach children in their mother tongue to help sustain the language. The local community is also working to preserve and promote Khasi through education and writing.

Data collection from Indigenous villages for the portal Multilingual Cloud – Digital Resources for Languages of Bangladesh. Image via EBLICT Project, BCC. Used with permission.
As part of the preservation effort, voice data and video documentation of the Khasi language have been digitized in the portal. More than 300 minutes of Khasi audio have been recorded across 151 topics and are now available online. These recordings include stories about village life, generational differences, traditional songs, medicine, memories of animals, and experiences during the COVID-19 pandemic.
Disappearing languages around the world
Many languages around the world are disappearing. According to UNESCO, one language dies every 14 days. Out of more than 7,000 languages globally, about 2,500 are endangered. A survey conducted by the International Mother Language Institute (IMLI) in Dhaka found that 14 languages in Bangladesh are at risk of extinction: Konda, Kharia, Koda, Sauria, Munda, Kol, Malto, Khumi, Pangkho, Rengmitcha, Chak, Khyang, Lusai, and Lalen.
In 2022, officials created a national digital language resource repository to preserve Indigenous and local languages in Bangladesh. The repository aims to safeguard the languages of small ethnic groups, as well as other languages spoken across the country, by storing samples of 42 languages.
Of these languages, 26 have a written form. Four use the Bengali script: Hajong, Sadri, Koda, and Bishnupriya Manipuri. Eight have their own scripts: Meitei Manipuri, Chak, Chakma, Tanchangya, Marma, Rakhine, Urdu, and Mro. Fourteen are written in the Roman script: Bom, Kol, Kokborok, Khasi, Garo, Lusai, Mahali, Pangkho, Abeng, Attang, Migam, Koch, Khyang, and Khumi.

Data collection for the portal Multilingual Cloud – Digital Resources for Languages of Bangladesh. Image via EBLICT Project, BCC. Used with permission.
The challenge of digitizing languages in Bangladesh
According to a 2025 study by Ritesh Karmakar, Indigenous languages in Bangladesh are fading quickly because they remain absent from schools and public domains, including online spaces, which endangers both cultural identity and community bonds. Their marginalization stems largely from national policies that prioritize Bangla as the only language used in schools, government, and public life.
At a recent seminar organized as part of this initiative, attendees shared that over 50 ethnic minority groups in Bangladesh are struggling to keep their languages alive. Almost everyone in these communities must learn Bengali as a second language, either in school or for their livelihood, alongside their mother tongue. Although the government has taken steps to provide primary education in the mother tongue of five ethnic groups, it is not yet working effectively due to various complexities.
The absence of a script poses a major challenge, as most of these languages lack a writing system of their own. Even where scripts exist, the languages have not flourished because their use remains limited. To preserve the languages of ethnic minority groups, it is essential to develop grammar structures and dictionaries for the most commonly spoken ones and make them accessible both online and offline. Collaboration between private organizations and the government could make the digitization of these languages more sustainable.
The project aims to research and document these marginalized languages digitally by creating fonts and keyboards for online use, helping to ensure their survival. This will also enable speakers to use their native languages on digital platforms and strengthen their sense of identity.
The features of the website

Multilingual Cloud – Digital Resources for Languages of Bangladesh. Screenshot of the main page. Fair use.
The Multilingual Cloud website features a map showing where each language is spoken in Bangladesh. It also provides detailed information on preservation efforts, including the number of recorded sentences, contributing speakers, and the amount of data collected. Each language has its own dedicated dictionary section.
This language preservation initiative has generated a lot of interest among Indigenous users. Samar Soren, who has worked with the Santali language for a long time, told Global Voices over the phone:
বাংলাদেশের বহুভাষিক বৈচিত্র্য সংরক্ষণ ও প্রসারে multiling.cloud একটি গুরুত্বপূর্ণ ডিজিটাল রিপোজিটরি। এখানে বিভিন্ন আদিবাসী ভাষার বিভিন্ন ডোমেইনের নমুনা ভয়েস সংরক্ষিত করা হয়েছে, যা ভাষার পরিধি বা ল্যাংগুয়েজ স্পেস ও ব্যবহার বাড়াতে সহায়তা করবে। শুধু গবেষক ও ভাষাকর্মীরাই নন, সাধারণ মানুষও এই প্ল্যাটফর্ম ব্যবহার করে দেশের যেকোনো আদিবাসী ভাষায় ন্যূনতম প্রাথমিক কথোপকথন শিখতে পারবেন।
Multiling.cloud is an important digital repository for the preservation and promotion of Bangladesh's multilingual diversity. It has preserved sample voices from different domains of various Indigenous languages, which will help expand the language space and its usage. Not only researchers and language activists but also ordinary people can use this platform to learn basic conversational skills in any Indigenous language of the country.







