“While most of us have an Aadhaar card, have used UPI, and may have heard of AA, OCEN, or NDHM, there is one new Digital Public Infrastructure (DPI) that is quickly gaining traction: Bhashini,” posted Rahul Mathur, associate vice president of DeVC, on X.
Bhashini translated finance minister Nirmala Sitharaman’s Union Budget 2024 speech into Hindi in real time.
“UPI is the DPI which we use the most (at max a few times per day). But, we use language MUCH more than we use payments. Bhashini can be used even more than UPI if it is successfully adopted across consumer (B2C) and business (B2B) use-cases,” Mathur added.
Most recently, as part of Project Vaani, led by the ministry of electronics and information technology, Bhashini, in partnership with the Indian Institute of Science (IISc) and AI and Robotics Technology Park (ARTPARK), announced plans to open-source 16,000 hours of spontaneous speech data from 80 districts.
Launched by Prime Minister Narendra Modi on July 29, 2022, Bhashini is a key component of India’s National Language Translation Mission (NLTM). It focuses on developing and providing tools and resources for language technologies, including speech recognition, translation, and natural language processing.
The goal is to enhance accessibility and inclusivity for users across different languages and regions in India. “We are a government service built under government funding,” said Bhashini chief Amitabh Nag in a previous interaction with AIM, emphasising the non-political nature of the initiative.
“We should have a product like Bhashini in France,” said Hélène Quintin, the CEO of the KERU Project, who visited the Digital India Experience Zone at the 46th World Heritage Committee Meeting at Bharat Mandapam.
Startups’ Favorite
On his recent visit to India, OpenAI vice president Srinivas Narayanan praised Bhashini for its efforts in addressing language barriers during the inaugural ceremony at the Global India AI Summit. “Language barriers are one of the challenges for large language models, and this is an area we have been working on. We want to applaud the government’s efforts with Bhashini,” he said.
He also suggested that OpenAI might explore a partnership with Bhashini to access Indic data. OpenAI is expanding its presence in India and recently hired Pragya Misra as its first employee in the country. Misra will serve as the head of government relations to engage with the Indian government and create a conducive environment for OpenAI to operate in India smoothly.
In India, several startups are using datasets created by Bhashini. For instance, Bengaluru-based startup KOGO AI is building multilingual AI agents in partnership with Bhashini.
Bhashini launched a crowdsourcing initiative called Bhasha Daan to collect voice and text data in multiple Indian languages, where anyone can contribute. In collaboration with Nasscom, it also launched the ‘Be our Sahayogi’ program on National Technology Day to crowdsource multilingual AI problem statements.
A few months ago, Snapdeal signed a Memorandum of Understanding (MoU) with Bhashini to develop products and services that enhance language translation capabilities, promoting digital inclusivity throughout India.
The Indian Railways recently upgraded its RailMadad grievance portal, incorporating advanced multilingual capabilities and cutting-edge text and speech recognition features powered by Bhashini. These enhancements enable travelers to easily inquire about tickets, track bookings, and resolve grievances. IRCTC’s chatbot, AskDISHA, developed by Corover.ai, is also using Bhashini.
Voice is the New Interface
The Reserve Bank of India (RBI) is considering integrating voice commands into the Unified Payments Interface (UPI), leveraging Bhashini’s capabilities. The initiative would enable users to make payments using a voice interface in both English and Hindi, with more languages planned for future inclusion.
“If you can do voice-based payment, UPI will go through the roof,” said Pramod Varma, chief architect of Aadhaar, in an exclusive interview with AIM.
Earlier this year, AI4Bharat released the IndicVoices dataset, which includes 7,348 hours of natural and spontaneous speech from 16,237 speakers across 145 Indian districts and 22 languages. This project was funded by Bhashini.
Last year, Bhashini released IndicTrans2, an open-source, Transformer-based multilingual neural machine translation (NMT) model that facilitates high-quality translation across all 22 scheduled Indian languages.
In India, several AI startups believe that voice will become the primary interface for interacting with computers. Earlier this year, during his visit to India, Microsoft chief Satya Nadella announced a partnership with Sarvam AI, which is developing an Indic voice LLM set to be released in the coming months.
“We believe that in India, people will experience generative AI through the medium of voice,” said Vivek Raghavan, the founder of Sarvam AI, in an exclusive interview with AIM. He noted that inputting text in Indian languages can be challenging and that people in India generally prefer voice communication over text.
Similarly, Varma believes that India’s future is voice-based AI. “Indian entrepreneurs should really look at voice as a completely new human-computer interaction method. It could be very powerful, and I think it’s going to happen because voice is natural to humans,” he said.
What Bhashini Offers
Bhashini offers a range of services, with APIs available on the National Platform for Language Technology. These APIs provide services such as machine translation, speech-to-text, and text-to-speech. “This language technology hub provides modern service APIs to various enterprises. It receives about 100,000 hits per day, and that is where customers might have their own applications,” said Nag.
“The second [aspect] is the Bhashini app, which is targeted towards end-users looking to translate either text or have voice-to-voice discussions,” he added.
Bhashini has been consistently adding new features to its app. Recently, it introduced the Next-Gen Multilingual Audiobook Converter, which effortlessly converts diverse texts into audiobooks, offering users a personalized, accessible, and language barrier-free reading experience.
The app also has an OCR feature called SCENE, currently in beta, which lets users extract text from images, along with a Browse feature for translating websites.
“Additionally, we have another web service called Anuvaad, which handles translation,” said Nag. Anuvaad, much like Google Translate, is a web and mobile application that enables users to translate text between 22 Indian languages and English.
Data Collection and Annotation
Bhashini has approximately 200 translators working in the field and collecting digital data. “We have established a straight integrated pipeline in one of our research institutes at IIT Madras, in collaboration with AI4Bharat. Here, data is collected, curated, annotated, and labeled to train AI models. We also have a specially funded mechanism to create the digital data,” said Nag.
Moreover, Nag told AIM that Bhashini is collaborating with Karya, one of the world’s first data cooperatives that offers labelling and annotation services. Karya is known for constructing datasets for firms like Microsoft and Google, which are used in AI models for education, healthcare, and other services.
“We lack digital data on low-resource languages, such as Bodo or Sindhi. For these languages, we approach individuals proficient in both Bodo and English,” he said, explaining that they help create a training dataset by providing parallel text in English corresponding to the content written in Bodo.
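For readers curious what such a parallel dataset might look like in practice, here is a minimal sketch in Python. It is an illustration only, not Bhashini's actual format or tooling: the `ParallelPair` record, the `build_pairs` and `to_jsonl` helpers, the JSON Lines layout, and the use of the ISO 639-3 codes `brx` (Bodo) and `eng` (English) are all assumptions made for this example, and the sentences are placeholders rather than real data.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ParallelPair:
    """One aligned sentence pair (hypothetical record layout)."""
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str


def build_pairs(source_sentences, target_sentences,
                source_lang="brx", target_lang="eng"):
    """Align two equally long lists of sentences into parallel pairs.

    Pairs where either side is empty after trimming whitespace are skipped,
    since a pair with a missing side is no use for training a translator.
    """
    if len(source_sentences) != len(target_sentences):
        raise ValueError("source and target must have the same number of sentences")
    pairs = []
    for src, tgt in zip(source_sentences, target_sentences):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:
            pairs.append(ParallelPair(source_lang, target_lang, src, tgt))
    return pairs


def to_jsonl(pairs):
    """Serialise pairs as JSON Lines, keeping non-ASCII scripts readable."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


if __name__ == "__main__":
    # Placeholder text only; a bilingual contributor would supply real sentences.
    bodo = ["<Bodo sentence 1>", "<Bodo sentence 2>"]
    english = ["<English translation 1>", "<English translation 2>"]
    print(to_jsonl(build_pairs(bodo, english)))
```

Each output line holds one Bodo sentence alongside its English counterpart, which is the basic unit a machine translation model learns from.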