At Google I/O Connect in Bengaluru, Google expanded access to its multimodal AI model Gemini 1.5 Pro and its family of open models, Gemma 2, for Indian developers.
The new two-million-token context window on Gemini 1.5 Pro, previously limited, is now accessible in India. Even at one million tokens, users can analyse extensive data, including up to one hour of video or 11 hours of audio.
Explaining the models’ business use cases and what Indian developers are looking for, Chen Goldberg, VP and GM of Google Cloud Runtimes, told AIM, “We’re talking with them about how they can scale—scale with their customers, their business, and their teams and also how they can run more efficiently.
“Our customers in India are critical for us. We expect to see a lot of innovation in AI coming from the local market,” she added.
Additionally, the newly released Gemma 2 models, available with nine billion and 27 billion parameters, claim to offer improved performance and safety protocols. Optimised by NVIDIA, these models run efficiently on next-gen GPUs and a single TPU host in Vertex AI.
Boosting India’s GenAI Space
The availability of Gemma in India is likely to accelerate the surge of foundational models in Indian languages. Many developers in India prefer Gemma over other open-source models like Meta’s Llama for building Indic LLMs.
Gemma’s tokenizer is particularly effective for creating multilingual solutions, as demonstrated by Navarasa, a multilingual variant of Gemma for Indic languages. At Google I/O California, the company highlighted the success of this project, developed by Telugu LLM Labs founded by Ravi Theja Desetty and Ramsri Goutham Golla. It is accessible in 15 Indic languages.
“In India, there are two main areas of focus. The first is addressing language-related issues. The second involves large-scale transformations across various industries, be it in customer engagement or addressing the broader needs of the Indian population,” Subram Natarajan, director of customer engineering and field CTO at Google Cloud, told AIM at the event.
Previously, Vivek Raghavan, co-founder of Sarvam AI, also told AIM that Gemma’s tokenizer gives it an advantage over Llama when it comes to Indic languages. He explained that the tokenization tax for Indic languages means asking the same question in Hindi costs three times more tokens than in English, and even more for languages like Odia, owing to their underrepresentation in these models.
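The tokenization tax Raghavan describes can be sketched with a toy calculation. The snippet below is not Gemma’s or Llama’s actual tokenizer; it uses UTF-8 byte counts as a rough worst-case proxy for a byte-fallback tokenizer, and the example sentences are our own. Because Devanagari characters take three bytes each in UTF-8, a tokenizer that falls back to bytes for an underrepresented script consumes far more tokens for Hindi than for equivalent English text.

```python
# Toy illustration of the "tokenization tax" (not any model's real tokenizer).
# Tokenizers trained mostly on English often split text in underrepresented
# scripts into byte-level pieces. Worst case, every UTF-8 byte is one token.

def byte_fallback_tokens(text: str) -> int:
    """Worst-case token count: one token per UTF-8 byte."""
    return len(text.encode("utf-8"))

english = "What is the capital of India?"   # ASCII: 1 byte per character
hindi = "भारत की राजधानी क्या है?"              # Devanagari: 3 bytes per character

en_tokens = byte_fallback_tokens(english)
hi_tokens = byte_fallback_tokens(hindi)
print(f"English: {en_tokens} tokens, Hindi: {hi_tokens} tokens, "
      f"ratio: {hi_tokens / en_tokens:.1f}x")
```

Real ratios depend on each model’s vocabulary; a tokenizer trained with substantial Indic-language data, as Gemma’s reportedly is, splits the Hindi sentence into far fewer pieces and narrows this gap.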
Today, the company also unveiled IndicGenBench to evaluate the generative capabilities of Indic LLMs, covering 29 languages, including several Indian languages without existing benchmarks.
Going ahead, the company will continue to focus on investments in the developer community and partnerships.
“These are crucial for scaling our operations. We understand that these elements are essential not just for our success but for the broader public’s benefit,” concluded Natarajan.