
Is Sarvam AI the OpenAI of India?

Bengaluru-based Sarvam AI looks to create an LLM from scratch, while also using existing open-source models like Mistral, Databricks DBRX, and Meta’s Llama 2

Illustration by Nikhil Kumar

“We’ve just started here, I don’t think we are trying to build the class of models that OpenAI is trying to build with GPT-5. If there are as many people who can use our models as OpenAI, I will be very happy,” said a camera-conscious Vivek Raghavan, the co-founder of Sarvam AI, a startup nestled in Indiranagar, the heart of Bengaluru, mirroring Starbucks. 

“I would be very happy if we are as successful as OpenAI,” said Raghavan, kick-starting an exclusive interview with AIM on a humble note. 

Now, OpenAI too is making its presence felt in India. The San Francisco-based company recently hired Pragya Misra, and is working with former Twitter India head Rishi Jaitly as a senior advisor to oversee the changing regulatory landscape and facilitate talks with the government about AI policy in the country.

The Sam Altman-led company is likely to set up an office in Namma Bengaluru, hopefully not in Indiranagar.

Established in July 2023, Sarvam AI was co-founded by Raghavan and Pratyush Kumar to address the pressing need to make generative AI accessible to everyone in India at scale.

“The intent was actually to leverage GenAI and make people’s lives better. We think this is a foundational technology, and we don’t want India to become solely a prompt engineering nation,” said Raghavan. 

The duo have a strong background in AI, and have previously worked at AI4Bharat, a research initiative based at IIT Madras, which focuses on open-source Indian language AI. Raghavan has over a decade of experience at UIDAI, the entity overseeing the Aadhaar identity system in India. 

Kumar, on the other hand, holds a PhD from ETH Zurich and a BTech from the Indian Institute of Technology Bombay. He also co-founded AI4Bharat, an initiative dedicated to advancing Indian language AI applications.

Last December, the company raised $41 million in its Series A funding round led by Lightspeed Ventures with participation from Peak XV Partners and Khosla Ventures. “The fact that we have been able to raise some money is actually a responsibility. It’s not an indicator of success,” said Raghavan, adding that a significant portion of the amount has been invested in compute. He said that the company does not plan to raise more funds in the near future.

Small Team, Big Impact

Fifty-five-year-old Raghavan is young at heart and open to new ideas. He is currently leading a lean team of 25 members, in the age group of 25-30. 

“Frankly, I never thought that I would be doing this at this age, but I’m very excited that I am,” said Raghavan. He shared that he got interested in deep learning 7-8 years ago while working for Aadhaar, where he worked with Transformer-based models.

“At this stage, our team consists of about 25 people, and we don’t plan on growing too large, maybe 30-40 at most,” Raghavan said about Sarvam AI’s team size. 

Moreover, he believes there is no dearth of AI talent in India. “If you look at generative AI papers globally, you will see that a significant percentage of them are authored by Indians anyway,” he said.

OpenAI, on the other hand, is run by a bunch of ‘oldies’, with an average age of 35-40, and currently has about 500 employees. A majority of AI engineers at OpenAI happen to be Indians, with the country being the second largest market for them after the US.  

OpenAI vs Sarvam AI 

While OpenAI and other startups in the West are targeting AGI, Raghavan isn’t even thinking about it. “I don’t think much about AGI and those kinds of things. I think about how human lives can get easier,” he said.

He added that the countries that use generative AI are the ones that will benefit the most.

Meanwhile, when Altman visited India, he didn’t mention even once that OpenAI would build models for India. Disappointingly, no substantial discussions took place regarding the establishment of an office, fostering local talent and startups, or developing future models for Indian languages and use cases.

While Sarvam AI is leveraging its advantage with Indic datasets and building for India, OpenAI is still figuring out the legal aspects and complex language dynamics in the country. Sarvam AI is definitely a step ahead. 

Last year, Sarvam AI open-sourced OpenHathi, an Indic Hindi LLM built on top of Llama 2. Raghavan wasn’t sure about the exact number of downloads for the model. He explained, “It was more about offering something for users in the ecosystem to play and experiment with.”

On Hugging Face, the model was downloaded more than 18,000 times last month.

Sarvam AI recently open-sourced ‘Samvaad’, a curated dataset with 100,000 high-quality conversations in English, Hindi, and Hinglish, totalling over 700,000 turns.

He said that the company’s goal is to create an LLM from scratch, while also using existing open-source models like Mistral, Databricks DBRX, and Meta’s Llama 2. “We’re going to experiment with all the open models available. From what we’ve seen, Meta’s latest release, Llama 3, looks quite good,” he said. 

Meta is also working with Sarvam AI to build vernacular LLMs. 

Moreover, Raghavan believes that to be a successful generative AI startup you do not necessarily need to build LLMs from scratch. “In the end, the test is going to be about who is building things that are useful for the market and actually moving generative AI forward in India,” he said.

Meanwhile, Soket Labs AI became the first Indian AI startup to focus on building solutions to achieve AGI and beyond, alongside building small foundational models from scratch for enterprises and consumers. 

Recently, SML’s Hanooman also unveiled the alpha version of a ChatGPT-like platform for Indian consumers, with extensive support for various Indian languages. The company is also looking to build search capabilities in the coming months (similar to Perplexity AI). Surprisingly, the Hanooman chatbot is way better than Ola Krutrim.

Ironically, there are now three OpenAIs in India. 

How is Sarvam AI Different?

Earlier this year, during his visit to India, Microsoft chief Satya Nadella announced a partnership with Sarvam AI, which is looking to build an Indic voice LLM. The team said that this would be released in the coming months.

“We believe that in India, people will experience generative AI through the medium of voice,” said Raghavan.  

He added that it is very hard to input text in Indian languages and that in India, people tend to prefer voice communication over text. With Ola Krutrim and SML’s Hanooman also building LLMs from scratch, Raghavan said that the voice interface is going to give Sarvam AI the edge. 

“We want people to do things through voice and that will be the USP of Sarvam AI,” he said.

Further, he said that Sarvam AI will be building agentic systems, allowing users to not only receive information but also take action. “I hope in the next few months we’ll see some of these things being announced and released in the marketplace,” he said. 

He highlighted that the preference for voice has numerous practical applications in the country, such as in customer support and gathering feedback, where voice-based models can efficiently handle large-scale feedback listening. “We will support 10 languages and hopefully over time we will expand it even further from that,” he said.

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.