Tensoic, the creator of Kannada Llama, aka Kan-Llama, recently released a playground to test the model, built in partnership with E2E Networks on NVIDIA A100 infrastructure and with Xylem AI handling inference.
Adarsh Shirawalmath, the brain behind the model, described Xylem AI’s inference stack as mind-blowing.
AIM got in touch with Arko Chattopadhyay, the co-founder and CEO of Xylem AI, to understand the company’s vision, what it offers to help build Indic LLMs, and what sets it apart from others in the field, such as Together AI and Anyscale.
“We started as a search index company for the enterprise, something that Perplexity AI is doing for the web. We wanted to do it for enterprises’ own data,” said Chattopadhyay. The company pivoted three months into the business after realising there was not enough infrastructure to support building LLMs, a gap it set out to fill.
Along with Chattopadhyay, the nine-month-old startup was co-founded by Enrique Ferrao and Pranav Reddy. The three met at Manipal Institute of Technology, where they participated in hackathons and worked on AI models and algorithms for autonomous defence robots. After founding Amigo, a mental healthcare app for enterprises, the founders began working on Xylem AI.
No need for extra engineering effort
Xylem offers a fully managed LLMOps platform designed for teams to efficiently train, deploy, and scale LLMs in production without additional engineering effort. “We want to eliminate the need to hire expensive machine learning engineers for companies to build LLMs,” said Chattopadhyay, emphasising that the platform is built to scale easily for any team.
Highlighting the privacy and security concerns enterprises have with closed-source models such as OpenAI’s GPT and Anthropic’s Claude, Chattopadhyay said that Xylem helps them fine-tune open source models on their own data, which drives enterprise adoption.
“This is also very compute-heavy, which makes it very costly for companies,” he added. “We are building our fine-tuning stack that allows people to build their custom fine-tuned models much faster and cheaper. Our whole idea is that we don’t want to throw GPUs at the problem, but optimise the models to run efficiently on the GPUs,” said Chattopadhyay.
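To illustrate the kind of optimisation Chattopadhyay is describing, here is a minimal sketch of parameter-efficient fine-tuning with LoRA using the Hugging Face PEFT library, which trains small adapter matrices instead of all of a model’s weights. This is an illustrative assumption only: Xylem’s actual fine-tuning stack is not public, and the base model named here is hypothetical.

```python
# Minimal LoRA fine-tuning setup with Hugging Face PEFT -- illustrative only;
# Xylem's actual stack is not public, and the base model choice is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # hypothetical base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA freezes the base weights and trains small low-rank adapter matrices,
# cutting GPU memory and cost -- the kind of efficiency described above.
config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically under 1% of the full model
```

Techniques like this are one common way to make fine-tuning faster and cheaper without simply adding more GPUs.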
The company is also expanding the number of open source models available on the platform and improving inference speed.
Currently, Xylem is scaling on E2E Networks’ managed cloud and running on NVIDIA GPUs, but it also plans to expand to AMD’s latest offerings and is watching for more options coming to market.
The goal is to be one of the fastest inference engines
“The battle to be the fastest inference engine is probably a zero-sum game,” said Chattopadhyay. “You can bring your data and train your model on our platform. We will take care of the backend, and the engineers don’t have to waste any of their time.”
He further added that once a model is trained, it can be easily deployed on Xylem’s platform and kept up to date as newer models come along.
Chattopadhyay claims that Xylem is currently the fastest inference platform in India and is slowly closing in on global competitors such as Together AI and Perplexity AI. “We are currently working to get even faster by optimising our speeds further. Moreover, LLMs will get better and responses will be instantaneous for end users,” he said, adding that the conversation would then shift to easier API integration and a smoother developer experience.
“Building an LLM should not be the moat of the company; the data should be the moat,” said Chattopadhyay, adding that anyone can build an LLM these days, but the point is how you deliver it to the audience. “Since most of them are anyway building on top of other LLMs, there is no need to worry about deploying it to the people.”
“Of course, we are focusing on India as it is our home and also the biggest market, and Indic LLMs are coming up, but our goal is to go global,” he said. The aim for Xylem is to help build LLMs for India while also supporting the global AI ecosystem.
Chattopadhyay further said that the company already has customers in the Middle East and more on the way in Europe, along with Indian companies. He is also highly appreciative of the work that Sarvam AI and Japanese AI startup Sakana AI are doing.
“If you want to deploy a model for 1.5 billion people, you cannot just keep stacking GPUs, as the cost would be very high,” Chattopadhyay concluded, noting that not everyone can focus on every part of the stack for building and deploying LLMs.
That is where Xylem wants to position itself: handling inference, fine-tuning, and deployment of models for others, while making the developer experience easier and faster.