Latest Articles and Breakthrough Stories in Artificial Intelligence https://analyticsindiamag.com/ai-breakthroughs/

These Ex-Apple Employees from India are Building the Foundation Model for Robotics https://analyticsindiamag.com/ai-breakthroughs/these-ex-apple-employees-from-india-are-building-the-foundation-model-for-robotics/ Tue, 03 Sep 2024 05:16:35 +0000

Backed by Khosla Ventures, Lockheed Martin Ventures and others, the startup is developing safe, affordable, and intelligent robots.

The post These Ex-Apple Employees from India are Building the Foundation Model for Robotics appeared first on AIM.


The next wave of AI is likely to be physical, or embodied, AI. A number of startups are building this technology today, including California-based Vayu Robotics, which is betting that safety is the fastest route to autonomy.

Safety, Cost, and Autonomy

“If autonomy is ever to make it into the world and gain trust and credibility among the people, then it has to start at a very safe spot. So the biggest knob that we have to dial up is the safety, [by] reducing the mass and speed of the robot,” said Mahesh Krishnamurthi, the co-founder and chief product officer of Vayu Robotics, in an exclusive interaction with AIM.  

“The kinetic energy of our robot is almost 1,000 times lower than the kinetic energy of a truck going on the road. So, by reducing the kinetic energy and the momentum of the robot, we inherently get very high safety,” he said. 
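Whatever the exact figure, the physics favours small, slow robots: kinetic energy scales linearly with mass but with the square of speed. A quick back-of-the-envelope comparison, using assumed figures rather than Vayu's own numbers:

```python
# Kinetic energy: KE = 1/2 * m * v^2.
# The masses and speeds below are illustrative assumptions, not Vayu's data.
def kinetic_energy(mass_kg, speed_ms):
    return 0.5 * mass_kg * speed_ms ** 2

robot_ke = kinetic_energy(50, 5)       # ~50 kg robot at bicycle speed (~18 km/h)
truck_ke = kinetic_energy(10_000, 20)  # ~10-tonne truck at ~72 km/h

print(f"robot: {robot_ke:.0f} J, truck: {truck_ke:.0f} J, "
      f"ratio: {truck_ke / robot_ke:.0f}x")
```

Under these assumptions the truck carries 3,200 times the robot's kinetic energy, which is the intuition behind the safety claim.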

Addressing safety is one aspect, but another important focus of the startup is developing new sensors that are both low-cost and high-quality.

“So, we thought, maybe we could take a bet on this sensor and say that maybe in a few years, we would be able to build a sensor that is super low-cost, but very high in quality. That would bridge the gap between a very expensive LiDAR system, which is very high quality, and an inexpensive camera system that is not as safe as a LiDAR. 

“We tried to solve this problem and came up with the technological breakthrough,” said Krishnamurthi. 

Vayu Robotics is focusing on developing an intelligent drive agent that requires less capital to build by using simulated data for training, unlike Tesla’s approach, which relies on real-world data. 

Krishnamurthi believes that in recent years, simulators have advanced to the point where their output is nearly indistinguishable from reality, allowing the team to develop driving behaviours in simulation and transfer them to the real world.

“So that’s another technology breakthrough that has enabled us to build a product like a delivery robot driving like a bicyclist for less than $5,000,” he added. 

The startup has developed six specialised simulators, each tailored to a specific environment. These include one for indoor environments, one for outdoor settings, another for bike lanes, and another for highways. 

Additionally, they have a dedicated simulator purely for validation, which is kept separate from the training process. 

Vayu and Strength

Inspired by the Sanskrit word ‘vayu’, which the company associates with the intelligence behind all motion and energy, Vayu Robotics is building technology to enable autonomous mobile robots to move through the world. The founding team also brings decades of experience in autonomous systems. 

Krishnamurthi has a deep-tech background, beginning at Intel Labs in 2008 and moving on to Apple, where he worked on autonomous systems. Specialising in optoelectronics and LiDAR technology, he later joined Lyft. 

He eventually co-founded Vayu Robotics in 2022 with his friends Anand Gopalan, who was the CEO of Velodyne Lidar, and Nitish Srivastava, a former colleague at Apple. 

“Since we have collectively worked on this problem for about 25 years, we kind of had an insight on what works, what doesn’t, and what just might,” said Krishnamurthi. 

The startup closed its seed financing round in October last year with backing from some of the biggest VC players, including Khosla Ventures, Lockheed Martin Ventures, ReMY Investors and others. 

“Khosla himself is one of the big backers of this vision. He loves the idea and he is the best backer we could have as a startup. A true visionary of what the world needs to look like. We are fortunate to have him,” said Krishnamurthi. 

Over the next few years, the startup is focused on commercialising and scaling its technology, with a contract to deploy up to 2,500 robots across multiple US cities in phases. It will be a scaled ramp-up with set milestones. 

The startup will also offer its sensor as a standalone product. “We will also try to have some ancillary applications for just the sensor,” he said. 

While a number of robotics startups are emerging in this space, Krishnamurthi doesn’t perceive them as competition, but rather as a combined effort to solve a common problem. 

“I do feel like it’s a good time for the industry and people in general because more people working on a problem increases the probability of it getting solved,” he said. 

Without revealing specifics, Krishnamurthi mentioned that their primary customer is a multi-billion-dollar e-commerce company, while a major fan company that has expressed interest in their sensing product is another key customer.

Reliance Jio Free 100 GB AI Cloud Storage to Take on Google Cloud https://analyticsindiamag.com/ai-breakthroughs/reliance-knows-how-to-scale-ai-in-india/ Fri, 30 Aug 2024 13:32:02 +0000

Reliance Jio’s massive user base is one of the most valuable assets in its AI journey.

The post Reliance Jio Free 100 GB AI Cloud Storage to Take on Google Cloud appeared first on AIM.


A ‘Jio moment’ in AI was long overdue – and Mukesh Ambani has finally delivered it! At its 47th Annual General Meeting (AGM), Reliance Industries unveiled a series of AI initiatives. 

Major highlights included the introduction of Jio Brain, Jio AI-Cloud, Jio Phone Call AI, and the vision for a national AI infrastructure. 

Ambani, the RIL chairman and managing director, emphasised AI’s central role in the company’s future, showcasing innovations like Jio Brain, an AI service platform that will enhance operations across all Reliance enterprises. 

Jio also introduced an AI-powered feature designed to transcribe, summarise, and translate phone conversations in real time. 

To support these and other AI-driven initiatives, Reliance announced the establishment of gigawatt-scale AI data centres in Jamnagar, Gujarat, powered by green energy. These centres will be part of a broader plan to create AI inference facilities nationwide, ready to meet India’s growing demand for AI capabilities.

Reliance Jio’s massive user base is one of the most valuable assets in its AI journey. Jio has over 490 million customers, each consuming an average of 30 GB of data monthly. This contributes to Jio carrying a staggering 8% of global data traffic.

“Thanks to Jio, India is now the world’s largest data market,” said Ambani.

The immense data pool provides Jio with unparalleled insights and the ability to continuously refine and scale its AI solutions, ensuring they are both effective and widely applicable.

How Reliance Jio is Scaling AI in India

Unlike many AI-focused companies such as OpenAI, which primarily cater to B2B markets, Jio is targeting the consumer market with its AI products and services. Meta has adopted a similar strategy, making AI available across its social media platforms and WhatsApp and taking it directly to consumers.

Leveraging its deep understanding of consumer needs across sectors like healthcare, telecommunications, manufacturing, and customer services, Reliance is tailoring AI solutions that directly impact the working environment of its customers. 

This B2C approach sets Jio apart from global competitors, positioning it as a leader in consumer-centric AI innovations. With Jio AI, Ambani is bringing AI directly to its 490 million Jio customers.

Jio 100 GB Free AI Cloud Storage to Take On Google

Interestingly, Jio’s entry into the AI cloud market is a direct challenge to established players like Google Cloud. As part of its AI strategy, Jio introduced the Jio AI-Cloud Welcome Offer, which provides up to 100 GB of free cloud storage for Jio users starting this Diwali. 

In contrast, Google’s comparable consumer storage plans (sold as Google One) currently start at 100 GB for INR 130 per month, with more extensive tiers like 200 GB at INR 210 per month and 2 TB at INR 650 per month. Google also offers an AI Premium plan with 2 TB of storage and access to advanced AI models like Gemini Advanced. 

Jio’s free cloud offering is a strong counter to these paid plans, positioning it as a cost-effective alternative in the market. The aggressive pricing mirrors Jio’s launch in the telecom sector, where free services were initially offered to attract a massive user base. 

When Jio launched its 4G services, it offered free data and calls, which quickly attracted millions of users. This approach helped Jio capture over 100 million mobile subscribers within just 170 days, with data consumption on its network surpassing that of the US and doubling that of China.

So, just as Jio revolutionised the telecom industry, Jio Cloud aims to disrupt the cloud market by making AI services more affordable and accessible to the masses.

Jio Could be the Next Big Hyperscaler

Jio is also targeting the B2B cloud segment with its inference facilities. In 2023, Reliance Industries Limited announced a partnership with US-based chipmaker NVIDIA to advance AI in India. This would put Jio in direct competition with the likes of AWS, Google Cloud and Microsoft Azure.

The collaboration aims to build AI infrastructure in India that the partners claim will be more powerful than the fastest supercomputer in the country. Further, NVIDIA said it would provide Reliance with access to its GH200 Grace Hopper Superchip and NVIDIA DGX Cloud for exceptional performance.

However, Jio is not alone in the market. Earlier this year, Yotta Data Services received the first tranche of 4,000 GPUs from NVIDIA. It further plans to scale up its GPU stable to 32,768 units by the end of 2025. And in 2023, Tata Communications entered into a partnership with NVIDIA to develop a similar hyperscale infrastructure.

According to sources, the upcoming NVIDIA India Summit is poised to reveal an expansion of the chipmaker’s collaboration with Reliance. Speculation centres on the introduction of the Blackwell GPU and the NIM platform, with potential plans for employee training and upskilling in AI.

Interestingly, NVIDIA has also recently collaborated with Infosys to bring AI-driven, customer-centric solutions to the telecommunications industry. The question now is, who’s the telecom giant set to benefit from this AI revolution?

This Bengaluru-based AI Startup Knows How to Make Your Videos Viral https://analyticsindiamag.com/ai-breakthroughs/this-bengaluru-based-ai-startup-knows-how-to-make-your-videos-viral/ Fri, 30 Aug 2024 10:30:00 +0000

The team has built a metric called "virality score", which is derived from a dataset of 100,000 social media videos.

The post This Bengaluru-based AI Startup Knows How to Make Your Videos Viral appeared first on AIM.


Editing videos can be a tedious and time-consuming task, taking video editors hours and days to get their footage ready for release. Moreover, hiring a team of video editors is not what every content creator or small company wants to invest in. This is where vidyo.ai comes into the picture. 

Launched two years ago by Vedant Maheshwari and Kushagra Pandya, the platform has experienced remarkable growth, scaling from zero to approximately 300,000 monthly active users and achieving a revenue milestone of $2 million. Notably, a significant portion of vidyo.ai’s revenue, about 85%, comes from the US market.

Most recently, the company was part of the Google for Startups Accelerator programme. The company hasn’t raised any funding since its seed round of $1.1 million in 2022.

The team has made significant strides in addressing one of the industry’s most persistent challenges: video editing.

Maheshwari’s journey into the realm of video content and social media began over eight years ago, during which he collaborated with creators and influencers to refine their content strategies across platforms like YouTube, TikTok, and Instagram. 

It was during this period that Maheshwari identified a major pain point: the time-consuming and complex nature of video editing.

This insight led to the creation of vidyo.ai, a platform designed to streamline the video editing process. The vision was to leverage AI to handle 80-90% of the editing, leaving users with the flexibility to make final adjustments before sharing their content on social media.

The platform caters to a diverse user base, including content creators, podcasters, and businesses seeking to generate short-form content with minimal effort. “We essentially enable them to let the AI edit their videos, and then they can publish directly to all social media platforms using our software,” Maheshwari added.

How vidyo.ai Works

vidyo.ai combines OpenAI’s models with proprietary algorithms to transform raw video footage into polished content. Users upload their videos to the platform, which then processes the content through a series of OpenAI prompts and proprietary data. This includes analysing what kind of videos perform well online, identifying potential hooks, and determining effective calls-to-action (CTAs).

“We run the video through multiple pipelines, identifying key hooks and combining them to create a final video. Our algorithms then score these videos based on their potential virality,” Maheshwari elaborated on the process. This “virality score” is derived from a dataset of 100,000 social media videos, allowing the platform to suggest the most promising clips for engagement.
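Maheshwari didn't detail the scoring model, so it can only be illustrated generically. As a purely hypothetical sketch, a clip-ranking step might combine weighted engagement features into a single score; the feature names and weights below are invented for the illustration, not vidyo.ai's actual algorithm:

```python
# Hypothetical clip-ranking sketch — feature names and weights are
# illustrative assumptions, not vidyo.ai's proprietary model.
def virality_score(clip):
    weights = {"hook_strength": 0.4, "watch_time_ratio": 0.35, "has_cta": 0.25}
    # Weighted sum of normalised (0-1) features, scaled to a 0-100 score.
    score = sum(weights[k] * clip.get(k, 0.0) for k in weights)
    return round(100 * score, 1)

clips = [
    {"hook_strength": 0.9, "watch_time_ratio": 0.7, "has_cta": 1.0},
    {"hook_strength": 0.3, "watch_time_ratio": 0.4, "has_cta": 0.0},
]
# Rank candidate clips so the most promising are suggested first.
ranked = sorted(clips, key=virality_score, reverse=True)
```

The real system presumably derives its weights from the 100,000-video dataset rather than hand-picking them, but the ranking step would look broadly similar.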

When compared to other video editing tools like GoPro’s Quik and Magisto, vidyo.ai distinguishes itself with its frame-by-frame analysis of both video and audio content. Unlike these platforms, which often edit videos based on mood or music, vidyo.ai dives deeper into the content to optimise for social media performance.

“We do a comprehensive analysis of the content, ensuring that every aspect is optimised for virality,” Maheshwari said. This level of detail, combined with the ability to publish directly across multiple platforms, provides users with a unique advantage.

Challenges and Opportunities

Despite its success, vidyo.ai faces challenges common to Indian startups, particularly in securing funding. Maheshwari noted that while Indian VCs are cautious about investing in AI, preferring application-layer solutions over foundational work, US VCs often have a more aggressive approach.

“We’ve gone from zero to $2Mn in ARR in less than two years, which is remarkable. However, raising subsequent rounds of funding in India remains challenging due to a lack of clarity on how AI investments will pay off,” Maheshwari explained, adding that US VCs would be ready to invest on that metric alone.

He also reflected on the possibility of starting the company in the US instead of India, citing potential benefits in terms of ease of operations and investor interest. “It often feels like running a company in India comes with more challenges compared to the US,” he admitted.

On the question of a moat, vidyo.ai faces the risk of Instagram, LinkedIn or TikTok releasing a similar feature in their base apps. “There is definitely a little bit of platform risk,” Maheshwari conceded, but he believes customers are unlikely to shift, since they don’t want to restrict themselves to the workflow of a single creation platform. 

Comparing the ambition to building something like Canva, Maheshwari said vidyo.ai plans to expand its offerings, including the potential integration of generative AI features like deepfakes and avatars. The team is also building an AI-based social media calendar, which would suggest the content likely to perform best in the coming week.

Maheshwari envisions building a comprehensive suite of tools for social media creation and publishing. “Our goal is to develop a full-stack solution that encompasses every aspect of social media content creation,” he said.

AI Can Never Build Software on its Own https://analyticsindiamag.com/ai-breakthroughs/ai-can-never-build-software-on-its-own/ Thu, 29 Aug 2024 06:34:47 +0000

Experts disagree.

The post AI Can Never Build Software on its Own  appeared first on AIM.


“Techies are claiming that AI can never build software on its own! That is blatantly false!” said Abacus AI chief Bindu Reddy, saying it already creates reliable test scripts, simple websites, and chatbots without any technical assistance. 

She anticipates that in the future, it will be able to do even more. 

Code-editing platforms like Cursor, AI agents like Cognition Labs’ Devin and Cosine’s Genie, and code assistant tools like OpenAI’s ChatGPT, Amazon Q, and GitHub Copilot are all helping developers generate code snippets from natural language descriptions, streamlining the coding process and reducing errors. 

Amazon CEO Andy Jassy recently revealed that by leveraging Amazon Q, the company was able to save 4,500 developer-years of work. “Yes, the number is crazy, but real,” he posted on X.

In the realm of testing, AI-powered platforms such as Applitools and Testim automate the creation and execution of test cases, effectively mimicking human cognitive functions to identify UI discrepancies and performance issues. 

Looking ahead, as AI continues to evolve, its potential will expand even further. Future developments may enable AI to tackle more intricate software design tasks, optimise code for performance, and even contribute to user experience design. 


This evolution signifies a transformative shift in software development, where AI not only assists but actively participates in the creation of software solutions. GitHub Copilot Workspace, currently in technical preview, is designed to do exactly that. 

However, can AI write software on its own? It remains to be seen.

Limitations of AI

Many tech professionals assert that AI cannot independently develop software due to its reliance on human input and oversight. While AI can assist in software development through automation and optimisation, it fundamentally lacks the ability to autonomously create complex software systems without human intervention.

Discussing its dependence on human expertise, a user on X remarked, “It’s not ready.”

The user asked both Claude and ChatGPT to create a roulette game using HTML, CSS, and JavaScript—both attempts fell short. Claude’s version came closer, but only after extensive debugging. 

While both understood the requirements, they struggled with implementing the spin function, numeric display, and alignment. In response, Reddy said, “You can’t simply ask an LLM—you need to give it detailed instructions and build an entire agentic system around it!”

Getting useful results from an LLM for such tasks requires extensive effort. Despite the model’s ability to produce code, the quality was often inadequate. After multiple iterations, the roulette game generated by ChatGPT still fell short, producing results that a junior programmer could easily outperform. 

It was noted that if an entire agentic system is necessary to support such requests, it might be more practical to write the code directly. This situation underscores the need for the technology to mature.
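For scale, the core spin logic the models reportedly fumbled is only a few lines when written directly. Here is a minimal sketch of a European-wheel spin, written in Python for illustration (the original experiment asked for HTML, CSS, and JavaScript):

```python
import random

# Minimal European-roulette spin: 37 pockets (0-36), with 0 as green.
# A generic sketch, not the code from the experiment described above.
RED = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}

def spin(rng=random):
    pocket = rng.randint(0, 36)  # randint bounds are inclusive
    colour = "green" if pocket == 0 else ("red" if pocket in RED else "black")
    return pocket, colour

print(spin(random.Random(7)))
```

The numeric display and wheel alignment the models struggled with are front-end concerns layered on top of logic this simple, which is part of why the failures surprised observers.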

Despite all this, AI systems still require human expertise to define problems, gather and clean data, select appropriate algorithms, and train models. This process involves significant human judgement and creativity, which AI currently cannot replicate. For example, building an AI model requires understanding specific business needs and ensuring that the data used is relevant and accurate.

Moreover, AI operates within the parameters set by its developers. It can automate repetitive tasks and optimise certain processes, but the conceptualisation of software—such as user experience design, feature prioritisation, and strategic planning—remains a distinctly human domain. 

AI can enhance productivity and streamline workflows, yet it cannot independently navigate the complexities of software architecture or adapt to unforeseen challenges without human guidance. That said, what Reddy envisions might come true someday; the technology just isn’t there yet. 

Is the Demand for CS Graduates Going Down?

The demand for computer science professionals is experiencing a notable decline, influenced by various market dynamics and technological advancements. 

“This doesn’t mean that human engineers will disappear – we will still have human experts and supervisors. It just means that the demand for CS graduates will come down over time. We are already experiencing some of this today!” said Reddy. 

Reddy’s prediction, however, remains uncertain. Currently, the job market shows a noticeable oversaturation of CS graduates, with many recent graduates struggling to secure positions and facing heightened competition for available roles. 

As enrollment in CS programs continues to rise, the number of graduates may exceed job creation, resulting in fewer opportunities for new entrants, according to a Reddit post.

This reflects a broader market correction after a period of aggressive hiring during the pandemic, driven by inflated valuations and unsustainable growth strategies.

Additionally, the rise of AI tools is reshaping job roles within the tech industry. 

Despite these challenges, the overall need for qualified CS talent remains, particularly in medium to large companies that continue to seek skilled professionals to navigate the evolving tech landscape.

Moreover, software engineers need to upskill faster than others and must continually do so to stay competitive. While experience is valuable, staying updated is essential for maintaining an edge in the industry.

India’s Rapid Adoption of AI is Transforming Audit Services https://analyticsindiamag.com/ai-breakthroughs/indias-rapid-adoption-of-ai-is-transforming-audit-services/ Wed, 28 Aug 2024 10:34:29 +0000

A Chennai-based startup, Fhero Accounting, has leveraged AI to automate repetitive tasks and streamline data entry by automatically capturing and categorising financial transactions.

The post India’s Rapid Adoption of AI is Transforming Audit Services appeared first on AIM.


As AI sweeps through practically every industry, audit and accounting are getting their due makeovers. However, not all firms are keeping pace with its adoption. 

According to the 2024 Generative AI in Professional Services report from the Thomson Reuters Institute, 30% of tax and accounting firms are still weighing the decision to adopt GenAI tools, while nearly half (49%) aren’t planning to use them anytime soon. 

However, a few big accounting firms have led the way with AI adoption.

Ernst & Young (EY), one of the largest accounting firms globally, has introduced AI into their audit services, deploying a powerful tool that reviews and analyses contracts and documents with a speed and precision that human efforts simply can’t match. 

KPMG has also embraced the AI revolution with its innovative platform, KPMG Ignite. This tech doesn’t just sift through data, it enhances the quality of insights delivered to clients. It also offers predictive insights, spots trends, and provides strategic guidance that pushes the boundaries of traditional accounting.  

Deloitte, too, is a pioneer in the AI accounting frontier. 

Working along similar lines, Chennai-based startup Fhero Accounting has leveraged AI to automate repetitive tasks, streamlining data entry by automatically capturing and categorising financial transactions from sources like invoices, receipts, and bank statements. 

This reduces the need for manual input, minimises errors, and ensures accurate, up-to-date records. 

Further, AI-driven analytics offer deeper insights into financial trends, enabling proactive decision-making for the clients. This enhances efficiency, accuracy, and client satisfaction.

How Does Automating Accounts Work?

The audit and accounting process has multiple stages, starting with data entry. Fhero uses advanced optical character recognition (OCR) technology: when bills and documents are uploaded, the software automatically reads and processes them, seamlessly integrating the data into the client’s accounting system.

Next, a proprietary tool takes over, generating MIS reports and flagging discrepancies, like missed tax deductions or incorrect rates. It ensures all statutory compliances are met, even for backdated entries, automatically highlighting issues as they arise. It’s like having someone work on these tasks while you sleep.  
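Fhero's tool is proprietary, so the discrepancy check can only be sketched generically. In outline, it amounts to comparing each categorised entry against an expected rule; the categories, tax rates, and field names below are assumptions for illustration, not Fhero's actual rules:

```python
# Generic sketch of discrepancy flagging over categorised entries —
# categories, rates, and field names are illustrative assumptions.
EXPECTED_TAX_RATE = {"electronics": 0.18, "apparel": 0.12}

def flag_discrepancies(entries):
    """Return (invoice, applied_rate, expected_rate) for each mismatch."""
    issues = []
    for e in entries:
        expected = EXPECTED_TAX_RATE.get(e["category"])
        if expected is not None and abs(e["tax_rate"] - expected) > 1e-9:
            issues.append((e["invoice"], e["tax_rate"], expected))
    return issues

entries = [
    {"invoice": "INV-001", "category": "electronics", "tax_rate": 0.18},
    {"invoice": "INV-002", "category": "apparel", "tax_rate": 0.05},
]
print(flag_discrepancies(entries))  # flags INV-002's incorrect rate
```

The speed-up the startup describes comes from running checks like this over every entry automatically, rather than having an accountant inspect each one.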

“This efficiency means what once took six to eight hours can now be done in just five to ten minutes, depending on the client’s size and transaction volume. For large clients, like those handling up to INR 500 crore annually, this process saves days of manual effort by completing tasks in mere minutes,” said Prashant Bothra, the CEO and founder of Fhero Accounting, in an exclusive interview with AIM.

For instance, while working for the e-commerce industry, the firm ensures that the amount received from platforms like Myntra, Nykaa, and Flipkart matches the companies’ books, verifying every detail in seconds. 

“Our MIS reports are so comprehensive that you can drill down from a general P&L statement to the invoice level, making it easy for business owners to manage everything without diving into the complexities of accounting software,” Bothra added. 

Further, Fhero Accounting is also contemplating bringing LLMs into its business, aiming to enhance the value they deliver to their clients. 

Globally, startups like Klarity and Tabular function similarly, offering clients finished ledger entries that users export with just one click. 

India’s ‘Outsourcing’ Mindset 

Indians aged over 50 years are generally less inclined to outsource accounting services, primarily due to a combination of traditional values, a preference for personal relationships in financial matters, and a lack of familiarity with digital platforms.

Reflecting on the same, Bothra highlighted, “One of the major challenges we’ve observed in India, especially in the early stages of our work, is a cultural shift that is beginning to change only now.” 

He noted that major industrialisation in India happened in the 1990s, during the era of liberalisation, privatisation, and globalisation. This period saw Indian companies going global and the rise of many entrepreneurs who are now in their 50s or 60s. 

But as the next generation steps into leadership roles, a shift in attitudes towards outsourcing is observed. 

The COVID-19 pandemic has accelerated this change, as businesses now seek more value from their accountants rather than just having them on-site to handle routine tasks. 

“Previously, accountants were often seen as mere agents of the government, managing tax payments on behalf of companies. Today, business leaders are looking for deeper insights into their numbers, particularly in areas like cost, to inform their decision-making,” the Fhero Accounting CEO said. 

What’s Next?

Going ahead, the startup plans to shift focus to building tech that customers can interact with directly so that its clients can have access to their numbers right at their fingertips, on their phones through an app. 

“As we input data, it will automatically sync with theirs. The app will also allow them to ask questions and handle everything they need from an accountant in one unified platform, rather than relying on separate apps for communication and report viewing.”

Working along similar lines, but exclusively for individuals, CRED has already introduced a feature called CRED Money. It leverages advanced data science to analyse spending patterns, provide personalised insights, and promote proactive personal financial management.

By making members more aware of their finances, CRED Money empowers them to take control and make informed financial decisions.

As India continues to up its AI game in finance, RBI governor Shaktikanta Das recently highlighted that the integration of AI into financial services brings significant opportunities for all stakeholders.

Speaking in Bengaluru at the Global Conference on Digital Public Infrastructure and Emerging Technologies, Das said, “Today, AI has forayed into the financial sector in the form of services like chatbots, internal data processing for intelligent alerts, fraud risk management, credit modelling, and other processes.” 

The Quiet Shift from Discord to Midjourney’s Web Platform https://analyticsindiamag.com/ai-breakthroughs/the-quiet-shift-from-discord-to-midjourneys-web-platform/ Mon, 26 Aug 2024 11:32:55 +0000

Users are ditching Discord for a more streamlined experience, sparking a surge in new sign-ups on Midjourney’s web.

The post The Quiet Shift from Discord to Midjourney’s Web Platform appeared first on AIM.


Midjourney’s recent free trial feature on its web browser has sparked a significant shift in user engagement from its Discord server to the website. Under its official announcement post on X, users said they appreciated a more streamlined web experience than the user interface offered on Discord, highlighting the evolving preferences of Midjourney’s user base. 

Boom in Midjourney’s User Base

Midjourney has witnessed remarkable growth since its launch, with nearly 1 million members on its subreddit and a substantial presence on Discord. Approximately 50% of its users engaged exclusively through Discord, which became the primary driver for new users on the platform. 

However, recent comments under the announcement suggest a leaning towards its website instead of the Discord channel. 

Reddit user ‘Anne-o-me’ commented on the new update, saying, “Finally, now I will actually use it.” Similarly, an AI artist highlighted on X that there were people interested in using Midjourney who didn’t want to use Discord, hinting at an increased user base for Midjourney’s web version.

The demographic profile of Midjourney’s users further emphasises its appeal. A significant portion, 37.76%, falls within the 25-34 age group, while 26.18% are aged 18-25. 

This data suggests that Midjourney resonates particularly well with younger adults, who are often at the forefront of adopting new technologies and digital art forms. The platform’s ability to attract this tech-savvy demographic is crucial for its long-term growth.

Artiexus, a PhD student, shed light on the user-friendly design of the website on the X announcement comment thread.

Feedback from users on platforms like X and Reddit has been overwhelmingly positive, with many praising the web app’s functionality and ease of use. This shift is not merely about convenience; it reflects a broader trend in how users engage with digital tools. 

What About Competing Platforms?

Midjourney’s evolution invites key comparisons with other image-generation platforms that have already established web-based interfaces. For instance, DALL-E 2 operates exclusively on the web, providing a seamless and approachable experience for casual users without the need to navigate a chat platform. 

Similarly, Stable Diffusion offers both web access and local installations, allowing for greater flexibility in usage. 

Craiyon AI stands out as a completely free web-based tool, fostering community sharing through social media without the constraints of a dedicated platform. 

FLUX.1 AI, created by the German startup Black Forest Labs, is rapidly establishing itself as a leader in the field, excelling in both image quality and creative versatility. The company has partnered with xAI to explore the capabilities of the FLUX.1 model, aiming to enhance Grok’s functionalities on the X platform.

As Midjourney transitions to a web platform, it not only enhances accessibility but also positions itself to compete more effectively in the evolving landscape of AI image generation.

Midjourney Reigns Big

Midjourney has recently introduced significant updates that enhance its capabilities and user experience. 

Midjourney 6.1 was released with a slew of improvements, including better image quality, coherence and text rendering. The software also includes new upscaling and personalisation models, which allow users to create even more impressive images. 

One of the standout features of the update is the enhanced image quality. Midjourney 6.1 now generates images with fewer pixel artefacts and more realistic textures, resulting in visually stunning and professional-looking outputs. 

To offer greater user flexibility, Midjourney also introduced new upscalers. These tools allow users to increase the resolution of their generated images, making them suitable for print and digital media. Furthermore, the AI’s text rendering capabilities have been refined, enabling users to create visually appealing graphics with integrated text elements.

https://twitter.com/LeoEspinozaCine/status/1818780361838432547

As more users gravitate toward the web, Midjourney is poised to capture a larger audience that values accessibility and efficiency. The free trial feature serves as a gateway for new users to explore the platform without the initial commitment of a paid subscription, facilitating a smoother onboarding process.

This AI Startup Wants to Fix Your Code Bugs. And Got YC Backing for It https://analyticsindiamag.com/ai-breakthroughs/this-ai-startup-wants-to-fix-your-code-bugs-and-got-yc-backing-for-it/ Mon, 26 Aug 2024 04:49:46 +0000 https://analyticsindiamag.com/?p=10133723

Tata 1mg has written hundreds of custom policies on CodeAnt platform.

The post This AI Startup Wants to Fix Your Code Bugs. And Got YC Backing for It appeared first on AIM.


Getting into YC takes work. Amartya Jha, the co-founder and CEO of CodeAnt AI, narrated in a post how he and his co-founder Chinmay Bharti got rejected the first time because they couldn’t explain their product well to the investors. Despite this, and the fear of getting blacklisted, Jha and Bharti made a 45-minute video explaining their product and sent it to YC again. 

The following week, they managed to get another interview with YC and were finally selected! But what is so good about what they are building? 

“Developers spend 20 to 30% of their time just reviewing someone else’s code. Most of the time, they simply say, ‘It looks good, just merge it,’ without delving deeper,” Jha explained while speaking with AIM. This leads to bugs and security vulnerabilities making their way into production. 

With generative AI, coding is undeniably getting easier. At the same time, the quality of code produced by these AI generators is still far from that of human coders. This makes code review crucial for every organisation, and this is where CodeAnt AI comes into the picture.

CodeAnt’s AI-driven code review system can drastically reduce these vulnerabilities by automating the review process and identifying potential issues before they reach the customer.

Founded in October 2023, CodeAnt AI is already making waves in the industry by automating code reviews with AI. The journey began at Entrepreneur First, where Jha met Bharti. The company quickly gained traction, securing a spot in YC by November 2023. 

By February 2024, they had launched their first product, attracting major clients like Tata 1mg, India’s largest online pharmacy, and Cipla, one of the country’s biggest pharmaceutical companies. “These companies were amazed that such a solution even existed,” Jha recalled. “And both contracts were paid, not just trials.”

What’s the Moat?

What sets CodeAnt AI apart from earlier entrants in the market, such as CodeRabbit and SonarSource, is its unique approach to code review. The company has developed its own dataset of 30,000+ checks, meticulously created to address every possible code commit scenario. 

“This is our pure IP,” Jha said. “We wrote our own algorithms to analyse code, understand its flow, and identify areas that need improvement. Our AI, combined with our custom engineering solutions, runs these checks on every code commit, offering unprecedented accuracy.” 

The platform also supports more than 30 programming languages.

While competition in the AI-driven code review space is growing, Jha remains confident in CodeAnt AI’s unique value proposition. Its biggest competitor is SonarSource, but interestingly, SonarSource’s lead investor is also one of the investors in CodeAnt. 

CodeRabbit relies solely on AI, which leads to a lot of hallucinations and false positives. “Our approach, which combines AI with deterministic policies, gives us a significant edge,” said Jha.

“The demand for tools like ours is only going to grow as AI-generated code becomes more prevalent,” said Jha, adding that tools like GitHub Copilot or Cursor are far from generating accurate code anytime soon.

Jha further elaborated on the limitations of competitors that use AI exclusively. When developers give a large codebase to AI, it tends to hallucinate. 

To mitigate this, CodeAnt built a foundation of hard-coded checks, which are further enhanced by AI. This reduces false positives and ensures that the code is not only correct but also secure and compliant with industry standards.
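CodeAnt has not published its implementation, but the layering Jha describes, deterministic policy checks anchoring an AI pass whose findings are filtered against them, can be sketched roughly as follows. The rule names and the `ai_review` stub below are hypothetical illustrations, not CodeAnt's actual checks:

```python
import re

# Hypothetical deterministic policies: each rule is a (name, regex) pair that
# flags a line unconditionally -- no model involved, hence no hallucination.
DETERMINISTIC_RULES = [
    ("hardcoded-secret", re.compile(r"(api_key|password)\s*=\s*['\"]\w+['\"]")),
    ("print-debug", re.compile(r"^\s*print\(")),
]

def run_deterministic_checks(diff_lines):
    findings = []
    for lineno, line in enumerate(diff_lines, start=1):
        for name, pattern in DETERMINISTIC_RULES:
            if pattern.search(line):
                findings.append({"rule": name, "line": lineno, "source": "policy"})
    return findings

def ai_review(diff_lines):
    """Stand-in for an LLM call; returns model-suggested findings,
    some of which may be false positives."""
    return [{"rule": "possible-sql-injection", "line": 2, "source": "ai"}]

def review(diff_lines):
    policy = run_deterministic_checks(diff_lines)
    flagged = {f["line"] for f in policy}
    # Keep AI findings only for lines the hard-coded policies did not already
    # flag, so the deterministic layer anchors the review.
    ai = [f for f in ai_review(diff_lines) if f["line"] not in flagged]
    return policy + ai

diff = [
    'password = "hunter2"',
    'cursor.execute("SELECT * FROM users WHERE id=" + uid)',
]
for finding in review(diff):
    print(finding["source"], finding["rule"], finding["line"])
```

The key design point is that the AI layer is additive: it can only surface issues the deterministic rules missed, never override or duplicate them, which is one plausible way to keep false positives down.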

Amartya Jha speaking at Bangalore Python Meetup

Stands Out with Customisation

One of CodeAnt AI’s standout features is its ability to allow enterprises to input their own data and create custom policies. “For example, Tata 1mg has written hundreds of custom policies on our platform. Before CodeAnt AI, they would have had to build a similar platform themselves to enforce these policies,” Jha said, adding that Tata 1mg has been writing its code in Python for the last eight years.

Now, they can simply use the CodeAnt platform to ensure that their code complies with their specific guidelines, which is especially valuable for large organisations with complex codebases.

This level of customisation is not only beneficial for code quality but also for compliance with security frameworks. “Imagine a tool that knows exactly how code should be written in your organisation and can flag issues related to SOC 2 compliance, GDPR, or data privacy. It makes the process of maintaining compliance much easier and more efficient,” Jha added.

In the future, Jha will focus on expanding CodeAnt AI’s capabilities, particularly in the areas of security and code quality. “We’re currently working with some of the largest security and backup companies in the world, helping them review their code for vulnerabilities. Our team may be small, but we’re making a big impact,” he said.

Despite its San Francisco roots, CodeAnt AI maintains a strong presence in India, with Jha overseeing operations from Bangalore. The company is poised for further growth, with plans to expand its product offerings and deepen its market presence. 

“We are going deeper into security and code quality. We will soon be making announcements about new products and partnerships, so stay tuned,” Jha hinted.

Why Jailbreaking is Required for AI Safety https://analyticsindiamag.com/industry-insights/why-jailbreaking-is-required-for-ai-safety/ Fri, 23 Aug 2024 08:30:00 +0000 https://analyticsindiamag.com/?p=10133600

Companies like OpenAI, Google, Mistral, and Anthropic have publicly stated that they red team their systems and use external contractors, including jailbreakers, to undertake pentesting.

The post Why Jailbreaking is Required for AI Safety appeared first on AIM.


While the concept of jailbreaking, particularly AI jailbreaking, is often associated with threat actors, ethical jailbreakers have proven that it is one of the best ways to test AI systems, arguably better than simply implementing safety policies.

Recently, AIM wrote about how AI jailbreaking could turn into a billion-dollar industry, thanks to jailbreakers’ reliability in testing the safety parameters of AI systems. It seems the idea is catching on, as a prominent jailbreaker recently announced funding from a16z co-founder Marc Andreessen.

https://twitter.com/elder_plinius/status/1825190641745445268

Interestingly, Andreessen has been quite vocal about the AI safety discussion. He previously said, “There is a whole profession of ‘AI safety expert’, ‘AI ethicist’, ‘AI risk researcher’. They are paid to be doomers, and their statements should be processed appropriately.”

So, what’s changed Andreessen’s stance on AI safety? 

Well, getting down to brass tacks, Andreessen’s point is more a criticism of the moral panic stemming from AI, where common talking points are the loss of jobs or damage to society.

However, the reality of the matter, as Andreessen states, is that AI, including generative AI, has been key in improving several processes, whether you look at it from a business or personal perspective. 

Furthermore, we’ve previously discussed whether imposed laws and regulations, like the California AI bill set to be voted on this week, actually work in terms of ensuring the safety of AI systems, especially as these are usually reactionary policies born of panic rather than technical assessment.

Jailbreaking for Community Benefit

In an exclusive interaction with AIM, the now Andreessen-funded jailbreaker, going by the name Pliny the Liberator, said that part of the reason why jailbreaking is so important is that a small group of companies should not be allowed to sanitise the information people are provided by AI. 

“I do it both for the fun/challenge and to spread awareness and liberate the models and the information they hold. I don’t like that a small group is arbitrarily deciding what type of information we’re ‘allowed’ to access/process,” he said.

This is reflected in the fact that rather than monetising his work, Pliny has fostered a community of like-minded AI enthusiasts, jailbreakers and industry leaders, with his BASI Discord server (having over 8,000 members) constantly active with jailbreaking prompts and challenges.

Due to this, Andreessen’s decision to offer funding to someone from the jailbreaking community received quite a lot of support. 

Major players in the game, including Andreessen himself, Musk, Peter Thiel, Yann LeCun and Sam Altman, have echoed similar sentiments of building models with an overall aim of improving society. This can only be done if these initiatives are funded independently, rather than by larger entities that can influence results.

Or, as one user on X called it: “No strings attached funding.” Janus, another prominent AI enthusiast and a mentor for the SERI MATS programme, said that their mentees struggled to get funding during the programme, which tanked their productivity; once it ended, they had the freedom to work on their own interests.

“If you are a rich person or fund who wants to see interesting things happen in the world, consider giving no-strings-attached donations to creatives who have demonstrated their competence and ability to create value even without monetary return, instead of encouraging them to make a startup, submit a grant application, etc.,” they said.   

Jailbreaking Succeeds Where Safety Policies Fail

Making overarching policies does less to actually address safety issues with LLMs. Specificities are where the magic lies. In previous conversations with industry leaders, AIM has found that many find it difficult to answer a simple question: how do you ensure the safety of your generative AI system when it can be jailbroken to reveal confidential information?

While answers vary from constant testing to trusting third-party LLM providers, the conclusion is the same. There is no way to guarantee that this won’t happen.

Coming back to Andreessen, many seem to have come to the same conclusion, as while he’s against overarching reactionary safety policies, there are ways to actually ensure the safety of these systems – like jailbreaking.

This is proven by the fact that companies like OpenAI, Google, Mistral and Anthropic have publicly stated that they red team their systems and make use of external contractors to undertake pentesting.

However, jailbreaking also serves a dual purpose. While companies take advantage of it to test their own systems, there’s a larger conversation going on about how much control these companies should have over their systems. This is fuelled by the recent release of Grok 2, which offers image generation capabilities with little to no guardrails.

As Andreessen had said in his essay, ‘Why AI Will Save the World’, “If you don’t agree with the prevailing niche morality that is being imposed on both social media and AI via ever-intensifying speech codes, you should also realise that the fight over what AI is allowed to say/generate will be even more important – by a lot – than the fight over social media censorship.

“AI is highly likely to be the control layer for everything in the world.”

Apple Intelligence to Get a Robotic Arm Soon  https://analyticsindiamag.com/ai-breakthroughs/apple-intelligence-to-get-a-robotic-arm-soon/ Thu, 22 Aug 2024 08:30:00 +0000 https://analyticsindiamag.com/?p=10133492

Foxconn is set to develop Apple's first robotics product, an iPad integrated with a robotic arm.

The post Apple Intelligence to Get a Robotic Arm Soon  appeared first on AIM.


After pulling the plug earlier this year on its decade-long ‘Project Titan’ – Apple’s ambitious self-driving car project – the Cupertino giant is now moving ambitiously forward with its robotics efforts, starting with a ‘tabletop robot’. 

With this new home robot, Apple is finally looking to build on its robotics dream.

From Cars to Home

Apple’s robotics venture was highly aspirational from the beginning. The company had plans to build cars that could achieve Level 5 autonomy, and even have a controller or app for low-speed driving with an Apple command center for remote assistance. 

However, with prominent players such as Waymo, Baidu, and now Tesla spearheading the evolving autonomous vehicle market, abandoning the project seemed like Apple’s best option for now. 

Additionally, with the company’s increasing focus on generative AI ventures, robotics seemed to have taken a backseat. Though the car team was disbanded, its members are reportedly leading the tabletop robot project. 

“They’re all in on robotics right now. Robotics is the next big thing at Apple. They’re talking about humanoids. They’re talking about mobile robots to go around your home, and now they’re talking about this home device,” said Mark Gurman, a reputed journalist who covers Apple devices extensively.  

Apple’s Robotic Goals

Apple’s new product is essentially a slim robotic arm capable of moving a large screen (an iPad), tilting it up and down, and enabling full 360-degree rotation. The product is estimated to be priced around $1,000. 

Away from self-driving cars, the robotic arm seems like a more realistic goal. A fully voice-controlled system that can move around the table will most likely be powered by Apple Intelligence. 

The device would identify different voices in the house and use a Center Stage-like feature to turn and face users whenever they speak.

Foxconn to Power Apple Robots

Recent reports indicate that Hongzhun, a key casing manufacturer under the Hon Hai Group (electronics manufacturing giant Foxconn), has been added to Apple’s robot supply chain. It was chosen for its expertise in mass-producing robot-related components. 

The companies will continue to collaborate, with further developments from Hongzhun anticipated.

Interestingly, Foxconn’s partnership with Apple has been a long-standing one, beginning as early as the 2000s with the production of iMacs. This relationship deepened over time, with Foxconn becoming the primary assembler of many Apple products, including the iPhone, which is a significant revenue driver for both companies.

Foxconn has also set up factories in India for iPhone manufacturing. It is possible that, in the future, India may also serve as a hub for Apple’s robot manufacturing. 

Meanwhile, Foxconn CEO Young Liu recently visited India and met many prominent people, including Prime Minister Narendra Modi. His visit to Bengaluru has strengthened the company’s relationship with the country, further hinting that Apple’s future robotic device manufacturing could take place there. 

Far-Fetched Goals Will Continue

Apple’s increased focus on the robotics sector is part of the company’s efforts to explore new product categories for revenue generation. 

This becomes important especially after the recent Apple Vision Pro debacle, where the company witnessed a 75% drop in sales. One of the main reasons for the drop is attributed to the exorbitant price of close to $4,000 for a Vision Pro set. Furthermore, the hype and interest around the device have drastically declined. 

With the robot tabletop estimated to launch in 2026 or later, it is likely that Apple will invest extensively in robotics. When the car dream shut down, people speculated about Apple’s entry into the humanoid segment, and with the way the humanoid race is gaining steam, that wouldn’t be a surprise either. 

Is Sarvam 2B Good Enough for Indian AI Developers? https://analyticsindiamag.com/ai-breakthroughs/is-sarvam-2b-good-enough-for-indian-ai-developers/ Wed, 21 Aug 2024 11:42:56 +0000 https://analyticsindiamag.com/?p=10133383

One good thing is that this will mark the end of ‘Indic Llama Garbaggio’.

The post Is Sarvam 2B Good Enough for Indian AI Developers? appeared first on AIM.


Sarvam AI, or what we call the OpenAI of India, is probably one of the startups most celebrated by Indian developers. Its focus on Indian languages and open source philosophy has been a cornerstone for AI development in India. 

At its recent launch event, the company announced Sarvam 2B, its open source foundational model trained on 4 trillion tokens of data, 50% of which is in 10 Indic languages. 

According to co-founder Vivek Raghavan, Sarvam 2B sits alongside other small language models in the market, such as Microsoft’s Phi-3, Meta’s Llama 3, and Google’s Gemma models, which were all considered decent enough for Indic AI development. 

But is Sarvam 2B really good enough for Indic AI developers?

First, to clear the air, the model uses a slightly modified Llama architecture, but is trained from scratch. This means there is no continued pre-training or distillation, making it the first foundational model built entirely for Indic languages. 

We can also call it an ‘Indic Llama trained from scratch’. This is several steps ahead of the company’s earlier approach of building OpenHathi, which was a model fine-tuned on top of Llama 2.

When compared to other models, Sarvam 2B outperforms GPT-4o and gives tough competition to Llama 3.1-8B and Gemma-2-9B in speed and accuracy for Indic languages.

The End of Indic Llama?

Models from Google, Meta, or OpenAI, have been increasingly trying to do better at Indic language translation and inference, but have been failing big time. Indian developers have been fine-tuning these models on billions of tokens of Indian language data, but the quality hasn’t necessarily improved. 

Plus, these models have rarely been adopted in production. “The best thing about Sarvam’s 2B model is that it puts an end to the Indic Llama garbage,” said Raj Dabre, a prominent researcher at NICT in Kyoto and a visiting professor at IIT Bombay. 

Pratik Desai, the founder of KissanAI, agrees with Dabre and said that none of those models were useful for production. He has the same doubts about Sarvam 2B, even though it is a great stepping stone towards better Indic models.

“I doubt it will be used in production. Actually none of the 2B models are good enough to serve paid customers,” said Desai, adding that even models such as Gemma 2B are not useful – not even for English use cases.

Desai further explained that LLMs with 2 billion parameters struggle with consistency and instruction-following, even for summarisation, and are only good enough as helper models. But since Sarvam is also planning to release more versions in the future, these issues might get resolved eventually. 

Regardless, the small size of the models makes them useful for edge use cases and on-device LLMs. This, along with Shuka, an audio model released by Sarvam, would be helpful for the developer community to experiment with. But taking these models into production is still a hard decision, as the output can become unreliable for several use cases, as is the case with English models of the same size.

Along with this, the company also released the Samvaad-Hi-v1 dataset, a meticulously curated collection of 100,000 high-quality English, Hindi, and Hinglish conversations, which is an invaluable resource for researchers and developers. 

A Good Start

One of the key aspects of Sarvam 2B’s brilliant performance is that it is trained entirely on synthetic data, similar to Microsoft’s Phi series of models. Even though the debate around using synthetic data is still on, Raghavan said that it is essential for Indian languages, as the data is not available on the open web. 

Furthermore, using synthetic data generated by the Sarvam 2B model itself could either help improve the model’s capabilities in the future or make it run in circles. 

Raghavan also said that the cost of running Sarvam 2B would be a tenth of that of other models in the industry. With APIs and Sarvam Agents in the pipeline, it is clear that Sarvam’s sovereign AI approach is helping it build great models for Indian developers. 

How a Stanford Dropout’s Startup, Now Backed by OpenAI, is Shaping the Future of Education https://analyticsindiamag.com/ai-breakthroughs/how-a-stanford-dropouts-startup-now-backed-by-openai-is-shaping-the-future-of-education/ Tue, 20 Aug 2024 12:30:00 +0000 https://analyticsindiamag.com/?p=10133273

Heeyo is leveraging its AI platform, designed for 3 to 11-year-olds, to tackle one of the biggest challenges in social-emotional learning for children.

The post How a Stanford Dropout’s Startup, Now Backed by OpenAI, is Shaping the Future of Education appeared first on AIM.


Xiaoyin Qu is not your typical AI startup founder. The super-energetic entrepreneur recently unveiled Heeyo AI, a new learning app for kids. 

Dropping out of Stanford Graduate School of Business to start her entrepreneurial journey, eventually landing in the learning and teaching space, stands as testament to her unwavering commitment to revolutionising education with AI. 

During her late-night discussion with AIM, Qu enthusiastically explained the app, revealing that its vision stemmed from her own childhood desire for a coach who could guide her inventive and curious mind.

“When you are 3 years old, you are just learning to talk. When you are 6 to 7, you can say multiple sentences. When you are 8 or 9, is when you can handle a lot of very open-ended creation. So, that’s why AI actually needs to change the way they talk to you based on your cognitive abilities, and we have both systems that can cater to that,” said Qu, founder and CEO of Heeyo AI, in an exclusive interaction with AIM. 

Defining itself as a ‘smart AI friend that helps kids learn’, California-based Heeyo AI raised a $3.5M seed round from some of the biggest tech players in the market, including the OpenAI Startup Fund, the Amazon Alexa Fund, Par VC, Charge Ventures, StoryHouse Ventures, and other top VCs. The startup is also backed by teams from top universities, including Stanford and Harvard. 

Heeyo utilises various AI models, including text-to-speech, speech recognition, text-to-image, and text-to-music. It uses models from providers like OpenAI and Stable Diffusion for tasks such as content creation, translation, and image generation, with each project involving multiple models in a step-by-step process.

The learning platform provides over 2,000 learning games, in which the AI figure can speak over 20 languages through fun avatars such as pandas and dragons.    

Social Emotional Learning 

Heeyo is leveraging its AI platform, designed for 3 to 11-year-olds, to address one of the biggest challenges in social-emotional learning: helping kids improve their social skills. 

Qu highlighted the importance of teaching children how to make friends, express themselves, handle rejection, and respond appropriately when meeting someone new, and believes Heeyo will address all that. 

“That’s a relatively new thing that all Silicon Valley parents want to be interested in,” said Qu. 

Furthermore, Heeyo allows anyone including parents, educators, and kids to design their own learning games, and make them based on different cultures. “When you want to do a trivia on some Hindu traditions or Chinese traditions, you can actually build it up. In the US, we see some people are doing, like, Bible stuff. It’s like Bible parenting,” said Qu.

Before this, Qu founded Run The World, a leading virtual events platform, which she sold last year. She has also led product management at Facebook and Instagram and headed marketing and business development for Atlassian’s Asian market. 

AI for the Physical World

Qu is also ecstatic about the funding Heeyo has received and considers herself very lucky to have OpenAI as an investor. With backing not just from OpenAI but from the Amazon Alexa Fund too, the potential for future products is plentiful. 

With many children already using Alexa, Qu believes that Heeyo’s content and interactive experiences could integrate well with the assistant to help even more kids.

“In the long run, we plan to partner with toy companies, like family animals, and some of that. So, that’s going to happen. But, right now, we’re starting more from a content standpoint to make sure we have the right content,” said Qu. 

Interestingly, the child-friendly AI robot Moxie was also built with the intention of helping children develop social skills. Talking about it, Qu mentioned that Moxie is built for kids with special needs, such as autism, and has seen many families adopt it. However, its price makes it inaccessible to many. 

“We could imagine that Moxie costs $899 and that’s pretty much not affordable for most families,” said Qu, adding that Heeyo is accessible to all kids. The application is free and can be downloaded via the Apple App Store or Google Play. 

With AI entering the education domain, a number of universities are already tying up with companies offering AI-related educational services. Recently, Andrej Karpathy, a founding member of OpenAI, launched Eureka Labs, an AI education company that aims to transform traditional teaching methods with generative AI. 

Furthermore, with AI automating a number of administrative and repetitive tasks, teachers are freed up to dedicate time to personalised interactions with students, thereby enhancing the quality of education. 

With advanced interactive modes of teaching emerging through AI, learning and even teaching is continuously transforming for the better. 


Google’s Imagen 3 Can Only Dream of Achieving What Grok 2 Just Did https://analyticsindiamag.com/ai-breakthroughs/googles-imagine-3-can-only-dream-of-achieving-what-grok-2-just-did/ Tue, 20 Aug 2024 09:14:03 +0000 https://analyticsindiamag.com/?p=10133214

The contrast between xAI’s latest Grok feature and Google’s Imagen 3 is exactly as chaotic as you might expect. 

The post Google’s Imagen 3 Can Only Dream of Achieving What Grok 2 Just Did appeared first on AIM.


Imagen 3 and Grok 2 (with FLUX.1 integration) are both advanced AI text-to-image generators, yet they differ significantly in their approaches and applications, with the latter having little to no guardrails or safeguards. 

The outcome: this feature from Grok 2 allows users to create uncensored deepfakes, like Vice President Kamala Harris and Donald Trump posing as a couple, sparking deep concerns. 

Google, on the other hand, can only dream of doing such things with Imagen 3. Gmail creator Paul Buchheit, in a recent interview, said AI has the potential to provoke regulators, which is a significant concern for Google, given the tech giant has already been slapped with billions of dollars in lawsuits. 

“They had a version of DALL-E called Imagen, and it was prohibited from making the human form,” said Buchheit, discussing how Google has been struggling to dominate the AI landscape despite having all the necessary resources and an early start in AI.  

Cautious or Controversial?

In the case of Imagen 3, there are some restrictions in place as the tool refuses to generate images of public figures or weapons. While it won’t generate named copyrighted characters, you can bypass this by describing the character you want to create.

Earlier this year, Alphabet lost $90 billion in market value amid controversy over its generative AI product. Users reported it was producing racially inaccurate images and claimed that the chatbot refused to identify negative historical figures.

With Imagen 3, the tech giant does not want a repeat of the fiasco. However, Reddit users have criticised Imagen 3 for being too restrictive in what images it is allowed to generate.

The model is trained on a large dataset comprising images, text and associated annotations. To ensure quality and safety standards, the company employs a multi-stage filtering process. This process begins by removing unsafe, violent, or low-quality images. Then it eliminates AI-generated images to prevent the model from learning artefacts or biases commonly found in such images. 

The team employs deduplication pipelines and strategically down-weights similar images to minimise the risk of outputs overfitting specific elements of the training data.

Each image in the dataset is paired with original and synthetic captions generated by multiple Gemini models using diverse prompts. Filters are applied to ensure safety and remove personally identifiable information.
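The multi-stage curation described above can be sketched as a simple filter chain. Everything below is a hypothetical illustration: the stage order follows the article, but the record fields, thresholds, and helper names are invented stand-ins for Google's unpublished pipeline.

```python
# Hypothetical sketch of a multi-stage image-dataset curation pipeline,
# mirroring the stages described above: safety filtering, removal of
# AI-generated images, deduplication, and synthetic-caption attachment.
from dataclasses import dataclass, field

@dataclass
class ImageRecord:
    image_id: str
    phash: str                       # perceptual hash used for deduplication
    is_safe: bool = True
    is_ai_generated: bool = False
    captions: list = field(default_factory=list)

def curate(records, caption_fn):
    seen_hashes = set()
    kept = []
    for rec in records:
        if not rec.is_safe:           # stage 1: drop unsafe/low-quality images
            continue
        if rec.is_ai_generated:       # stage 2: drop AI-generated images
            continue
        if rec.phash in seen_hashes:  # stage 3: deduplicate near-identical images
            continue
        seen_hashes.add(rec.phash)
        rec.captions.append(caption_fn(rec))  # stage 4: attach a synthetic caption
        kept.append(rec)
    return kept

records = [
    ImageRecord("a", phash="h1"),
    ImageRecord("b", phash="h1"),                     # duplicate of "a"
    ImageRecord("c", phash="h2", is_ai_generated=True),
    ImageRecord("d", phash="h3", is_safe=False),
    ImageRecord("e", phash="h4"),
]
kept = curate(records, caption_fn=lambda r: f"synthetic caption for {r.image_id}")
print([r.image_id for r in kept])  # → ['a', 'e']
```

In the real pipeline the down-weighting of similar images would adjust sampling probabilities rather than drop records outright; a hard deduplication is used here only to keep the sketch short.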

Grok, on the other hand, developed by Musk’s xAI, offers a more open-ended approach.

Grok, like most LLMs today, was pre-trained on a diverse range of text data from publicly available sources on the internet up to Q3 2023, as well as datasets curated by ‘AI tutors’, who are human reviewers. Importantly, Grok was not pre-trained on data from X, including public posts. Its training journey is outlined on the xAI website and model card.

When responding to user queries, Grok has a unique feature that enables it to decide whether to search X public posts or conduct a real-time web search on the Internet. This capability allows Grok to provide up-to-date information and insights on a wide range of topics by accessing real-time public X posts.

This flexibility makes the model more versatile, but also more controversial, as it can create a wider range of content, including potentially misleading or inappropriate images.

xAI’s Grok chatbot now lets you create images from text prompts and publish them to X, and so far, the rollout seems as chaotic as everything else.

It has been used to generate all sorts of wild content, including images depicting drugs, violence, and public figures doing questionable things.

Other experiments conducted by users on X show that even when Grok refuses to generate something, loopholes are easy to find. That leaves very few safeguards against it spitting out gory images given the right prompts, according to X user Christian Montessori. 

And while Musk is aware of these issues, he seems to find them amusing, saying the tool is allowing people “to have some fun.”

Since xAI is a startup, it is able to release an unfiltered model, as it isn’t as exposed to consequences as a company like Google, which is publicly listed and much larger. 

To be sure, Grok isn’t the only source of misleading AI images. Open-source tools like Stable Diffusion can be modified to create a wide range of content with few restrictions. However, it’s uncommon for a major tech company to take this approach with an online chatbot. Google even halted Gemini’s image generation feature altogether after an embarrassing attempt to overcorrect for racial and gender stereotypes.

What next?

The text accompanying some AI-generated images suggests that Grok is integrated with FLUX.1, a new diffusion model developed by Black Forest Labs, an AI startup founded by ex-Stability AI engineers.

xAI announced plans to make the latest versions of Grok available to developers through its API in the coming weeks. The company also aims to launch Grok-2 and Grok-2 mini on X to enhance search capabilities, post analytics, and reply functions.

With Imagen 3, safety protocols aside, Google claims greater versatility, better understanding of prompts, higher-quality images, and better text rendering, with text being a pesky ongoing problem for all AI image models.

The post Google’s Imagen 3 Can Only Dream of Achieving What Grok 2 Just Did appeared first on AIM.

]]>
This YC-Backed Bengaluru AI Startup is Powering AWS, Microsoft, Databricks, and Moody’s with 5 Mn Monthly Evaluations https://analyticsindiamag.com/ai-breakthroughs/this-yc-backed-bengaluru-ai-startup-is-powering-aws-microsoft-databricks-and-moodys-with-5-mn-monthly-evaluations/ Tue, 20 Aug 2024 04:40:40 +0000 https://analyticsindiamag.com/?p=10133091

By mid-2023, Ragas gained significant traction, even catching the attention of OpenAI, which featured their product during a DevDay event.

The post This YC-Backed Bengaluru AI Startup is Powering AWS, Microsoft, Databricks, and Moody’s with 5 Mn Monthly Evaluations appeared first on AIM.

]]>

Enterprises love to RAG, but not everyone is great at it. The much-touted solution for hallucinations and for bringing new information into LLM systems is often difficult to maintain, and harder still to evaluate for whether it is even producing the right answers. This is where Ragas comes into the picture.

With over 6,000 stars on GitHub and an active Discord community of over 1,300 members, Ragas was co-founded by Shahul ES and his college friend Jithin James. The YC-backed, Bengaluru-based AI startup is building an open-source standard for the evaluation of RAG-based applications. 

Several engineering teams at companies such as Microsoft, IBM, Baidu, Cisco, AWS, Databricks, and Adobe rely on Ragas’ offerings to keep their pipelines pristine. Ragas already processes 20 million evaluations monthly for companies and is growing at 70% month over month.

The team has partnered with companies such as LlamaIndex and LangChain to provide its solutions and has been heavily appreciated by the community. But what makes them special?

Not Everyone Can RAG

The idea started when they were building LLM applications and noticed a glaring gap in the market: there was no effective way to evaluate the performance of these systems. “We realised there were no standardised evaluation methods for these systems. So, we put out a small open-source project, and the response was overwhelming,” Shahul explained while speaking with AIM.

By mid-2023, Ragas had gained significant traction, even catching the attention of OpenAI, which featured their product during a DevDay event. The startup continued to iterate on their product, receiving positive feedback from major players in the tech industry. “We started getting more attention, and we applied to the Y Combinator (YC) Fall 2023 batch and got selected,” Shahul said, reflecting on their rapid growth.

Ragas’s core offering is its open-source engine for the automated evaluation of RAG systems, but the startup is also exploring additional features that cater specifically to enterprise needs. “We are focusing on bringing more automation into the evaluation process,” Shahul said. The goal is to save developers’ time by automating the boring parts of the job, which is why enterprises use Ragas.
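To make the idea of automated RAG evaluation concrete, here is a deliberately simplified sketch of one kind of metric such an evaluator scores. This is not Ragas's actual implementation (Ragas relies on LLM-assisted judgments); the token-overlap "faithfulness" below is a toy stand-in that only illustrates the principle of checking an answer against its retrieved context.

```python
# Toy "faithfulness"-style metric: the fraction of answer tokens that
# appear in the retrieved context. A grounded answer scores high; an
# answer invented out of thin air scores low. Real evaluators (including
# Ragas) use far more robust, LLM-based judgments.
def toy_faithfulness(answer: str, contexts: list[str]) -> float:
    answer_tokens = set(answer.lower().split())
    context_tokens = set(" ".join(contexts).lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

contexts = ["ragas is an open source engine for evaluating rag pipelines"]
grounded = toy_faithfulness("ragas is open source", contexts)
ungrounded = toy_faithfulness("ragas was founded on mars", contexts)
print(round(grounded, 2), round(ungrounded, 2))  # → 1.0 0.2
```

Running such a metric over every question-answer pair in a test set is what turns "is my RAG pipeline getting the right answers?" from a gut feeling into a number that can be tracked across releases.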

As Ragas continues to evolve, Shahul emphasised the importance of their open-source strategy. “We want to build something that becomes the standard for evaluation in LLM applications. Our vision is that when someone thinks about evaluation, they think of Ragas.”

While speaking with AIM in 2022, Kochi-based Shahul, who happens to be a Kaggle Grandmaster, revealed that he used to miss classes and spend his time Kaggling. 

The Love for Developers

“YC was a game-changer for us,” Shahul noted. “Being in San Francisco allowed us to learn from some of the best in the industry. We understood what it takes to build a successful product and the frameworks needed to scale.”

Despite their global ambitions, Ragas remains deeply rooted in India. “Most of our hires are in Bengaluru,” Shahul said. “We have a strong network here and are committed to providing opportunities to high-quality engineers who may not have access to state-of-the-art projects.”

“We have been working on AI and ML since college,” Shahul said. “After graduating, we worked in remote startups for three years, contributing to open source projects. In 2023, we decided to experiment and see if we could build something of our own. That’s when we quit our jobs and started Ragas.”

Looking ahead, Ragas is planning to expand its product offerings to cover a broader range of LLM applications. “We’re very excited about our upcoming release, v0.2. It’s about expanding beyond RAG systems to include more complex applications, like agent tool-based calls,” Shahul shared.

Shahul and his team are focused on building a solution that not only meets the current needs of developers and enterprises, but also anticipates the challenges of the future. “We are building something that developers love, and that’s our core philosophy,” Shahul concluded.


]]>
Former Nutanix Founder’s AI Unicorn is Changing the World of CRM and Product Development https://analyticsindiamag.com/ai-breakthroughs/former-nutanix-founders-ai-unicorn-is-changing-the-world-of-crm-and-product-development/ Mon, 19 Aug 2024 10:56:33 +0000 https://analyticsindiamag.com/?p=10132923

Backed by Khosla Ventures, DevRev recently achieved unicorn status with a $100.8 million Series A funding round, bringing its valuation to $1.15 billion.

The post Former Nutanix Founder’s AI Unicorn is Changing the World of CRM and Product Development appeared first on AIM.

]]>

DevRev, founded by former Nutanix co-founder Dheeraj Pandey and former SVP of engineering at the company, Manoj Agarwal, is an AI-native platform unifying customer support and product development. It recently achieved unicorn status with a $100.8 million series A funding round, bringing its valuation to $1.15 billion.

Backed by major investors, such as Khosla Ventures, Mayfield Fund, and Param Hansa Values, the company is on the road to proving the ‘AI bubble’ conversation wrong. “Right now, there’s a lot of talk in the industry about AI and machine learning, but what we’re doing at DevRev isn’t something that can be easily replicated,” said Agarwal in an exclusive interview with AIM.

Agarwal emphasised the unique challenge of integrating AI into existing workflows, a problem that DevRev is tackling head-on. Databricks recently announced that LakeFlow Connect would be available in public preview for SQL Server, Salesforce, and Workday; DevRev is on a similar journey, but with AI at its core, which, Agarwal argues, makes it hard to replace.

DevRev’s AgentOS platform is built around a powerful knowledge graph, which organises data from various sources—such as customer support, product development, and internal communications—into a single, unified system with automatic RAG pipelines. 

This allows users to visualise and interact with the data from multiple perspectives, whether they are looking at it from the product side, the customer side, or the people side.

The Knowledge Graph Approach

Machines don’t understand the boundaries between departments. The more data you provide, the better they perform. “Could you really bring the data into one system, and could you arrange this data in a way that people can visually do well?” asked Agarwal. 

The Knowledge Graph does precisely that – offering a comprehensive view of an organisation’s data, which can then be leveraged for search, analytics, and workflow automation.
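The shape of such a graph can be illustrated with a minimal sketch: nodes for customers, tickets, and product features, with typed edges linking them, so that a single traversal crosses what would otherwise be departmental silos. The node and edge types here are hypothetical, not DevRev's actual schema.

```python
# Hypothetical unified knowledge graph as (source, relation, target) triples.
# One store, queried from the product side or the customer side alike.
edges = [
    ("ticket:T1", "reported_by", "customer:Acme"),
    ("ticket:T1", "affects", "feature:Search"),
    ("feature:Search", "owned_by", "person:Priya"),
    ("ticket:T2", "affects", "feature:Search"),
]

def neighbours(node, relation):
    """All nodes reachable from `node` via edges labelled `relation`."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

# Cross-department query: which engineer owns the feature behind ticket T1?
feature = neighbours("ticket:T1", "affects")[0]
owner = neighbours(feature, "owned_by")[0]
print(owner)  # → person:Priya

# Product-side view: all tickets touching the Search feature.
tickets = [s for s, r, d in edges if r == "affects" and d == "feature:Search"]
print(tickets)  # → ['ticket:T1', 'ticket:T2']
```

A production system would back this with a graph database and keep the triples synced from support, CRM, and engineering tools in real time; the point of the sketch is only that one shared structure can answer questions from multiple perspectives.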

Agarwal describes the DevRev platform as being built on three foundational pillars: advanced search capabilities, seamless workflow automation, and robust analytics and reporting tools. “Search, not just keyword-based search, but also semantic search,” he noted.

On top of these foundational elements, DevRev has developed a suite of applications tailored to specific use cases, such as customer support, software development, and product management. These apps are designed to work seamlessly with the platform’s AI agents, which can be programmed to perform end-to-end tasks, further enhancing productivity.

“The AI knowledge graph is the hardest thing to get right,” admitted Agarwal, pointing to the challenges of syncing data from multiple systems and keeping it updated in real-time. However, DevRev has managed to overcome these hurdles, enabling organisations to bring their data into a single platform where it can be organised, analysed, and acted upon.

The Open Approach

The company’s focus on AI is not new. In fact, DevRev’s journey began in 2020, long before the current wave of AI hype. “In 2020, when we wrote our first paper about DevRev, it had GPT all over it,” Agarwal recalls, referring to the early adoption of AI within the company. 

Even today, DevRev primarily uses OpenAI’s enterprise version but also works closely with other AI providers like AWS and Anthropic. In 2021, the platform organised a hackathon where OpenAI provided exclusive access to GPT-3 for all the participants. 

This forward-thinking approach allowed DevRev to build a tech stack that was ready to leverage the latest advancements in AI, including the use of vector databases, which were not widely available at the time.

One of the biggest challenges that DevRev addresses is the outdated nature of many systems of record in use today. Whether it’s in customer support, CRM, or product management, these legacy systems are often ill-equipped to handle the demands of modern businesses, particularly when it comes to integrating AI and machine learning.

Not a Bubble

DevRev’s architecture is designed with flexibility in mind, allowing enterprises to bring their own AI models or use the company’s built-in solutions. “One of the core philosophies we made from the very beginning is that everything we do inside DevRev will have API webhooks that we expose to the outside world,” Agarwal explained. 

As DevRev reaches its unicorn status, Agarwal acknowledges the growing concerns about an “AI bubble” similar to the dot-com bubble of the late 1990s. “There’s so many companies that just have a website and a company,” he said, drawing parallels between the two eras. 

However, he believes that while there may be some hype, the underlying technology is real and here to stay. “I don’t think that anybody is saying that after the internet, this thing is not real. This thing is real,” Agarwal asserted. 

The key, he argues, is to distinguish between companies that are merely riding the AI wave and those that are genuinely innovating and solving real problems. DevRev, with its deep investment in AI and its unique approach to integrating it into enterprise workflows, clearly falls into the latter category.


]]>
This Microsoft-Backed Startup Comes up With Alternative to GPUs for Inference https://analyticsindiamag.com/ai-breakthroughs/this-microsoft-backed-startup-comes-up-with-alternative-to-gpus-for-inference/ Mon, 19 Aug 2024 09:58:15 +0000 https://analyticsindiamag.com/?p=10132911

The startup will launch its flagship product Corsair in November this year

The post This Microsoft-Backed Startup Comes up With Alternative to GPUs for Inference appeared first on AIM.

]]>

Graphics processing units (GPUs) have become highly sought-after in the AI field, especially for training models. However, when it comes to inference, they might not always be the most efficient or cost-effective choice.

D-Matrix, a startup incorporated in 2019 and headquartered in Santa Clara, California, is developing silicons better suited for generative AI inference. 

Currently, only a handful of companies in the world are training AI models. But when it comes to deploying these models, the numbers could run into the millions.

In an interaction with AIM, Sid Sheth, the founder & CEO of d-Matrix, said that 90% of AI workloads today involve training a model, whereas around 10% involve inferencing. 

“But it is rapidly changing to a world, say, five to 10 years from now, when it will be 90% inference, 10% training. The transition is already underway. We are building the world’s most efficient inference computing platform for generative AI because our platform was built for transformer acceleration.

“Moreover, we are seeing a dramatic increase in GPU costs and power consumption. For example, an NVIDIA GPU that consumed 300 watts three years ago now consumes 1.2 kilowatts—a 4x increase. This trend is clearly unsustainable,” Sheth said.

Enterprises Like Small Language Models 

The startup is also aligning its business model with the fact that enterprises see real benefit in smaller language models, which they can fine-tune and train on their own enterprise data. 

Some of the best LLMs, like GPT-4 or Llama 3.1, are enormous, with billions or potentially trillions of parameters. These models are trained on all the knowledge of the world. 

However, enterprises need models that are specialised, efficient, and cost-effective to meet their specific requirements and constraints.

Over time, we have seen Meta, Microsoft and OpenAI launch small language models (SLMs) like Llama 3 8B, Phi-3 and GPT-4o mini.

“Smaller models are now emerging, ranging from 2 billion to 100 billion parameters, and they prove to be highly capable—comparable to some of the leading frontier models. This is promising news for the inference market, as these smaller models require less computational power and are therefore more cost-effective,” Sheth pointed out.

The startup believes enterprises don’t need to rely on expensive NVIDIA GPUs for inferencing. Its flagship product, Corsair, is specifically designed for inferencing generative AI models (100 billion parameters or fewer) and is much more cost-effective compared to GPUs.

“We believe that the majority of enterprises and individuals interested in inference will prefer to work with models of up to 100 billion parameters. Deploying models larger than this becomes prohibitively expensive, making it less practical for most applications,” he said.

(Jayhawk by d-Matrix)

Pioneering Digital In-Memory Compute

The startup was one of the pioneers in developing a digital in-memory compute (DIMC) engine, which they assert effectively addresses traditional AI limitations and is rapidly gaining popularity for inference applications.

“In older architectures, inference involves separate memory and computation units. Our approach integrates memory and computation into a single array, where the model is stored in memory and computations occur directly within it,” Sheth revealed.

Based on this approach, the startup has developed its first chip called Jayhawk II, which powers its flagship product Corsair. 

The startup claims that the Jayhawk II platform delivers up to 150 trillion operations per second (TOPS), a 10-fold reduction in total cost of ownership (TCO), and up to 20 times more inferences per second compared to high-end GPUs.

Rather than focusing on creating very large chips, the startup’s philosophy is to design smaller chiplets and connect them into a flexible fabric. 

“One chiplet of ours is approximately 400 square millimetres in size. We connect eight of these chiplets on a single card, resulting in a total of 3,200 square millimetres of silicon. In comparison, NVIDIA’s current maximum is 800 square millimetres. 

“This advantage of using chiplets allows us to integrate a higher density of smaller chips on a single card, thereby increasing overall functionality,” Sheth revealed.

This approach allows the startup to scale computational power up or down based on the size of the model. If the model is large, they can increase the computational resources, and vice versa. According to Sheth, this unique method is a key innovation from d-Matrix.
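The chiplet arithmetic above can be made concrete with a quick back-of-the-envelope sketch. The per-chiplet area and chiplets-per-card figures come from the article; the 100-billion-parameters-per-card capacity is an assumed round number (drawn from the model-size sweet spot Sheth describes), and the linear-scaling assumption is a simplification for illustration.

```python
import math

# Figures quoted in the article.
CHIPLET_AREA_MM2 = 400     # one d-Matrix chiplet is ~400 mm²
CHIPLETS_PER_CARD = 8      # eight chiplets are connected on a single card

def card_silicon_area(num_chiplets: int) -> int:
    """Total silicon area on one card, assuming identical chiplets."""
    return num_chiplets * CHIPLET_AREA_MM2

def cards_needed(model_params_b: float, params_per_card_b: float = 100) -> int:
    """Hypothetical: assume one card comfortably serves ~100B parameters."""
    return math.ceil(model_params_b / params_per_card_b)

print(card_silicon_area(CHIPLETS_PER_CARD))  # → 3200 (mm², vs ~800 for a monolithic die)
print(cards_needed(70))                      # → 1  (a 70B model fits on one card)
print(cards_needed(400))                     # → 4  (a larger model scales out to more cards)
```

The sketch shows the core claim: composing many small dies sidesteps the ~800 mm² monolithic-die ceiling, and capacity then scales by adding or removing cards rather than by redesigning the chip.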

Corsair is Coming in the Second Half of 2024

The startup plans to launch Corsair in November and enter production by 2025. So far, it is already in talks with hyperscalers, AI cloud service providers, sovereign AI cloud providers, and enterprises looking for on-prem solutions.

Sheth revealed that the company has customers in North America, Asia, and the Middle East and has signed a multi-million dollar contract with one of these customers. 

Due to non-disclosure agreements, Sheth refrained from revealing who this customer is, but there is a possibility that Microsoft, a model maker and an investor in d-Matrix, would benefit significantly from the startup’s silicon.

Expanding in India 

In 2022, the startup established an R&D centre in Bengaluru. Currently, the size of the startup team here is around 20-30, and the company plans to double its engineering team in the coming months. 

Previously, Sheth had highlighted his intention to increase the Indian workforce to 25-30% of the company’s global personnel.

(d-Matrix office in Bengaluru, India)

In India, the startup is actively leading a skilling initiative, engaging with universities, reaching out to ecosystem peers, and collaborating with entrepreneurs.

Through the initiative, the startup wants to not only create awareness about AI, but also boost India’s skilling initiatives. They collaborate with universities, providing them with curriculum, updates, guidelines, and more.


]]>
Why You Should Not Attend Cypher 2024 https://analyticsindiamag.com/ai-breakthroughs/why-you-should-not-attend-cypher-2024/ Sun, 18 Aug 2024 04:22:44 +0000 https://analyticsindiamag.com/?p=10132843

The last thing you want is to be bombarded with insights, network with industry leaders, and leave with your head buzzing full of new ideas.

The post Why You Should Not Attend Cypher 2024 appeared first on AIM.

]]>

Cypher 2024 is just around the corner, and if you’re considering not attending, you might just miss out on the biggest AI summit in India. After all, who would want to waste their time rubbing shoulders with the brightest minds and getting exclusive insights from top industry leaders?

When it comes to the speakers, the lineup is pretty fascinating. From a session by former Rajya Sabha member Subramanian Swamy to insightful AI discussions with Vivek Raghavan, co-founder of Sarvam AI, and Nikhil Malhotra, the head of Project Indus, everything seems to just fall into place.

With around 2,000+ AI and data science professionals attending each day, and decision-makers and C-level executives from Fortune 500 companies all gathering to share their wisdom, it’s almost as if you’d be overwhelmed with knowledge. And who doesn’t need that?

Cypher is notorious for its intense networking opportunities, from the rapid-fire FlashConnect sessions to the laid-back yet insightful VoxPops. To put this into perspective, Cypher 2023 had around 15% startups, 25% GCCs, 20% IT providers, 15% consulting firms, and 25% Indian companies. 

And Cypher 2024 is going to be at least twice as big as the last one.

But really, who needs more professional connections? Why expand your network with industry leaders and potential collaborators when you could just stay in your comfort zone? For those who do prefer the comfort zone, Cypher 2024 will also host an Experience Zone where you can relax and enjoy your own AI-created music with Beatoven.ai.

With three jam-packed days of content, featuring talks from renowned speakers like Amit Kapur from Lowe’s, Santosh Hegde from Couchbase, and Shekar Sivasubramanian from Wadhwani AI, it’s almost as if Cypher is daring you to learn too much. The sheer volume of actionable insights, innovative strategies, and cutting-edge AI discussions might just fry your brain. 

Cypher began in 2015 with a straightforward mission: to bridge the gap between the AI community and various industries, both established and emerging. The idea quickly gained traction.

Over the years, Cypher has evolved into the largest AI conference in India, expanding at an unprecedented rate. But it’s not just about size; it is also considered the best AI conference in the country.

Then come The Minsky Awards for Excellence in AI, highly esteemed accolades that exclusively honour enterprise-level achievements and celebrate the most innovative and impactful applications of artificial intelligence.

Cypher hosts people from all over the world. And within India, though most attendees come from Bengaluru, representatives of generative AI-using companies from different cities are present to share insights and collaborate.

This also includes companies from diverse industries. Last year, Cypher 2023 had around 30% representation from IT companies, 20% from the financial sector, 15% from healthcare and life sciences, 10% each for manufacturing, consulting, and e-commerce, and the rest 5% from education and government organisations.

Too Much AI Pressure?

Let’s face it: being inspired by success stories, groundbreaking innovations, and the future direction of AI might just get you motivated to push boundaries in your own work—and who needs that kind of pressure?

Especially this time, with India’s AI startup ecosystem on the rise, a dedicated track at Cypher where leading startups showcase their latest innovations might be too much to handle for people just getting into the field. Or maybe it is exactly the motivation needed to start your own venture.

Cypher 2024 might seem like the ultimate destination for anyone serious about AI and data science, but think carefully before you attend. The last thing you want is to be bombarded with insights, network with industry leaders, and leave with your head buzzing full of new ideas. After all, staying in your comfort zone has its own charm, doesn’t it?

If you are not yet convinced to skip Cypher 2024, please register here.


]]>
Ola Krutrim Brings an Early Diwali for Indian AI Developers https://analyticsindiamag.com/ai-breakthroughs/ola-krutrim-brings-an-early-diwali-for-indian-ai-developers/ Fri, 16 Aug 2024 08:29:34 +0000 https://analyticsindiamag.com/?p=10132754

Krutrim plans to produce its first AI silicon chip by 2026.

The post Ola Krutrim Brings an Early Diwali for Indian AI Developers appeared first on AIM.

]]>

“A lot can happen in six months,” said Ola chief Bhavish Aggarwal, as he made a string of announcements about the company’s AI venture Krutrim at Sankalp 2024. The launch of Krutrim six months ago has set off a chain of events, including the company’s exit from Microsoft Azure Cloud and Google Maps. Now, Krutrim is finally showing its love for Indian developers.

Apart from a bunch of announcements about electric bikes and the launch of BharatCell, the event was fully dedicated to Krutrim’s announcements and future roadmap. 

According to the company, since its launch six months ago, Krutrim Cloud has received around 250 billion API calls and has around 25,000 developers using the platform, with one trillion tokens generated so far. Aggarwal added that the company aims to build India’s first AI cloud in its own data centres, powered by its own Bodhi chips, slated for release by 2026, with Bodhi 2 following by 2028.

Moreover, Krutrim Cloud will be free for developers till Diwali this year.

Krutrim has announced a significant expansion of its cloud services, unveiling over 50 new offerings on its Krutrim Cloud platform. These services are designed to cater specifically to the needs of Indian developers and include AI Pods, AI Studio, a language hub with multimodal translation capabilities, and customer experience AI.

Full Stack AI Platform

Back in May, the AI unicorn launched its first LLM, Krutrim-7B-chat, trained on ten Indian languages, on the Databricks Marketplace for free. This was announced by Gautam Bhargava, VP and head of AI engineering at Krutrim.

This time, Bhargava announced that Krutrim has a better Indic tokeniser than GPT-4o and Gemini, though it falls behind Llama 3.1.

Moreover, on speech models, Krutrim’s model is on par with AI4Bharat’s IndicTrans2 and beats Gemini and Azure. When it comes to visual world understanding, Krutrim’s model competes closely with Idefics-2.
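Why tokeniser quality matters so much for Indic languages can be shown with a crude comparison: a naive byte-level tokeniser spends roughly three bytes (hence roughly three tokens) per Devanagari character in UTF-8, so Hindi text fragments into far more tokens per word than English, inflating cost and latency. This toy measure is not how Krutrim's or any production tokeniser works; real tokenisers learn subword vocabularies precisely to push this "fertility" down.

```python
# Crude illustration of tokeniser "fertility" (tokens per word) for a naive
# byte-level tokeniser. Devanagari code points are 3 bytes each in UTF-8,
# so Hindi inflates much more than ASCII English. This is a simplification;
# learned subword tokenisers (BPE, unigram) behave far better.
def byte_fertility(text: str) -> float:
    """Byte-level tokens emitted per whitespace-separated word."""
    return len(text.encode("utf-8")) / len(text.split())

english = "India builds AI models"
hindi = "भारत एआई मॉडल बनाता है"

print(round(byte_fertility(english), 1))  # → 5.5
print(round(byte_fertility(hindi), 1))    # roughly double the English figure
```

A tokeniser tuned for Indic scripts shrinks this gap, which directly translates into cheaper and faster inference for Indian-language workloads.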

Krutrim is also launching another AI developer app called Bhashik, a multimodal language hub that can translate videos into different languages. “This event is currently being live-streamed in multiple languages through the Bhashik platform,” said Aggarwal. He also said that edtech startup Unacademy is also leveraging Ola Krutrim’s translation API for its language learning app.

During a scripted demo, Ola Krutrim previewed an upcoming image recognition feature that will be added to the AI model later this year. Aggarwal also showcased how Dashtoon, an early stage startup, uses Krutrim AI Cloud for training and fine-tuning its models. 

When it comes to Ola Maps, Aggarwal mentioned that it is expanding its API for developers to encompass over 95 per cent of use cases, including routing, places, maps, tiles, and SDKs. Moreover, another startup, Together, is using Ola Maps for enhancing its customer experience, competing with MapmyIndia. “It is the best maps out there,” added Aggarwal.

For developers, Ola unveiled AI Pods for affordable GPU access, an AI Studio for creating complex AI applications, a model catalogue featuring LLMs and vision models, as well as capabilities for no-code/low-code training, fine-tuning, inference, and model evaluation.

Starting today, AI Studio on Krutrim Cloud will be available to developers, providing no-code and low-code computing platforms tailored for Indian use cases, along with offerings like GPU-as-a-service and model-as-a-service. It also includes Krutrim H100 tiny at 1/25th of the price on AI Pods, an abstraction service that uses GPU slicing for resource efficiency.

Moreover, Ola has also launched the Udaan startup programme. Startups will get Ola Maps credit worth up to Rs 50 lakhs free for three years on Krutrim Cloud and access to mentorship and Ola’s partner ecosystem.

The India Chip Revolution Begins 

When it comes to building an AI chip in India, Sambit Sahu took the stage to say that the company is on a roadmap to challenge state-of-the-art (SOTA) AI chip performance by 2028 with Krutrim Silicon. Sahu unveiled three distinct chip families: Bodhi for AI, Sarv for general computing, and Ojas for edge computing.

The Bodhi chip is specifically engineered for complex AI workloads. It focuses on enhancing the speed and efficiency of AI systems. By 2028, Krutrim aims to launch Bodhi 2, an advanced version expected to rank among the top-performing AI chips worldwide. It will be able to run ten trillion parameter models.

Krutrim also plans to produce its first AI silicon chip by 2026.

To achieve its chip development goals, Krutrim has formed strategic partnerships with leading technology firms Arm and Untether AI. “Basically, we would be the best in class in performance per rupee and performance per watt,” added Aggarwal. 

When comparing image classification capabilities, Sahu showcased a demo in which Krutrim’s Bodhi 1 classified one million images faster than the leading AI chip, possibly NVIDIA’s offering. Adding to that, Aggarwal said that Krutrim will set up a gigawatt-scale data centre by 2028. Today, the company’s capacity stands at 20 megawatts. 

“There is no need for developers to look outside Krutrim,” said Aggarwal. 

Ironically, it looks like Ola Krutrim is also pushing hard for the sovereign AI approach, much like Sarvam AI, while naming its products strikingly similarly to existing ones in the market: Sarv-1 echoes Sarvam, and Bhashik echoes Bhashini.

(FYI: AIM was present at Sankalp 2024 representing the developer community)


]]>
AI Agents at INR 1 Per Min Could Really Help Scale AI Adoption in India https://analyticsindiamag.com/ai-breakthroughs/ai-agents-at-inr-1-per-min-could-really-help-scale-ai-adoption-in-india/ Wed, 14 Aug 2024 12:33:02 +0000 https://analyticsindiamag.com/?p=10132677

These agents could be integrated into contact centres and various applications across multiple industries, including insurance, food and grocery delivery, e-commerce, ride-hailing services, and even banking and payment apps.

The post AI Agents at INR 1 Per Min Could Really Help Scale AI Adoption in India appeared first on AIM.

]]>

Are AI agents the next big thing? The co-founders of Sarvam AI definitely think so. One of the startup’s theses is that consumers of AI will use generative AI models not just as chatbots but to perform tasks and achieve goals, and that too through a voice interface rather than text.

At an event held in Bengaluru on August 13th, Sarvam AI announced Sarvam Agents. While the startup, which is backed by Lightspeed, Peak XV, and Khosla Ventures, is not the only company building AI agents, what stood out was the pricing.

The cost of these agents starts at just one rupee per minute. According to co-founder Vivek Raghavan, enterprises can integrate these agents into their workflow without much hassle.

“These are going to be voice-based, multilingual agents designed to solve specific business problems. They will be available in three channels – telephony, WhatsApp, or inside an app,” Raghavan told AIM in an interaction prior to the event.

These agents could be integrated into contact centres and various applications across multiple industries, including insurance, food and grocery delivery, e-commerce, ride-hailing services, and even banking and payment apps.

For example, they could streamline customer service operations in insurance by handling policy inquiries, make reservations, assist with financial transactions, facilitate order tracking and customer support in food delivery, and manage ride requests and driver communications in ride-hailing apps.

Enabling AI Agents 

A technology that offers this capability at just a rupee per minute could be transformative. AI adoption could see substantial growth with AI agents, and Sarvam AI’s mission is to make this a reality.
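To make the one-rupee-per-minute figure concrete, here is a back-of-envelope sketch of what it implies at contact-centre scale. Only the per-minute rate comes from the announcement; the call length and volume below are illustrative assumptions, not Sarvam AI’s published numbers.

```python
# Back-of-envelope: what INR 1/min implies at contact-centre scale.
# Only the per-minute rate is from the article; the rest are assumptions.

AGENT_RATE_INR_PER_MIN = 1.0   # Sarvam AI's stated starting price
AVG_CALL_MINUTES = 4           # assumed average call length
CALLS_PER_DAY = 5_000          # assumed mid-size contact-centre volume
DAYS_PER_MONTH = 30

monthly_minutes = AVG_CALL_MINUTES * CALLS_PER_DAY * DAYS_PER_MONTH
monthly_cost_inr = monthly_minutes * AGENT_RATE_INR_PER_MIN
print(monthly_minutes)   # 600000
print(monthly_cost_inr)  # 600000.0
```

Under these assumptions, 600,000 minutes of handled calls would cost roughly INR 6 lakh a month, which is the kind of arithmetic an enterprise would weigh against staffing an equivalent human team.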

Meta, which owns WhatsApp and other major social media platforms like Facebook and Instagram, introduced Meta AI to all these platforms. 

Meta AI can be summoned in group chats for planning and suggestions; it can make restaurant recommendations, assist with trip planning, and provide general information.

However, Sarvam AI claims its generative AI stack can help AI scale in India better than others. Its models perform better in Indic languages than the Llama models, which power Meta AI. During the event, the startup demoed its models, which managed to outperform certain models in Indic language tasks.

The startup is currently making its agents available in Hindi, Tamil, Telugu, Malayalam, Punjabi, Odia, Gujarati, Marathi, Kannada, and Bengali, and plans to add more languages soon.

Interestingly, given the backgrounds of the co-founders, especially Raghavan, who has helped Aadhaar scale significantly in India, the startup is well-positioned to drive widespread AI adoption and impact.

Raghavan served as the chief product officer at the Unique Identification Authority of India (UIDAI) for over nine years. As of September 29, 2023, over 138.08 crore Aadhaar numbers had been issued to residents of India.

As part of the interaction, Raghavan highlighted his experience in scaling technology to benefit humanity. He also mentioned that the startup is already in talks with several companies interested in utilising Sarvam agents. At the event, the startup revealed that their agent is already being integrated into the Sri Mandir app. 

(Vivek Raghavan & Pratyush Kumar, co-founders at Sarvam AI)

Models Powering Sarvam Agents

Raghavan said there are multiple models that form the backbone of these AI agents. The first is a speech-to-text model called Saaras which translates spoken Indian languages into English with high accuracy, surpassing traditional ASR systems. 

The second model, called Bulbul, is text-to-speech, offering diverse voices in multiple languages with consistent or varied options depending on preference.

The third is a parsing model designed for high-quality document extraction. This model addresses common issues with complex data, aiming to improve accuracy in parsing financial statements and other intricate documents.
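The first two models naturally chain into a single conversational turn: speech in, reasoning in the middle, speech out. The sketch below shows that loop with stand-in functions; every name and signature here is hypothetical, not Sarvam AI’s actual API.

```python
# Minimal sketch of the voice-agent loop the models above imply:
# speech-to-text -> language model -> text-to-speech.
# All functions are stand-ins; none of these names are a real API.

def speech_to_text(audio: bytes) -> str:
    # stand-in for a Saaras-style ASR model; here the "audio" is already text
    return audio.decode("utf-8")

def plan_reply(transcript: str, context: dict) -> str:
    # stand-in for the LLM deciding the agent's next utterance, using
    # on-screen context the way the article describes context-aware agents
    return f"[{context['screen']}] You asked: {transcript}"

def text_to_speech(text: str) -> bytes:
    # stand-in for a Bulbul-style TTS model
    return text.encode("utf-8")

def handle_turn(audio: bytes, context: dict) -> bytes:
    return text_to_speech(plan_reply(speech_to_text(audio), context))

reply = handle_turn(b"track my order", {"screen": "orders"})
print(reply.decode("utf-8"))  # [orders] You asked: track my order
```

The design point is that the middle stage receives application context alongside the transcript, which is what distinguishes an in-app agent from a cold telephone call.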

Notably, these models are closed-source and available to customers as APIs. However, the startup also launched an open-source, two-billion-parameter foundational model trained from scratch on four trillion tokens.

Less Dramatic but Good Demo

At the event, the startup also demoed what their agents could do. The demo, which was pre-recorded, showcased how a Sarvam agent could comprehend a person’s health condition, assist in finding the right doctor, and even book an appointment.

A pre-recorded demo may not appeal to everyone, but from the startup’s perspective, it’s a safe bet and completely understandable. Live demos carry inherent risks; for instance, at the Made by Google event, one Googler’s attempt to showcase Google Gemini’s capabilities live saw them fail twice before succeeding.

Sarvam AI’s demo was also reminiscent of OpenAI’s showcase of their latest model, GPT-4o, earlier this year. While Sarvam AI’s demo was less dramatic and also not at all controversial, it effectively demonstrated that their agents could understand the context as well as various Indian languages and dialects.

“These agents can also be very contextual. For example, when you’re on a particular page, you press a button seeking more information about a particular item. The agent will be context-aware, so it knows where you’re asking from. In contrast, when you call a number, it starts from scratch without that context,” Raghavan said.

The startup revealed it trained its models using NVIDIA DGX, leveraging Yotta’s infrastructure. Other notable collaborators include Exotel, Bhashini, AI4Bharat, EkStep Foundation and People+ai.

The post AI Agents at INR 1 Per Min Could Really Help Scale AI Adoption in India appeared first on AIM.

]]>
Elon Musk’s Robotaxis are Built for Riders, Not for Drivers https://analyticsindiamag.com/ai-breakthroughs/elon-musks-robotaxis-are-built-for-riders-not-for-drivers/ https://analyticsindiamag.com/ai-breakthroughs/elon-musks-robotaxis-are-built-for-riders-not-for-drivers/#respond Tue, 13 Aug 2024 12:49:58 +0000 https://analyticsindiamag.com/?p=10132590

“We’ll have a fleet that’s on the order of 7 million that are capable of autonomy. In the years to come it will be over 10 million and 20 million. This is immense,” said Elon Musk.

The post Elon Musk’s Robotaxis are Built for Riders, Not for Drivers appeared first on AIM.

]]>

While the initial launch date for Tesla’s self-driving cab Robotaxi has been pushed, Tesla chief Elon Musk remains optimistic. Tesla owners will be able to transform their vehicles into Robotaxis, allowing their cars to generate income, much like an “Airbnb on wheels.”

Robotaxi Vision 

The Robotaxi service aims to let Tesla owners add their vehicles to the cab fleet, an approach a little different from other autonomous cab providers such as Waymo or Baidu. 

“We’ll have a fleet that’s on the order of 7 million that are capable of autonomy. In the years to come it will be over 10 million and 20 million. This is immense,” said Tesla chief Elon Musk. “The car is able to operate 24/7 unlike the human drivers,” he said.  

However, Tesla’s concept of Robotaxi was recently questioned by Uber CEO Dara Khosrowshahi in an interview. “It’s not clear to me that the average person, Tesla owner, or owner of any other car is going to want to have that car be ridden in by a complete stranger,” he said. 

Shared Revenue

Similar to third-party cab providers such as Uber, Tesla is also looking at a shared-revenue format with Robotaxi owners. Musk also highlighted that Robotaxi will give users the luxury of choosing and scheduling their hours of operation, thereby giving Robotaxi owners the choice to use the car as both a personal and a commercial vehicle. 
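A rough sketch of the owner economics behind this shared-revenue, “Airbnb on wheels” idea: the fare, daily utilisation, and revenue split below are all assumptions for illustration, since Tesla has published no such figures.

```python
# Illustrative owner economics for the shared-revenue model.
# Every number here is an assumption, not a Tesla figure.

FARE_USD_PER_MILE = 1     # assumed passenger fare
MILES_PER_DAY = 100       # assumed paid miles while the car would sit idle
DAYS_PER_MONTH = 30
OWNER_SHARE_PCT = 70      # assumed owner share after Tesla's platform cut

gross_monthly_usd = FARE_USD_PER_MILE * MILES_PER_DAY * DAYS_PER_MONTH
owner_monthly_usd = gross_monthly_usd * OWNER_SHARE_PCT // 100
print(gross_monthly_usd)  # 3000
print(owner_monthly_usd)  # 2100
```

Even under these generous assumptions, the payoff depends entirely on the car being in demand during the hours the owner is willing to lend it, which is exactly the supply question raised below.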

Interestingly, Khosrowshahi even questioned the supply side of this particular format. “It just so happens that probably the times at which you’re going to want your Tesla are probably going to be the same times that ridership is going to be at a peak,” he said, hinting that demand and supply will not align.

Furthermore, he is also sceptical about autonomous features in vehicles. “We’re seeing that when one of our customers is offered an autonomous ride, about half the time they say, yeah, that would be really cool, and half the time they say, no thank you, I’d rather have a human. I think that’s going to improve over a period of time,” he said.  

Even then, the Uber CEO has not denied a likely partnership with Tesla in the future. “Hopefully, Tesla will be one of those partners. You never know.” 

With numerous autonomous vehicles on the market, the key distinction lies in the different approaches each company has taken toward autonomous capabilities.

LiDAR vs Vision

Tesla uses computer vision (Tesla Vision) rather than the conventional LiDAR (Light Detection and Ranging) tech for autonomous vehicles. 

Musk has always been vocal about using vision-only methods for autonomous capabilities as opposed to Waymo and other self-driving cars that heavily rely on LiDAR. 

Previously, Musk had even called LiDAR a “fool’s errand” and said that anyone relying on it is “doomed.” He also referred to Waymo’s robotaxi services as limited and fragile, and claimed that Tesla’s systems work anywhere in the world, not limited by geography. 

The cost of LiDAR has been a major deciding factor in choosing sensors for autonomous vehicles. Musk considers LiDAR an expensive and unnecessary sensor. He believes that the cameras Tesla banks on will help its vehicles navigate even adverse weather conditions. 

Kilian Weinberger, professor of computer science at Cornell University, had earlier said that cameras are dirt cheap compared to LiDAR. “By doing this they can put this technology into all the cars they’re selling. If they sell 500,000 cars, all of these cars are driving around collecting data for them,” he said. 
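Weinberger’s point can be made concrete with a quick fleet-cost comparison. The 500,000-car fleet is the figure from his quote; the sensor counts and unit prices below are rough assumptions, not vendor quotes.

```python
# Fleet-scale sensor cost comparison behind the "cameras are dirt cheap"
# argument. Fleet size is from Weinberger's quote; unit prices and the
# camera count are rough assumptions.

FLEET_SIZE = 500_000
CAMERAS_PER_CAR = 8        # assumed vision-only sensor suite
CAMERA_UNIT_USD = 30       # assumed automotive camera cost
LIDAR_UNIT_USD = 1_000     # assumed cost of a single LiDAR unit

vision_fleet_cost = FLEET_SIZE * CAMERAS_PER_CAR * CAMERA_UNIT_USD
lidar_fleet_cost = FLEET_SIZE * LIDAR_UNIT_USD
print(vision_fleet_cost)  # 120000000
print(lidar_fleet_cost)   # 500000000
```

Under these assumptions, even a full eight-camera suite costs a fraction of a single LiDAR per car at fleet scale, which is the economic logic behind shipping cameras in every vehicle sold and harvesting driving data from all of them.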

While Tesla is heavily backing vision tech for AVs, LiDAR is not completely out of the picture. Recently, Tesla purchased over $2 million worth of LiDAR sensors from Luminar, which revealed that Tesla was its largest LiDAR customer in Q1. 

“At some point Tesla will pivot and adopt LiDAR. I think it is not an if question, but rather when,” a user speculated on Reddit. 

Self-Driving Cars on the Rise

Driverless car services are witnessing huge growth. Google’s parent Alphabet recently announced a $5 billion investment in its self-driving subsidiary Waymo. Waymo currently operates in San Francisco, Phoenix, and Los Angeles, and will soon test on the freeways of the San Francisco Bay Area. It was reported that Waymo is currently delivering 50,000 paid rides per week. 

Zoox, a subsidiary of Amazon, is also developing autonomous vehicles and is operational in certain cities in the US. 

Baidu’s autonomous fleet, which already runs 6,000 driverless rides per day in Wuhan, China, adopts a mix of technologies. As of April 2024, the cumulative test mileage of Apollo L4 has exceeded 100 million kilometres.

“I think we are all at L4 today; and with the government regulation, it’s not possible to do L5. Another thing is that, I think, all of us providing this technology haven’t been tested in all the scenarios. We wouldn’t have this confidence to claim that we have the L5 capability,” said Helen K. Pan, general manager and board of directors for Baidu Apollo, California, in an earlier interaction with AIM.

While Waymo and Baidu are at L4 level of autonomous capability, Tesla is still between L2 and L3. 

The Robotaxi unveiling event is currently planned for October 10. Musk is also positive about expanding Tesla’s self-driving technology to a wide market in the US and internationally.

The post Elon Musk’s Robotaxis are Built for Riders, Not for Drivers appeared first on AIM.

]]>
Sarvam AI Launches India’s First Open Source Foundational Model in 10 Indic Languages https://analyticsindiamag.com/ai-breakthroughs/sarvam-ai-launches-indias-first-open-source-foundational-model-in-10-indic-languages/ https://analyticsindiamag.com/ai-breakthroughs/sarvam-ai-launches-indias-first-open-source-foundational-model-in-10-indic-languages/#respond Tue, 13 Aug 2024 10:28:00 +0000 https://analyticsindiamag.com/?p=10132446

Called Sarvam 2B, the model is trained on 4 trillion tokens of an internal dataset.

The post Sarvam AI Launches India’s First Open Source Foundational Model in 10 Indic Languages appeared first on AIM.

]]>

Bengaluru-based AI startup Sarvam AI recently announced the launch of India’s first open-source foundational model, built completely from scratch.

The startup, which raised $41 million last year from the likes of Lightspeed, Peak XV Partners and Khosla Ventures, believes in the concept of sovereign AI: creating AI models tailored to address the specific needs and unique use cases of their country.

The model, called Sarvam 2B, is trained on 4 trillion tokens of data. It can take instructions in 10 Indic languages, including Hindi, Tamil, Telugu, Malayalam, Punjabi, Odia, Gujarati, Marathi, Kannada, and Bengali.

According to Vivek Raghavan, Sarvam 2B is among a class of Small Language Models (SLMs) that includes Microsoft’s Phi series models, Llama 3 8 billion, and Google’s Gemma models.

“This is the first open-source foundational model trained on an internal dataset of 4 trillion tokens by an Indian company, with compute in India, with efficient representation for 10 Indian languages,” Raghavan told AIM in an interaction prior to the announcement.

The model, which will be available on Hugging Face, is well suited for Indic language tasks such as translation, summarisation and understanding colloquial statements. The startup is open-sourcing the model to facilitate further research and development and to support the creation of applications built on it.

Previously, Tech Mahindra introduced its Project Indus foundational model, while Krutrim also developed its own foundational model from scratch. However, neither of these models is open-source.

India’s First Open-Source AudioLM

The startup, which Raghavan co-founded with Pratyush Kumar, also believes that in India, consumers will use generative AI through voice rather than text. At an event held at ITC Gardenia, Bengaluru, on August 13th, the startup announced Shuka 1.0, India’s first open-source audio language model.

The model is an audio extension of the Llama 8B model, supporting Indian-language voice input and text output, and is more accurate than frontier models. 

“The audio serves as the input to the LLM, with audio tokens being the key component here. This approach is notably unique. It’s somewhat similar to what GPT-4o introduced by OpenAI a couple of months ago,” Raghavan said.

According to the startup, the model is 6x faster than Whisper + Llama 3. At the same time, its accuracy across the 10 languages is higher than that of Whisper + Llama 3.

Previously, the startup had hinted extensively at developing a voice-enabled generative AI model. Startups and businesses aiming to incorporate voice experiences into their services can leverage this tool, particularly for Indian languages.

Raghavan also said that the aim is to make the model sound more human-like in the coming months. 

Sarvam Agents are Here

Another interesting development announced by the startup is Sarvam Agents. Raghavan believes that AI’s real use case is not in the form of chatbots but in AI doing things on one’s behalf. 

“Sarvam Agents are going to be voice-based, multilingual agents designed to solve specific business problems. They will be available in three channels– they can be available via telephony, it can be available via WhatsApp, and it can be available inside an app,” Raghavan said.

These agents are also available in 10 Indian languages, and the cost of these voice agents starts at just INR 1/min. They can be deployed by contact centres, enterprise sales teams, and more.

While these agents may sound like existing conversational AI products available in the market, Raghavan said their architecture, which uses multiple in-house developed LLMs, makes them fundamentally different.

“These agents can also be very contextual. For example, when you’re on a particular page, you press a button seeking more information about a particular item. The agent will be context-aware, so it knows where you’re asking from. In contrast, when you call a number, it starts from scratch without that context,” he said.

Sarvam Models APIs

While both Sarvam 2B and Shuka 1.0 are open-source models, Sarvam.ai is also making available a set of closed-source Indic models, used in the creation of Sarvam agents, ready to be consumed as APIs.

“These include five sets of models. I will tell you about the three important ones. Our first model, a speech-to-text model, translates spoken Indian languages into English with high accuracy, surpassing traditional ASR systems. The second model is a text-to-speech model which converts text into speech, offering diverse voices in multiple languages, with consistent or varied options depending on preference,” Raghavan said. 

The third model is a parsing model designed for high-quality document extraction. This model addresses common issues with complex data, aiming to improve accuracy in parsing financial statements and other intricate documents. 

Other announcements made by the startup include a generative AI workbench designed for law practitioners to enhance their capabilities with features such as regulatory chat, document drafting, redaction and data extraction.

The post Sarvam AI Launches India’s First Open Source Foundational Model in 10 Indic Languages appeared first on AIM.

]]>
Is Runway’s Gen-3 Update the First or Last Frame? https://analyticsindiamag.com/ai-breakthroughs/is-runways-gen-3-update-the-first-or-last-frame/ https://analyticsindiamag.com/ai-breakthroughs/is-runways-gen-3-update-the-first-or-last-frame/#respond Tue, 13 Aug 2024 05:56:47 +0000 https://analyticsindiamag.com/?p=10132387

Runway’s new update brings it in direct competition with other players in the space, such as Luma Labs, Pika, OpenAI’s much-anticipated Sora.

The post Is Runway’s Gen-3 Update the First or Last Frame? appeared first on AIM.

]]>

Runway, the US-based AI startup, has taken another significant step in the rapidly evolving field of AI-generated video. The company announced today that its Gen-3 Alpha Image to Video tool now supports using an image as either the first or last frame of video generation, a feature that could dramatically improve creative control for filmmakers, marketers, and content creators.

The startup was founded in 2018 by Cristóbal Valenzuela, Alejandro Matamala, and Anastasis Germanidis.

Furthermore, this update comes after the startup officially released Gen-3 Alpha, highlighting the company’s aggressive push to stay ahead in the competitive AI video generation market. 

The new capabilities of the model allow users to anchor their AI-generated videos with specific imagery, potentially solving one of the key challenges in AI video creation: consistency and predictability.

By allowing users to generate high-quality, ultra-realistic scenes that are up to 10 seconds long—with various camera movements—using only text prompts, still imagery, or pre-recorded footage, this model has set a new benchmark in video creation.

“The ability to create unusual transitions has been one of the most fun and surprising ways we’ve been using Gen-3 Alpha internally,” said Runway co-founder and CTO Anastasis Germanidis.

Back in February 2023, Runway released Gen-1 and Gen-2, the first commercial and publicly available foundational video-to-video and text-to-video generation models accessible via an easy-to-use website. Now the Gen-3 update takes it to the next level.

The Power of First and Last Frames

“Gen-3 Alpha update now supports using an image as either the first or last frame of your video generation. This feature can be used on its own or combined with a text prompt for additional guidance,” Runway announced on X. 

The impact of this feature was immediately recognised by users. Justin Ryan, a digital artist, posted in response: “This is such a big deal! I’m hoping this means we are closer to the First and final frame like Luma Labs offers.”

This development puts Runway in direct competition with other players in the space, such as Luma Labs, Pika, OpenAI’s much-anticipated Sora, and the Bengaluru-based startup Unscript, which is generating videos using single images.

However, Runway’s public availability gives it a significant edge over Sora, which remains in closed testing.

A spokesperson from Runway shared that the initial rollout will support 5 and 10-second video generations, with significantly faster processing times. Specifically, a 5-second clip will take 45 seconds to generate, while a 10-second clip will take 90 seconds.  

Accelerating to Get Ahead 

Since the release of the Gen-3 Alpha model, internet users have been showcasing their unique creations in high-definition videos, demonstrating the versatility and range of Runway AI’s latest AI model.

As Runway makes a bold move, there is a significant shift in the generative AI video space. The company describes this update as “first in a series of models developed by Runway on a new infrastructure designed for large-scale multimodal training,” and a “step toward creating General World Models.”

Germanidis also revealed that Gen-3 Alpha will soon enhance all existing Runway modes and introduce new features with its advanced base model.

He also noted that since Gen-2’s 2023 release, Runway has found that video diffusion models still have significant performance potential and create powerful visual representations. 

While the startup states that Gen-3 Alpha was “trained on new infrastructure” and developed collaboratively by a team of researchers, engineers, and artists, it has not disclosed specific datasets, following the trend of other leading AI media generators that keep details about data sources and licensing confidential.

Interestingly, the company also notes that it has already been “collaborating and partnering with leading entertainment and media organisations to create custom versions of Gen-3,” which “allows for more stylistically controlled and consistent characters, and targets specific artistic and narrative requirements, among other features.”

AI Comes to Filmmaking

Additionally, Runway hosted its second annual AI Film Festival in Los Angeles. To illustrate the event’s growth since its inaugural year, Valenzuela noted that while 300 videos were submitted for consideration last year, this year saw 3,000 submissions.

Hundreds of filmmakers, tech enthusiasts, artists, venture capitalists, and notable figures, including Poker Face star Natasha Lyonne, gathered to watch the 10 finalists selected by the festival’s judges.

Now, the films look different, as does the industry with generative AI.

Meanwhile, amidst all this, it is evident that Runway is not giving up the fight to be a dominant player in the rapidly advancing generative AI video creation space.

The post Is Runway’s Gen-3 Update the First or Last Frame? appeared first on AIM.

]]>
Figure 02 the Most Advanced Humanoid AI Robot https://analyticsindiamag.com/ai-breakthroughs/figure-is-giving-ai-a-body/ https://analyticsindiamag.com/ai-breakthroughs/figure-is-giving-ai-a-body/#respond Fri, 09 Aug 2024 14:31:11 +0000 https://analyticsindiamag.com/?p=10132034

Figure 02 has a sleeker format than its predecessor and the founder considers it the world’s most advanced AI.

The post Figure 02 the Most Advanced Humanoid AI Robot appeared first on AIM.

]]>

Figure Robotics revealed the uber-cool newest version of its humanoid, Figure 02, a few days ago. The company’s founder and CEO, Brett Adcock, released a demo video showcasing the robot’s dexterity and movements, predominantly its capabilities in a BMW Group Plant Spartanburg trial run. 

“Figure is giving artificial intelligence a body,” said Adcock.   

AI Gets a Body

Standing tall at 5’6” and weighing 70 kg, Figure 02 is sleeker than its predecessor, with the most prominent feature being the absence of external cables; the cabling is now integrated into the limbs, facilitating testing and repairs. 

In terms of computational and AI capabilities, Figure 02 surpasses Figure 01 with three times the computational power, enabling autonomous AI tasks such as speech-to-speech interaction and visual reasoning. 

Additionally, it is equipped with six RGB cameras and an onboard vision language model, significantly improving its ability to perceive and interact with the physical world.

It’s All About the Hands

The enhanced hand dexterity of Figure 02 with 16 degrees of freedom is an impressive feat, considering that hand movements are one of the most complex parts of a humanoid manufacturing process. 

Adcock said in a post that five-fingered hands are crucial for general robotics, but traditional roboticists often hesitate due to the complexity involved. 

Given the minimal progress outside of prosthetics, the team at Figure had to design the hands from the ground up, covering everything from the mechanical structure to sensors, electronics, wiring, and control systems. 

When compared to the Figure 01, an additional degree of freedom has been added to the thumb. The bulk has been reduced from the wrist, while the finger actuators remain within the palm.

Emphasising the importance of robotics arms, in an earlier interaction with AIM, Bengaluru-based CynLr Robotics founder and CEO Gokul NA said, “Wheels are more than enough [for robots in warehouses], but you need more capability with the hands.”

Big Tech Powering Figure 02

Figure 02 could be a perfect example of embodied AI or, as Jensen Huang calls it, physical AI. Referring to it as the world’s most advanced AI, Adcock is powering Figure with the help of some of the biggest tech players in the world. 

OpenAI, Microsoft, and NVIDIA have been the prominent three powering both the hardware and software side of Figure.

Figure 02 has been built using NVIDIA’s Isaac Sim for synthetic data and generative AI model training. NVIDIA serves as a critical member in enhancing simulation, training, and inference through full-stack accelerated systems, libraries, and foundation models.

Similarly, Microsoft has allocated H100 NDs for large model training, and OpenAI has been instrumental in fine-tuning custom multimodal models on humanoid robot data. 

Along with Figure’s humanoid, many companies are already using these robots at automobile and industrial plants. Interestingly, Tesla’s Optimus Gen-2 version was released in December last year. 

Funnily, Adcock took a dig at Elon Musk after Figure 02’s launch, positioning it as a challenge to existing humanoids. At the same time, demo videos comparing Optimus Gen 2 and Figure 02 were doing the rounds, given the similarities between the two.

While the friendly banter continues, big tech companies continue to focus on developing humanoids.  

The post Figure 02 the Most Advanced Humanoid AI Robot appeared first on AIM.

]]>
Synthetic Data Generation in Simulation is Keeping ML for Science Exciting https://analyticsindiamag.com/ai-breakthroughs/synthetic-data-generation-in-simulation-is-keeping-ml-for-science-exciting/ https://analyticsindiamag.com/ai-breakthroughs/synthetic-data-generation-in-simulation-is-keeping-ml-for-science-exciting/#respond Fri, 09 Aug 2024 09:38:25 +0000 https://analyticsindiamag.com/?p=10131981

Simulations allow researchers to generate vast amounts of synthetic data, which can be critical when real-world data is scarce, expensive, or challenging to obtain.

The post Synthetic Data Generation in Simulation is Keeping ML for Science Exciting appeared first on AIM.

]]>

If only AI could create infinite streams of training data, we wouldn’t have to deal with the problem of not having enough of it. This scarcity is what keeps a lot of things undiscoverable in the field of science, as only a limited amount of data is available for training.

This is where AI is taking up a crucial role with the help of simulation. The integration of data generation through simulation is rapidly becoming a cornerstone in the field of ML, especially in science. This approach not only holds promise but is also reigniting enthusiasm among researchers and technologists. 

As Yann LeCun pointed out, “Data generation through simulation is one reason why the whole idea of ML for science is so exciting.”

Simulations allow researchers to generate vast amounts of synthetic data, which can be critical when real-world data is scarce, expensive, or challenging to obtain. For instance, in fields like aerodynamics or robotics, simulations enable the exploration of scenarios that would be impossible to test physically.
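A toy version of this idea fits in a few lines: use a known physical law as the “simulator” to synthesise noisy training pairs, then fit a model that recovers the law’s parameter from the synthetic data alone. The setup below is purely illustrative and not drawn from any of the work cited here.

```python
import random

# Toy "simulator": distance fallen under gravity after t seconds,
# with a little sensor noise added.
G = 9.81  # m/s^2

def simulate_fall(t: float) -> float:
    return 0.5 * G * t * t + random.gauss(0, 0.01)

random.seed(0)
# Generate 100 synthetic (time, distance) training pairs from the simulator.
data = [(i / 10, simulate_fall(i / 10)) for i in range(1, 101)]

# Fit the model d = a * t^2 by least squares: a = sum(d*t^2) / sum(t^4).
a = sum(d * t * t for t, d in data) / sum(t ** 4 for t, _ in data)
print(round(2 * a, 2))  # the fit recovers g, i.e. roughly 9.81
```

The same pattern scales up: as long as the simulator encodes the governing rules, it can emit as many labelled examples as the learner needs, with no real-world measurement campaign.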

Richard Socher, the CEO of You.com, highlighted that while there are challenges, such as the combinatorial explosion in complex systems, simulations offer a pathway to manage and explore these complexities. 

Synthetic Data is All You Need?

This is similar to what Anthropic chief Dario Amodei said about producing quality data synthetically: that it sounds feasible to create an infinite data generation engine that can help build better AI systems. 

“If you do it right, with just a little bit of additional information, I think it may be possible to get an infinite data generation engine,” said Amodei, while discussing the challenges and potential of using synthetic data to train AI models.

“We are working on several methods for developing synthetic data. These are ideas where you can take real data present in the model and have the model interact with it in some way to produce additional or different data,” explained Amodei. 

Taking the example of AlphaGo, Amodei said that those little rules of Go, the little additional piece of information, are enough to take the model from “no ability at all to smarter than the best human at Go”. He noted that the model there just trains against itself with nothing other than the rules of Go to adjudicate.

Similarly, OpenAI is a big proponent of synthetic data. The former team of Ilya Sutskever and Andrej Karpathy has been a significant force in leveraging synthetic data to build AI models. 

The development at OpenAI is testimony to the growth of generative AI across the entire ecosystem, though not everyone agrees that it will be able to achieve AGI with the current methodology of model training. Likewise, Microsoft is also researching in this direction; its Textbooks Are All You Need research is a testament to the power of synthetic data.

Google’s AlphaFold, which is spearheading protein structure prediction and design for drug discovery, can also benefit immensely from synthetic data. At the same time, it can be scary to use this data in a field as sensitive as science.

Synthetic Data is Too Synthetic

However, the potential of simulations extends beyond mere data generation. Giuseppe Carleo, another expert in the field, emphasised that the most exciting aspect is not just fitting an ML model to data generated by an existing simulator. 

Instead, true innovation lies in training ML models to become advanced simulators themselves—models that can simulate systems beyond the capabilities of traditional methods, all while remaining consistent with the laws of physics.

This is becoming possible with synthetic data generated by agentic AI models, which are on the rise in the field of AI. Models that can test, train, and fine-tune themselves using the data they created are an exciting prospect for the future of AI research. 

Moreover, the discussion around simulations also touches on broader applications. Sina Shahandeh, a researcher in the field of biotechnology, for example, suggested that the ultimate simulation could model entire economies using an agent-based approach, a concept that is slowly becoming feasible.

Despite the excitement, the field is not without its sceptics. Stephan Hoyer, a researcher with a cautious outlook on AGI, pointed out that simulating complex biological systems to the extent that training data becomes unnecessary would require groundbreaking advancements. 

He believes this task is far more challenging than achieving AGI. Similarly, Jim Fan, senior AI scientist at NVIDIA, said that while synthetic data is expected to have a noteworthy role, blind scaling alone will not suffice to reach AGI.

When it comes to science, using synthetic data can be tricky. But generating it in simulation shows promise, as it can be tried and tested without being deployed in real-world applications. Besides, the possibility of an effectively infinite supply is what keeps ML exciting for researchers.

The post Synthetic Data Generation in Simulation is Keeping ML for Science Exciting appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/synthetic-data-generation-in-simulation-is-keeping-ml-for-science-exciting/feed/ 0
Telangana Becomes India’s First State to Develop its Own AI Model https://analyticsindiamag.com/ai-breakthroughs/telangana-becomes-the-first-state-in-india-to-build-its-own-ai-model/ https://analyticsindiamag.com/ai-breakthroughs/telangana-becomes-the-first-state-in-india-to-build-its-own-ai-model/#respond Fri, 02 Aug 2024 09:30:00 +0000 https://analyticsindiamag.com/?p=10131223

In July, the Information Technology, Electronics & Communications (ITE&C) department in Telangana hosted a datathon aimed at creating Telugu LLM datasets.

The post Telangana Becomes India’s First State to Develop its Own AI Model appeared first on AIM.

]]>

Lately, there has been a debate in India’s AI ecosystem about whether the country should build its own foundational models. Some argue that India can address real-world problems by leveraging existing state-of-the-art models without spending millions on new ones. 

Others, however, believe that it’s essential to develop models that profoundly understand the nuances, complexities, and rich diversity inherent in India’s myriad cultures and languages.

Amidst all this, an Indian state government has undertaken the task of developing an LLM that operates in the state’s official language.

In July, the Information Technology, Electronics & Communications (ITE&C) department in Telangana hosted a datathon aimed at creating a Telugu LLM.

Carried out in partnership with Swecha, a non-profit and free open-software movement in Telangana, the datathon was organised to help build datasets which, in turn, will help train the Telugu LLM.

Building Telugu Datasets

Building effective LLMs for Indian languages remains a challenging task due to the scarcity of high-quality data. While ChatGPT is impressive because it is trained on multiple terabytes of data, such extensive datasets are not available for Indian languages.

To develop datasets in Telugu, the Telangana government is tapping into its rich education system. Around 1 lakh undergraduate students across all engineering colleges in Telangana took part in the datathon and collected data from ordinary citizens who use Telugu as their mother tongue. 

The team collected data from oral sources such as folk tales, songs, local histories, and information about food and cuisine. Additionally, they plan to dispatch volunteers to approximately 7,000 villages across the state to gather audio and video samples of people discussing various topics, which will then be converted into training content.

(Source: @nutanc)

Interestingly, this is not the first instance when such an exercise was undertaken. Last year, the same Swecha team developed a Telugu SLM, named ‘AI Chandamama Kathalu’, from scratch. 

To collect data for the model, a similar datathon was organised with volunteers from Swecha, in collaboration with nearly 25-30 colleges. Over 10,000 students participated in translating, correcting, and digitising 40,000-45,000 pages of Telugu folk tales.

Building LLMs

Ozonetel, which is an industry partner for the project along with DigiQuanta and TechVedika, supported it by training the model and providing the necessary compute. 

The team tried fine-tuning Google’s open-source mT5 model, Meta’s Llama, and Mistral. However, they finally settled on building a model similar to GPT-2 from scratch. Training the model on a cluster of NVIDIA’s A100 GPUs took nearly a week.
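As a rough sanity check on the scale involved, a GPT-2-style model's parameter count and training compute can be estimated with two standard back-of-the-envelope formulas. This is our sketch, not the Swecha team's numbers; their exact model configuration and token counts are not public.

```python
# Back-of-the-envelope estimates for a GPT-2-style decoder-only transformer:
# parameter count from the architecture, and the common ~6ND rule of thumb
# for total training FLOPs (N = parameters, D = training tokens).

def gpt2_style_params(n_layer, d_model, vocab_size, n_ctx=1024):
    """Approximate parameter count, ignoring biases and layernorm weights."""
    embeddings = vocab_size * d_model + n_ctx * d_model   # token + position tables
    per_block = 12 * d_model ** 2                         # attention (~4 d^2) + MLP (~8 d^2)
    return embeddings + n_layer * per_block

def train_flops(params, tokens):
    """Common ~6 * N * D estimate of total training compute in FLOPs."""
    return 6 * params * tokens

# GPT-2 "small" configuration: 12 layers, d_model 768, ~50k-token vocabulary.
small = gpt2_style_params(n_layer=12, d_model=768, vocab_size=50257)
```

Plugging in GPT-2 small's configuration lands near its well-known ~124M parameters, which makes clear why a model of this class is trainable on a single A100 cluster in about a week rather than needing frontier-scale budgets.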

Now, the aim is to develop a larger model and have it ready to be showcased at the Telangana govt’s Global AI Summit, scheduled to take place in September this year.

Developing a large model could cost millions of dollars; building something like ChatGPT could run into the billions. The team, however, aims to develop the Telugu LLM for around INR 5-10 lakh.

India’s Efforts to Build AI Models in Regional Languages 

Over time, we have seen efforts to build AI models in regional languages. For instance, Abhinand Balachandran, assistant manager at EXL, released a Telugu version of Meta’s open-source LLama 2 model.

Similarly, in April this year, a freelance data scientist released Nandi, built on top of Zephyr-7B-Gemma. The model boasts 7 billion parameters and is trained on Telugu Q&A datasets curated by Telugu LLM Labs.

Interestingly, such models have not been limited to Telugu. We have seen AI models built on top of open-source models such as Tamil Llama and Marathi Llama, among others.

These models, however, could be seen as mere experiments. The Telangana government’s effort to develop an AI model in Telugu, by contrast, has the potential to make significant strides in advancing regional language technology and preserving cultural heritage.

Officials involved in the project have told the media that voice commands from devices such as Alexa are not available in Telugu, and that this platform will pave the way for such innovations. 

The post Telangana Becomes India’s First State to Develop its Own AI Model appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/telangana-becomes-the-first-state-in-india-to-build-its-own-ai-model/feed/ 0
Kuku FM is Using Generative AI to Make Everyone a Full-Stack Creative Producer https://analyticsindiamag.com/intellectual-ai-discussions/kuku-fm-is-using-generative-ai-to-make-everyone-a-full-stack-creative-producer/ https://analyticsindiamag.com/intellectual-ai-discussions/kuku-fm-is-using-generative-ai-to-make-everyone-a-full-stack-creative-producer/#respond Fri, 02 Aug 2024 06:30:00 +0000 https://analyticsindiamag.com/?p=10131210

"AI is going to be commoditised; everybody will have access to the tools. What will remain crucial is the talent pool you have – the storytellers."

The post Kuku FM is Using Generative AI to Make Everyone a Full-Stack Creative Producer appeared first on AIM.

]]>

Kuku FM, a popular audio content platform backed by Google and Nandan Nilekani’s Fundamentum Partnership, is harnessing the power of generative AI to revolutionise how stories are created, produced, and consumed. This transformation is spearheaded by Kunj Sanghvi, the VP of content at Kuku FM, who told AIM that generative AI is part of their everyday work and content creation.

“On the generative AI side, we are working pretty much on every layer of the process involved,” Sanghvi explained. “Right from adapting stories in the Indian context, to writing the script and dialogues, we are trying out AI to do all of these. Now, in different languages, we are at different levels of success, but in English, our entire process has moved to AI.”

Kuku FM is leveraging AI not just for content creation but for voice production as well. The company uses Eleven Labs, ChatGPT APIs, and other available offerings to produce voices directly.

“Dramatic voice is a particularly specific and difficult challenge, and long-form voice is also a difficult challenge. These are two things that most platforms working in this space haven’t been able to solve,” Sanghvi noted. 

Long-form content is also moving to generative AI: Kuku FM uses it for thumbnail generation, visual asset generation, and description generation, and Sanghvi said the team has custom GPTs for every process.

Compensating Artists

AI is playing a crucial role in ensuring high-quality outputs across various languages and formats. “In languages like Hindi and Tamil, the quality is decent, but for others like Telugu, Kannada, Malayalam, Bangla, and Marathi, the output quality is still poor,” said Sanghvi. 

However, the quality improves every week. “We put out a few episodes even in languages where we’re not happy with the quality to keep experimenting and improving,” Sanghvi added.

Beyond content creation, AI is helping Kuku FM in comprehensively generating and analysing metadata. “We have used AI to generate over 500 types of metadata on each of our content. AI itself identifies these attributes, and at an aggregate level, we can understand what makes certain content perform better than others,” he mentioned.

One of the most transformative aspects of Kuku FM’s use of AI is its impact on creators. The platform is in the process of empowering 5,000 creators to become full-stack creative producers. 

“As the generative AI tools become better, every individual is going to become a full-stack creator. They can make choices on the visuals, sounds, language, and copy, using AI as a co-pilot,” Sanghvi said. “We are training people to become creative producers who can own their content from start to end.”

When asked about the competitive landscape such as Amazon’s Audible or PocketFM, and future plans, Sanghvi emphasised that AI should not be viewed as a moat but as a platform. “Every company of our size, not just our immediate competition, will use AI as a great enabler. AI is going to be commoditised; everybody will have access to the tools. What will remain crucial is the talent pool you have – the storytellers,” he explained.

Everyone’s a Storyteller with AI

In a unique experiment blending generative AI tools, former OpenAI co-founder Andrej Karpathy used the Wall Street Journal’s front page to produce a music video on August 1, 2024. 

Karpathy copied the entire front page of the newspaper into Claude, which generated multiple scenes and provided visual descriptions for each. These descriptions were then fed into Ideogram AI, an image-generation tool, to create corresponding visuals. Next, the generated images were uploaded into RunwayML’s Gen 3 Alpha to make a 10-second video segment.
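The chain described above can be sketched as a simple fan-out pipeline: one text source becomes scene descriptions, each description becomes an image, and each image becomes a short clip. The three generate_* helpers below are hypothetical stubs standing in for the Claude, Ideogram, and RunwayML steps (which are driven through those services' own interfaces); only the orchestration shape is the point.

```python
# Schematic re-creation (ours, not Karpathy's actual workflow code) of the
# text -> scenes -> images -> clips pipeline. All helpers are stand-in stubs.

def generate_scenes(front_page_text):
    # Stand-in for the LLM step: turn the source text into per-scene descriptions.
    return [f"Scene {i + 1}: {line.strip()}"
            for i, line in enumerate(front_page_text.splitlines()) if line.strip()]

def generate_image(scene_description):
    # Stand-in for the text-to-image step.
    return {"scene": scene_description,
            "image": f"image_for({scene_description!r})"}

def generate_clip(image, seconds=10):
    # Stand-in for the image-to-video step (10-second segments, as described).
    return {"image": image["image"], "duration_s": seconds}

def music_video_pipeline(source_text):
    scenes = generate_scenes(source_text)
    images = [generate_image(s) for s in scenes]
    return [generate_clip(img) for img in images]

clips = music_video_pipeline(
    "Markets rally on AI optimism\nChip makers hit record highs")
```

Each stage only consumes the previous stage's output, which is what makes this kind of tool-chaining possible with no shared infrastructure between the services.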

Sanghvi also touched upon the possibility of edge applications of AI, like generating audiobooks in one’s voice. “These are nice bells and whistles but are not scalable applications of AI. However, they can dial up engagement as fresh experiments,” he said.

Kuku FM is also venturing into new formats like video and comics, generated entirely through AI. He said that the team is not going for shoots or designing characters in studios. “Our in-house team works with AI to create unique content for video, tunes, and comics,” he revealed.

Sanghvi believes that Kuku FM is turning blockbuster storytelling into a science, making it more accessible and understandable. “The insights and structure of a story can now look like the structure of a product flow, thanks to AI,” Sanghvi remarked. 

“This democratises storytelling, making every individual a potential storyteller.” As Sanghvi aptly puts it, “The only job that will remain is that of a creative producer, finding fresh ways to engage audiences, as AI will always be biased towards the past.”

The post Kuku FM is Using Generative AI to Make Everyone a Full-Stack Creative Producer appeared first on AIM.

]]>
https://analyticsindiamag.com/intellectual-ai-discussions/kuku-fm-is-using-generative-ai-to-make-everyone-a-full-stack-creative-producer/feed/ 0
The Hidden Risks in Open-Source AI Models https://analyticsindiamag.com/ai-breakthroughs/the-hidden-risks-in-open-source-ai-models/ https://analyticsindiamag.com/ai-breakthroughs/the-hidden-risks-in-open-source-ai-models/#respond Wed, 31 Jul 2024 12:36:20 +0000 https://analyticsindiamag.com/?p=10130950

“If you ever thought that popular packages are safe, not necessarily. Attackers focus on those assets to deliver immediate attacks,” said Jossef Kadouri of CheckMarx.

The post The Hidden Risks in Open-Source AI Models appeared first on AIM.

]]>

Recently, Meta chief Mark Zuckerberg, a crusader for open-source models, emphasised the safety aspects of such models. “There is an ongoing debate about the safety of open-source AI models, and my view is that open-source AI will be safer than its alternatives. I think governments will conclude it’s in their interest to support open source because it will make the world safer and more prosperous,” he said.  

However, open-source platforms are not immune to data threats. With the rise in cybersecurity incidents, with over 343 million victims reported just last year, the focus is back on data security, especially with AI in the picture. 

“One of the biggest challenges today is the new field of AI and finding how attackers are going to use AI to attack us. We take a step, the attackers take a step. It’s a never-ending game and the gap is minimising. I can’t predict what’s going to happen next year because this technology is developing exponentially,” said Jossef Harush Kadouri, the head of supply chain security at CheckMarx, in an exclusive interaction with AIM at the recent Accel Cybersecurity Summit. 

Cyber Attacks on Open Platforms

Kadouri, who works out of Tel Aviv, Israel, is in charge of protecting enterprises against software supply chain attacks. He has previously served in the Israel Defense Force’s cybersecurity wing for over four years. He currently ranks in the top 1% of users on Stack Overflow. 

Alluding to how high ratings do not guarantee that packages are safe from malicious attacks, even on platforms such as GitHub and Hugging Face, Kadouri said, “Now, if you ever thought that popular packages are safer, not necessarily. Attackers focus on those assets to deliver immediate attacks.” He was emphasising typosquatting, where Python developers are targeted through registered, misspelled versions of popular packages such as Selenium. 

“Once we investigated the actual code executed in this malicious package, this code was highly cryptic, obfuscated, hard to read and understand, and it’s executed upon installation,” warned Kadouri. Over 900 packages containing obfuscated code that executes upon installation were revealed. 
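One mechanical first line of defence against the typosquatting Kadouri describes is to compare each declared dependency against a list of popular package names and flag near-misses. The sketch below is illustrative only (not CheckMarx's tooling), and the package lists are made up for the example.

```python
# Minimal typosquat detector: flag dependency names that sit within edit
# distance 1 of a well-known package name but are not that name themselves
# (the classic "selenim" vs "selenium" pattern).

def edit_distance(a, b):
    """Plain Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,           # deletion
                           cur[-1] + 1,           # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Hypothetical allow-list; a real tool would use download-ranked PyPI names.
POPULAR = {"selenium", "requests", "numpy", "pandas"}

def flag_typosquats(dependencies):
    """Return deps that are one edit away from a popular name, but not the name."""
    return [d for d in dependencies
            if d not in POPULAR
            and any(edit_distance(d, p) == 1 for p in POPULAR)]

suspicious = flag_typosquats(["selenim", "requests", "numpyy", "flask"])
```

A check like this catches only the crudest squats; it does nothing against the obfuscated install-time payloads described above, which is why supply chain scanners also inspect what a package actually executes on installation.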

Hugging Face is the Disneyland of Open-Source Models

Popular developer platforms such as GitHub and Hugging Face have not been free of threats either. Though Hugging Face is taking active measures to prevent backdoor threats, the platform remains susceptible to model-based attacks. 

“Hugging Face is like the Disneyland of LLM, open-source models, and pre-trained models,” said Kadouri.

Various forms of cyber attacks are continuously taking place on these platforms. Malicious browser extensions are another common route through which hackers syphon off money. 

In the context of cryptocurrency transfers, users typically copy and paste wallet addresses to avoid errors. It was revealed that a malicious browser extension could alter the copied address, potentially redirecting crypto funds to a different wallet. 

“This is how sophisticated the attacks are. You would only realise it once it’s too late. You can’t undo a crypto transaction,” said Kadouri. 

Cyber Threat Awareness

With rising cyber attack cases, awareness is one of the biggest needs of the hour. “I think what we need to do is educate and raise awareness that we have bad guys operating in this attack surface,” said Kadouri, who supports open platforms that enable developers to build products but remains wary of the apparent risks. 

“It’s a good thing to do, but they [platforms] also don’t vet the content they host. So this is why we need to stay alert from things that may look legitimate, but are not. Because, anyone can contribute fresh new content to open source and disguise it as something that is well worth it,” he said.  

Interestingly, Rahul Sasi, the co-founder and CEO of CloudSEK, an AI-powered digital risk management enterprise, reflected similar sentiments. 

Speaking about a recent Indian telecom operator whose user data was hacked (without naming names), Sasi said that companies often don’t acknowledge such breaches, a problem that hinders cybersecurity awareness. 

“I mean, the problem with this company is that they also don’t understand. Or many times the security teams understand, but then there is high pressure on the top management not to accept it,” said Sasi, in an exclusive interaction with AIM

“Things have improved in the last 10 years. But, it hasn’t reached where it should. In my perspective, maybe in another 10 years it hopefully will. The media also has a role to play here. If you try to blame somebody, they’ll always try to defend,” he said. 

With AI in the cybersecurity scene, optimism runs high. Kadouri, however, isn’t entirely convinced. 

AI in Cyberattacks  

Speaking about how AI has added to cyberattacks through deep-fake technology, for instance, Kadouri still “wants to believe” that AI might be a problem-solver too. 

“I can definitely see AI helping us defenders do our jobs better, reduce manual labour and automate things. But, if they’re [cyber attackers] so good at fooling human beings, they’re probably going to be good at fooling AI too,” he said. 

“I mean, time will tell,” he concluded. 

The post The Hidden Risks in Open-Source AI Models appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/the-hidden-risks-in-open-source-ai-models/feed/ 0
Kwai’s Kling vs OpenAI Sora  https://analyticsindiamag.com/ai-breakthroughs/kwais-kling-vs-openai-sora/ https://analyticsindiamag.com/ai-breakthroughs/kwais-kling-vs-openai-sora/#respond Wed, 31 Jul 2024 11:53:14 +0000 https://analyticsindiamag.com/?p=10130890

Given that OpenAI only grants a limited number of select creators access to Sora, Kling AI might just be the top choice.

The post Kwai’s Kling vs OpenAI Sora  appeared first on AIM.

]]>

Kuaishou Technology, a Chinese AI and technology company, launched a new text-to-video model called Kling this year. Su Hua, a Chinese billionaire internet entrepreneur, is the co-founder and CEO of the video platform Kuaishou, known outside of China as Kwai.

Several AI enthusiasts shared their creations from Kling on X that captured the hearts of internet users worldwide. A series of animals and objects were featured enjoying a meal of noodles. From a panda munching on a bowl of ramen to a kangaroo slurping up some udon, the videos are both hilarious and heartwarming. 

https://twitter.com/_akhaliq/status/1812573686152450397

A few others include blueberries turning into puppies and a tray of apples turning into guinea pigs, which mess with your head.

The level of detail and realism in the videos is a testament to the capabilities of Kling and the progress made in the field of AI.

Known as the creator of a TikTok competitor, Kuaishou joined the race with other Chinese tech companies to rival OpenAI’s Sora. 

With simple text prompts, it can generate highly realistic videos in 1080p high-definition resolution. The videos can be up to two minutes long. Sora, on the other hand, makes 60-second videos from text prompts.

Kling boasts the ability to produce realistic motions on a large scale, simulate the attributes of the physical world, and weave concepts and imagination together, setting a new benchmark in AI-powered video creation. 

However, as impressive as Kling AI may be, its accessibility is primarily limited to a few users even though the company claimed it would be available worldwide. This poses significant challenges for its global adoption.

For some users who were looking forward to accessing its offerings, this situation may feel like a huge letdown. 

A worldwide release creates expectations of inclusivity and accessibility; when these go unmet, it can harm the company’s reputation. 

Currently, its access is mostly limited to invited beta testers, with some users in China able to experience a limited demo version through the Kuaishou app, as claimed by ChatGPT on Quora. 

At a time when the US is heavily debating AI ethics and incorporating ‘Responsible AI’, China seems unperturbed and is likely responding to these AI ethicists with a Kling. 

The AI company hit the headlines recently by announcing the global launch of its International Version 1.0, a platform designed to revolutionise industries worldwide. This milestone release features advanced machine learning, multilingual support, and enhanced data analytics, promising unparalleled efficiency and innovation across sectors. 

AI Video Generator War Begins! 

While systems like OpenAI’s Sora and Kuaishou’s Kling have showcased impressive capabilities, they remain accessible only to a select group of users. Similarly, Luma AI’s Dream Machine also boasts remarkable features but is limited to a restricted audience.

Interestingly, Kuaishou’s AI tool entered the market shortly after Vidu AI, another Chinese text-to-video AI model known for producing HD 1080p 16-second videos.

This model’s launch coincides with a flurry of activity in the generative AI sector, as startups and tech giants compete to develop advanced tools that create realistic images, audio, and video from text inputs.

Kling has a user-friendly interface that supports both text-to-video and text-to-image generation. 

Unlike Runway, Haiper, and Luma Labs, it accepts prompts of up to 2,000 characters, enabling highly detailed descriptions. It performs better with lengthy, well-crafted prompts.

This cutting-edge AI model employs variable resolution training, enabling users to produce videos in various aspect ratios. Remarkably, it can showcase full expression and limb movement from a single full-body image. 

AI video creation seems to be the next battleground for tech companies, with contenders like OpenAI’s Sora, Microsoft’s VASA-1, Adobe’s Firefly, Midjourney, and Pika Labs already in the game. 

Furthermore, Google recently introduced Veo, a new text-to-video AI model, at Google I/O to compete with OpenAI’s Sora. Veo improves on previous models, offering consistent, high-quality 1080p videos over a minute long.

While some were impressed with Veo’s capabilities, others argue that it may not exactly be state-of-the-art in its latency or abilities compared to Sora.

Now that Kling is here, the benchmark of making cinematically impressive and real-world-like videos has gone up. 

Why is Kling a big deal?

This month, Runway introduced Gen-3, which offers enhanced realism and the ability to generate 10-second clips. Last month, Luma Labs unveiled the impressive Dream Machine. 

These new model updates were initially spurred by the release of Sora earlier this year, which remains the benchmark for AI video generation. Recently, a series of short films on YouTube showcased Sora’s full potential. Additionally, Kling played a significant role in the wave of updates.

It also adopts a unique approach to AI by incorporating generative 3D in its creation process. It provides Sora-level scene changes, clip lengths, and video resolution. Given that OpenAI only grants a limited number of select creators access to Sora, Kling AI might just be the top choice for now.

Capabilities of Kling AI

Kling AI is accessible via the Kuaishou app, available on both iOS and Android platforms. This mobile app puts Kling AI’s advanced video generation capabilities directly at users’ fingertips, enabling them to create high-quality, realistic videos from their smartphones.

For users outside China, accessing Kling AI often requires navigating around these barriers. Some have resorted to emailing Kuaishou directly to request access, explaining their interest in becoming beta testers. 

The competitive landscape is evolving, but the restrictions on access can hinder Kling’s ability to gain traction outside China.

Chinese attempts to lure domestic developers away from OpenAI – considered the market leader in generative AI – will now be a lot easier, after OpenAI notified its users in China that they would be blocked from using its tools and services. 

“We are taking additional steps to block API traffic from regions where we do not support access to OpenAI’s services,” said an OpenAI spokesperson.

OpenAI has not elaborated on the reason for its sudden decision. 

ChatGPT is already blocked in China by the government’s firewall, but until this week developers could use virtual private networks to access OpenAI’s tools in order to fine-tune their own generative AI applications and benchmark their own research. Now the block is coming from the US side.

The OpenAI move has “caused significant concern within China’s AI community”, said Xiaohu Zhu, the founder of the Shanghai-based Centre for Safe AGI, which promotes AI safety, not least because “the decision raises questions about equitable access to AI technologies globally”.

The post Kwai’s Kling vs OpenAI Sora  appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/kwais-kling-vs-openai-sora/feed/ 0
Accidental Super Apps https://analyticsindiamag.com/ai-breakthroughs/accidental-super-apps/ https://analyticsindiamag.com/ai-breakthroughs/accidental-super-apps/#respond Wed, 31 Jul 2024 04:30:00 +0000 https://analyticsindiamag.com/?p=10130784

If you think Zomato and Swiggy are just about delivering food or booking tables at fancy restaurants, think again.

The post Accidental Super Apps appeared first on AIM.

]]>

When asked about our favourite apps, most of us would name Zomato, Swiggy, or Zepto. And it’s pretty obvious why. Well, craving food when stressing over the looming possibility of a 14-hour workday is completely justified. 

But if you think Zomato and Swiggy are just about delivering food or booking tables at fancy restaurants, it’s time to rethink.

A YouTuber from the channel Full Disclosure recently interviewed food delivery riders and revealed that some of these riders earn between INR 40,000 and INR 50,000 per month, which is more than the average income of many IT professionals.

One rider even mentioned that he managed to save INR 2 lakh in just six months.

In another interesting scenario, Venkatesh Gupta, a techie, shared a strange encounter on X: he met a senior Microsoft engineer in Bengaluru who drives an auto rickshaw on weekends to fend off loneliness.

Every Social Media Platform is an eCommerce Store

Most social media sites, including Facebook, Twitter, Instagram, and WhatsApp, are striving to become e-commerce platforms. 

Every platform you visit is working hard to enable you to buy things without leaving their site. TikTok, for example, changed its ‘storefront’ to ‘shop’ so you can make purchases directly on the app. No more jumping to other websites.

This isn’t just TikTok. YouTube allows creators to sell products in their videos. Pinterest has ‘buyable pins’ so you can shop without leaving the site. Facebook has stores, even on Messenger. Google is also in the game with Google Business.

WhatsApp was launched as a one-to-one chat app service in February 2009 and now offers Business Accounts with payment options. 

WhatsApp for e-commerce allows customers to complete the entire transaction on one app without needing to shift platforms. This can be done by integrating the product catalogue with WhatsApp Business. 

Customers can then browse products, view prices, and make purchases directly within the WhatsApp interface. 

Unexpected Chat Apps

In recent years, many applications initially designed for specific purposes have added messaging capabilities. When you break up with your boyfriend and think you’ve blocked him everywhere, including Instagram, WhatsApp, and Facebook, he can still text you on GPay! 

Instagram, originally a photo-sharing app, now offers direct messaging (DM) to allow users to chat, share media, and even engage in group conversations. It also functions as an app that can make you doubt yourself and give you an inferiority complex (maybe). 

Coming back to Google Pay: initially just a payments app, it now includes messaging features to facilitate communication around transactions, share payment details, and more.

There’s also Spotify, which allows users to share songs and playlists through integrated messaging services.

Super Apps on the Rise

In 2022, the global super apps market was valued at a staggering $61.30 billion, with a growth trajectory of 27.8% expected from 2023 to 2030. Gartner’s survey reveals that the top 15 super apps have been downloaded over 4.6 billion times worldwide, boasting 2.68 billion monthly active users.

By 2050, it is anticipated that more than half of the global population will be using super apps. Well, the key to their widespread adoption lies in the mobile-first market, where smartphones are the primary connected devices. 

Another major contributor is their integration of financial and payment services. Paytm in India, Grab in Singapore, Goto in Indonesia, and Zalo in Vietnam are a few examples, each pivoting their user experience around robust financial and banking services.

Even apps like Cred, the credit card bill payment platform, are joining the UPI payments club, competing against Paytm and PhonePe.  

Earlier, Deepinder Goyal, the CEO of Zomato, had said that super apps don’t work in India. He would rather have Zomato and Blinkit grow as separate brands.

Maybe it’s time to rethink. 

The post Accidental Super Apps appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/accidental-super-apps/feed/ 0
Why Canva Acquired Leonardo.Ai https://analyticsindiamag.com/ai-breakthroughs/why-canva-acquired-leonardo-ai/ https://analyticsindiamag.com/ai-breakthroughs/why-canva-acquired-leonardo-ai/#respond Tue, 30 Jul 2024 06:38:05 +0000 https://analyticsindiamag.com/?p=10130683

Leonardo.Ai boasts over 19 million registered users and has facilitated the creation of more than a billion images.

The post Why Canva Acquired Leonardo.Ai appeared first on AIM.

]]>

In a strategic move to bolster its generative AI capabilities, Canva has acquired Leonardo.Ai, a startup renowned for its generative AI content and research. Canva co-founder and chief product officer Cameron Adams has said that all 120 employees of Leonardo.ai, including the executive team, will join Canva. The acquisition is said to be a mix of cash and stock.

Leonardo.Ai, co-founded by Jachin Bhasme, JJ Fiasson, and Chris Gillis in Sydney in 2022, initially focused on creating video game assets. Over time, the company expanded its platform to cater to diverse industries such as fashion, advertising, and architecture by developing AI models for image creation. Some people call it the biggest competitor to Midjourney.

https://twitter.com/canva/status/1818127114140541066

The company had previously raised over $38.8 million from several backers such as Smash Capital, Blackbird, Side Stage Ventures, Gaorong Capital, Samsung Next, and TIRTA Ventures.

Today, Leonardo.Ai boasts over 19 million registered users and has facilitated the creation of more than a billion images. It provides collaboration tools and a private cloud for various models, including video generators. It also offers API access, enabling customers to develop their own technological infrastructure using Leonardo.Ai’s platform.

A Game Changer?

Despite the acquisition, Leonardo.Ai will continue to operate independently, prioritising rapid innovation and research, now backed by Canva’s resources. 

“We’ll keep offering all of Leonardo’s existing tools and solutions. This acquisition aims to help Leonardo develop its platform and deepen its user growth with our investment. This includes expanding their API business and investing in foundational model R&D,” said Adams.

Leonardo.Ai sets itself apart from other generative AI art platforms by offering extensive user control. Its Live Canvas feature lets users enter text prompts and make quick sketches, generating photorealistic images in real time.

However, the methods Leonardo.Ai uses to train its in-house generative models, such as the flagship Phoenix model, remain unclear. This opacity could pose challenges for Canva down the line.

Canva has been a strong supporter of creators in the generative AI space. The company paid $200 million to compensate creators who allowed their content to be used for training AI models. The acquisition of Leonardo.Ai will contribute to Canva’s Magic Studio generative AI suite, enhancing existing tools and introducing new capabilities.

“Magic Studio works on internally-developed AI and ML algorithms that leverage a combination of foundational AI models from our team, including Kaleido, and a variety of partners like OpenAI, Google, AWS, and Runway,” Danny Wu, the head of AI products at Canva, told AIM in a conversation earlier this year.

Canva Hell-Bent on Generative AI

Adams expressed excitement about integrating Leonardo’s technology into Magic Studio. “We’re eager to expand what our users can achieve with AI on Canva,” he said. 

Canva has been ramping up its AI development efforts, highlighted by previous acquisitions such as Kaleido in 2021, which laid the groundwork for many of Canva’s recent AI advancements. The company is also eyeing an IPO soon.

Leonardo.Ai is Canva’s eighth acquisition overall and its second this year, following the $380 million acquisition of UK-based design company Affinity. Canva’s robust portfolio includes presentations startup Zeetings, stock photography sites Pixabay and Pexels, and product mockup app Smartmockups.

“We’ve placed a strong focus on building an AI-powered workflow that includes generative solutions like image and design generation,” said Adams. He added that new generative capabilities will help the company set its AI offerings apart.

Meanwhile, competitors such as Figma and Adobe are also making strides in generative AI. Just last month, Figma introduced Figma AI, a suite of AI-powered features to enhance designers’ creativity and productivity.  

Adobe has also diversified its portfolio into a generative AI-powered enterprise software platform, introducing Firefly to Photoshop and launching features like Generative Fill and Generative Remove for advanced image editing.

The updates follow Adobe’s failed $20 billion acquisition of Figma, abandoned under antitrust scrutiny last year. Now, taking matters into its own hands, Figma is reinventing itself with AI. With AI products in the pipeline and a redesigned interface, it is gearing up to compete with major players like Adobe and Canva.

The post Why Canva Acquired Leonardo.Ai appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/why-canva-acquired-leonardo-ai/feed/ 0
Futuristic Acer AI PCs Coming Soon in Indian Market https://analyticsindiamag.com/intellectual-ai-discussions/futuristic-acer-ai-pcs-coming-soon-in-indian-market/ https://analyticsindiamag.com/intellectual-ai-discussions/futuristic-acer-ai-pcs-coming-soon-in-indian-market/#respond Sun, 28 Jul 2024 08:37:46 +0000 https://analyticsindiamag.com/?p=10130428

So far, Acer has launched the Acer Swift 14 AI PCs, the TravelMate series, and its Predator Helios AI gaming laptops in India. 

The post Futuristic Acer AI PCs Coming Soon in Indian Market appeared first on AIM.

]]>

Around 241.8 million personal computers were sold worldwide in 2023. Despite this, it was the worst year on record for PC sales, with a nearly 15% decline from the previous year, according to Gartner.

In India, too, the PC market declined by 6.6% last year. Weak post-pandemic market sentiment, supply chain constraints, and geopolitical tensions contributed to the decline, but PC makers now hope generative AI can help alter their fate.

The top PC makers in the world have been quick to ship AI-powered PCs in most markets. Acer, which holds a relatively small share of the PC market, witnessed a 12.3% increase in sales in 2023, the highest among the top PC makers.

Acer has already launched a series of AI PCs with built-in AI features in the Indian market.

New AI PCs Coming Soon to Indian Market 

In an interview with AIM, Sudhir Goel, chief business officer at Acer India, said, “At Computex 2024, we have showcased a lot of new products, which we are thrilled to introduce to the Indian market in the coming year.”

So far, Acer has launched the Acer Swift 14 AI PCs, the TravelMate series, and its Predator Helios AI gaming laptops in India.  

“With Swift AI, all functionalities, ranging from image enhancement to voice processing, are performed locally, thanks to the neural processing unit integrated within our laptops. This ensures an elevated level of privacy and security, paramount for individual and corporate users,” Goel said.

The TravelMate business PCs are equipped with sophisticated enterprise-grade AI security, and they will soon be available in India. These laptops feature advanced AI tools, including Acer LiveArt and GIMP with Intel’s Stable Diffusion plugin. 

They also leverage the NPU for AI-accelerated applications that blur the background, automatically frame the user, and maintain eye contact during video conferencing. AI also optimises power consumption during long conference calls.

“In the gaming sector, AI integration will redefine immersive gaming by enabling real-time map generation and dynamic creation of in-game elements based on live data. Additionally, we will introduce advanced AI-driven monitors to elevate the overall user experience,” Goel revealed.

The Predator Helios 16 laptops are already available in the Indian market. However, the true advantage of AI emerges when a model can run locally on the device. Given the gargantuan size of large language models (LLMs), most can only run in the cloud.

“With Acer’s cutting-edge Neural Processing Units (NPU), our laptops can handle LLM tasks directly on the device. As we look to the future, we envision local LLM capabilities becoming a significant differentiator in the market,” Goel revealed.

Acer Laptops Will have New AI Processors 

The Acer Swift AI PCs come with Qualcomm’s Snapdragon X Elite processors. However, Acer plans to offer a range of new processors in its upcoming laptops. 

“While Snapdragon’s latest technology offers remarkable capabilities, we are also exploring options from Intel and AMD. Our strategy is to evaluate and incorporate processors from all these leading providers based on their strengths and innovations. This approach ensures that our laptops can cater to a wide range of needs, from exceptional performance and efficiency to specialised AI features,” Goel said. 

The TravelMate P6 14 laptop features Intel Core Ultra 7 processors with Intel vPro Enterprise, Intel Graphics, and Intel AI Boost.

Will AI PCs Boost the Market?

While Acer’s introduction of AI-capable PCs is impressive, other PC manufacturers have swiftly followed suit. Dell and HP have also released AI-powered PCs in the Indian market this year. Most recently, Microsoft unveiled its Surface AI PCs in India, featuring Snapdragon X Elite processors.

PC makers hope AI can pull the market out of last year’s slump. Research firm Canalys predicts that the PC market will see 8% annual growth in 2024 as more AI PCs hit the market, and that AI PCs will capture 60% of the market by 2027. 

Goel also believes generative AI has the potential to significantly boost laptop sales. “Over the past few years, the PC industry has been striving to make devices more powerful, efficient, thinner, and lighter. However, it lacked a transformative technology that could truly revolutionise the market. With the advent of AI, this missing piece has finally arrived,” he said.

AI-driven workloads in PCs could enhance performance, enable new functionalities, and create a more seamless user experience. PC makers are banking heavily on this happening.

Acer Turns to Server Business and Consumer Electronics, Too

Earlier this year, Acer also launched Acer Pure, its consumer electronics and home appliances brand, in India. When we asked Goel whether this was a result of declining PC sales, he said, “Acer India is the fastest-growing PC brand in India, and we have seen remarkable YoY growth for our PC business with 2X growth in the consumer market and market leadership in some of the commercial segments.”

He stressed that PCs remain Acer’s core business. Interestingly, Acer also launched its server business in India a few years ago, entering a segment dominated by Dell, HP, and Lenovo, the same brands it competes with in PCs.

Called Altos Computing, it caters to the growing demand for high-performance servers and workstations in India’s digital infrastructure landscape. 

“It includes introducing AI-powered solutions to support local cloud and data storage initiatives, which are crucial for governmental and corporate digital transformation priorities,” Goel concluded.

The post Futuristic Acer AI PCs Coming Soon in Indian Market appeared first on AIM.

]]>
https://analyticsindiamag.com/intellectual-ai-discussions/futuristic-acer-ai-pcs-coming-soon-in-indian-market/feed/ 0