How Llama 2 Became the Heartbeat of IBM’s Strategy

How many more companies will ditch commercial alternatives and embrace Llama 2?

IBM recently announced that it would host Meta’s Llama 2-chat 70 billion parameter model in the watsonx.ai studio, with early access available to select clients and partners. 

Enterprises are now embracing generative AI to bolster their business strategies. To harness its potential effectively, they need streamlined ways to train and build their own LLMs using years of accumulated data. To address this challenge, various cloud providers, including AWS and Azure, have stepped up to offer assistance.

Microsoft’s partnership with OpenAI gives Azure customers access to GPT-4, while AWS’s multi-LLM approach lets customers choose from a buffet of models from AI21, Cohere, Anthropic (Claude 2), and Stability AI (SDXL 1.0). Beyond the well-known clouds, several other service providers have popped up recently.

Enterprises want a reliable solution from service providers they can trust. Recently, AI enthusiasts have devised methods to train and build Llama 2 models, yet a critical concern remains: can these approaches be relied upon to handle enterprise data in a trustworthy way?

A few days back, AI expert Santiago tweeted, “You can now test Llama 2 in less than 10 minutes,” introducing Monster API, a new tool that provides access to powerful generative AI models such as Falcon, Llama, Stable Diffusion, and GPT-J, without users having to manage the models or scale them up to handle large volumes of requests.

However, new initiatives like this are too risky for established companies to rely on, as they have not yet proved that they can operate at scale.

IBM has Customers’ Trust 

IBM is prioritizing trust and security as it introduces its generative AI features. For example, when users run the Llama 2 model in the watsonx.ai prompt lab, they can activate the AI guardrails function, which automatically filters harmful language from both the input prompt text and the model’s generated output.

In an exclusive conversation with AIM, Geeta Gurnani, IBM Technology CTO and technical sales leader, IBM India & South Asia, said IBM is introducing an AI governance toolkit, which is expected to be generally available later this year. The toolkit will help operationalise governance to mitigate the risk, time and cost associated with manual processes, and will provide the documentation necessary to drive transparent and explainable outcomes.

“It will also have mechanisms to protect customer privacy, proactively detect model bias and drift, and help organisations meet their ethics standards,” she said.

Why Llama 2 and not GPT-4 

Llama 2 has gained popularity among enterprises. This is evident from its availability on Amazon SageMaker, Databricks, watsonx.ai, and even Microsoft’s Azure, which is the backbone of the proprietary LLM GPT-4.

Furthermore, the partnership between Meta and several prominent companies like Amazon, Hugging Face, NVIDIA, Qualcomm, Zoom, and Dropbox, as well as academic leaders, underscores the significance of open-source software.

Even OpenAI’s Andrej Karpathy, a prominent figure in deep learning, could not resist using Llama 2. That led him to create Baby Llama, aka llama2.c, part of his recent experiments with running large language models (LLMs) on a single computer. He even hinted that OpenAI might release open-source models in the near future.

In a similar vein, AI expert Santiago expressed that Llama 2 possesses all the elements for potential success: being open-source, having a commercial license, allowing cost-effective GPU usage, and enabling comprehensive control over the entire utilization process.

“I’ve talked to two startups migrating from proprietary models into Llama 2. How many more companies will ditch commercial alternatives and embrace Llama 2?” he asked.

Among cloud platforms, GPT-4 is available only through the Microsoft Azure OpenAI Service, although enterprises can also buy access to the GPT-4 API directly from OpenAI. The limitation of GPT-4 is its closed-source nature, which prevents users from building their own models on it or experimenting with its code. Unlike Llama 2, which is free for commercial use, the GPT-4 API comes with a price tag. Charges are calculated per 1,000 tokens: $0.03 for input and $0.06 for output.

According to AIM Research, the monthly inference cost of the GPT-4 API (with a 16K context length) for a slightly complex use case can run anywhere between $250,000 and $300,000. Therefore, when using the ChatGPT API, it is essential to keep track of token usage and manage it effectively to control costs, just as you would with a website integration.
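
The per-token arithmetic above can be sketched as a small cost estimator. This is a minimal illustration, not an official pricing tool: the rates are the ones quoted in this article and may have changed, and the function names and the sample workload (10,000 requests a day, each with 1,500 input and 500 output tokens) are hypothetical. It is not the workload behind the AIM Research estimate.

```python
from decimal import Decimal

# Rates quoted in this article, in USD per 1,000 tokens.
# Assumption: check the provider's current price list before relying on them.
INPUT_RATE_PER_1K = Decimal("0.03")
OUTPUT_RATE_PER_1K = Decimal("0.06")


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD of one API call, billed separately for input and output tokens."""
    billed = (
        Decimal(input_tokens) * INPUT_RATE_PER_1K
        + Decimal(output_tokens) * OUTPUT_RATE_PER_1K
    )
    return billed / 1000


def monthly_cost(
    requests_per_day: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    days: int = 30,
) -> Decimal:
    """Rough monthly bill, assuming every request has the same average size."""
    return request_cost(avg_input_tokens, avg_output_tokens) * requests_per_day * days


if __name__ == "__main__":
    # Hypothetical workload, chosen only to show the arithmetic.
    per_request = request_cost(1_500, 500)
    per_month = monthly_cost(10_000, 1_500, 500)
    print(f"per request: ${per_request}")
    print(f"per month:   ${per_month:,.2f}")
```

With those hypothetical numbers, a single request costs $0.075 and the month comes to $22,500. Because output tokens are billed at twice the input rate, capping response length is usually the first lever to pull when tracking usage.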

Initially, we observed companies leaning towards Azure this quarter for GPT-4, which was exclusively available there, and this in turn boosted Azure’s cloud revenue. However, things took an intriguing turn when Microsoft partnered with Meta to host Llama 2. This underscores the fact that open-source LLMs possess a unique advantage that should not be overlooked.

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.