A few days ago, AIM asked a pertinent question: “When will the ‘Linux moment’ in AI arrive?” It appears to have arrived sooner than expected. Yesterday, Meta released Llama 3.1, whose largest variant, at 405 billion parameters, is the company’s biggest and best model yet.
The state-of-the-art model outperformed OpenAI’s GPT-4o on several benchmarks, making it the best open-source model available.
Llama 3.1 is competitive with leading foundation models like GPT-4, GPT-4o, and Claude 3.5 Sonnet across various benchmarks. The smaller models in the Llama 3.1 series also perform on par with both closed and open models of similar parameter sizes.
Meta chief Mark Zuckerberg is confident that Llama 3.1 will have a similar impact on the AI ecosystem as Linux had on the operating system world.
“Today, Linux is the industry standard foundation for both cloud computing and the operating systems that run most mobile devices – and we all benefit from superior products because of it,” he said, adding, “I believe that AI will develop in a similar way. Today, several tech companies are developing leading closed models. But open source is quickly closing the gap.”
“I think open source AI is going to become the industry standard just like Linux did. It gives you control to customise and run your own models. You don’t have to send your data to another company, and it’s more affordable,” he added.
Echoing similar sentiments, OpenAI co-founder Andrej Karpathy said, “I’d like to say that it is still very early days, that we are back in the ~1980s of computing all over again, that LLMs are a next major computing paradigm, and Meta is clearly positioning itself to be the open ecosystem leader of it.”
He elaborated on the applications of AI models, stating that people will prompt and retrieve information from the models, fine-tune them, distil them into smaller expert models for specific tasks and applications, and study, benchmark, and optimise these models.
Recently, Karpathy also discussed how AI kernels could potentially replace current operating systems.
Karpathy envisions an operating system where the LLM acts as the kernel, managing and coordinating system resources and user interactions, and AI agents function as applications, providing various services and functionalities.
In this system, natural language serves as the primary programming and interaction interface, allowing users to communicate with the system in plain English or other languages.
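To make the shape of that idea concrete, here is a deliberately toy sketch, entirely hypothetical: a keyword matcher stands in for the LLM “kernel”, and two stub functions stand in for agent “applications”. A real system would route requests with a model rather than string matching.

```python
# Toy illustration of the "LLM as kernel" framing. Hypothetical throughout:
# a keyword matcher stands in for the LLM; real routing would use a model.

def weather_agent(request: str) -> str:
    return "Forecast: sunny, 31C"  # placeholder "application"

def calendar_agent(request: str) -> str:
    return "Next event: standup at 10:00"  # placeholder "application"

# Agent "applications" registered with the kernel, keyed by capability.
AGENTS = {"weather": weather_agent, "calendar": calendar_agent}

def llm_kernel(request: str) -> str:
    """Stand-in for the LLM kernel: pick the agent whose capability the
    natural-language request mentions, then dispatch the request to it."""
    for capability, agent in AGENTS.items():
        if capability in request.lower():
            return agent(request)
    return "No agent available for this request."

print(llm_kernel("What's the weather like today?"))  # -> Forecast: sunny, 31C
print(llm_kernel("What's on my calendar?"))          # -> Next event: standup at 10:00
```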
AI Agents, Everywhere
Zuckerberg anticipates that Meta AI will overtake ChatGPT, which currently boasts over 100 million users, to become the most-used chatbot by the end of this year. Interestingly, Meta has not yet disclosed any usage statistics for its own assistant.
Meta anticipates that billions of AI agents will be developed worldwide using open-source models. “Our vision is that there should be a lot of different AIs and AI services out there, not just one singular AI, and that really informs the open-source approach,” said Zuckerberg.
“A lot of what we’re focused on is giving every creator and every small business the ability to create AI agents for themselves, making it so that every person on our platforms can create their own AI agents to interact with,” he added.
Vinod Khosla has been dreaming about it too. He envisions a future in which internet access will mostly be through agents acting for consumers, carrying out tasks, and fending off marketers and bots. “Tens of billions of agents on the internet will be normal,” he wrote.
“Eventually, all our interactions with the digital world will be mediated by AI assistants. This means that AI assistants will constitute a repository of all human knowledge and culture; they will constitute a shared infrastructure like the internet is today,” said Meta’s chief AI scientist Yann LeCun.
He urged that these platforms be open source, saying the world cannot have a small number of AI assistants controlling the entire digital diet of every citizen, a thinly veiled dig at OpenAI and a few other companies he did not name.
“This will be extremely dangerous for diversity of thought, for democracy, and for just about everything,” he added.
Notably, Meta also updated its licence for the first time to allow developers to use the outputs from Llama models, including the 405B, to improve other models.
“We’re excited about how this will enable new advancements in the field through synthetic data generation and model distillation workflows, capabilities that have never been achieved at this scale in open source,” said the company.
Last week, a faulty sensor configuration update by CrowdStrike caused a significant Microsoft Windows outage, impacting global transport, finance, and medical sectors. Looking ahead, if AI agents are all managed by a single entity on a single cloud infrastructure, a similar failure could recur.
“I think the main thing that people are going to do [with Llama 3.1 405B], especially because it’s open source, is use it as a teacher to train smaller models that they use in different applications,” said Zuckerberg.
“If you just think about all the startups out there, or all the enterprises or even governments that are trying to do different things, they probably all need to, at some level, build custom models for what they’re doing,” he added.
The idea was suggested earlier by Karpathy, who explained that in the future, as larger models help refine and optimise the training process, smaller models will emerge. “The models have to first get larger before they can get smaller because we need their (automated) help to refactor and mould the training data into ideal, synthetic formats.”
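In practice, the teacher-to-student workflow often starts with synthetic data generation: prompt the large model for examples, save them, and fine-tune a smaller model on the result. Below is a minimal sketch assuming an OpenAI-compatible endpoint serving the 405B model (several hosted providers expose one); the base URL, API key, and model id are placeholders.

```python
import json
from openai import OpenAI  # any OpenAI-compatible client works here

# Hypothetical endpoint: point this at whichever provider (or local server)
# hosts the large "teacher" model. Both values below are placeholders.
client = OpenAI(base_url="https://your-provider.example/v1", api_key="YOUR_KEY")

tasks = [
    "Summarise a customer support ticket",
    "Explain what a SQL JOIN does",
    "Draft a polite meeting-decline email",
]

# Ask the teacher to produce example outputs, then store them as JSONL
# that a smaller "student" model can later be fine-tuned on.
with open("synthetic_train.jsonl", "w") as f:
    for task in tasks:
        resp = client.chat.completions.create(
            model="llama-3.1-405b-instruct",  # placeholder id; varies by provider
            messages=[{
                "role": "user",
                "content": f"Write an example user request and an ideal answer for this task: {task}",
            }],
        )
        pair = {"task": task, "teacher_output": resp.choices[0].message.content}
        f.write(json.dumps(pair) + "\n")
```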
Last week, we saw the release of several small models that can run locally without relying on the cloud. Small language models, or SLMs, are expected to handle many workloads alongside generalised models like GPT-4 or Claude 3.5 Sonnet.
There is no doubt that OpenAI felt the pressure: it has now made fine-tuning available for its latest model, GPT-4o mini.
“Customise GPT-4o mini for your application with fine-tuning. Available today to tier 4 and 5 users, we plan to gradually expand access to all tiers. The first 2M training tokens a day are free, through Sept 23,” the company posted on X.
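For reference, OpenAI’s fine-tuning flow is a two-step API call: upload a JSONL file of chat-formatted examples, then start a job against it. The snapshot id below is the one OpenAI listed for GPT-4o mini fine-tuning at launch; confirm it against the current docs.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Step 1: upload chat-formatted training examples (JSONL, one example per line).
training_file = client.files.create(
    file=open("train_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# Step 2: start a fine-tuning job against the uploaded file.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # snapshot id at launch; check current docs
)
print(job.id, job.status)
```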
Meanwhile, Zuckerberg has claimed that developers can run inference on Llama 3.1 405B on their own infrastructure at roughly 50% of the cost of using closed models like GPT-4o, for both user-facing and offline inference tasks.
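Self-hosting the 405B requires a multi-GPU node, but the same open-weights workflow is easy to try with a smaller sibling. A minimal sketch using Hugging Face transformers follows; the repo is gated, so you must accept Meta’s licence on the Hub and authenticate (`huggingface-cli login`) before the download works.

```python
import torch
from transformers import pipeline

# The 8B variant demonstrates the workflow on a single GPU; swap in the
# 405B repo only if you have the cluster to serve it.
pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "In one sentence, what is model distillation?"}]
out = pipe(messages, max_new_tokens=64)
# The pipeline returns the whole conversation; the last turn is the reply.
print(out[0]["generated_text"][-1]["content"])
```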
Jealous Much?
One thing is clear: OpenAI can’t stay silent when its competitors are on the verge of releasing a new model. Throughout the day, the company has been emphasising safety and preparedness.
“We won’t release a new model if it crosses a ‘medium’ risk threshold until we implement sufficient safety interventions,” the company said. OpenAI appears to be subtly suggesting that open-source models might present risks to society.
In a recent interview at Dartmouth College’s AI Everywhere event, OpenAI chief technology officer Mira Murati said that the company gives the government early access to new AI models and has been in favour of more regulation.
“We’ve been advocating for more regulation on frontier models, which will have these amazing capabilities and also have a downside because of misuse. We’ve been very open with policymakers and working with regulators on that,” she said.
On the other hand, many are concerned about potential regulations stifling innovation. “I hope foolish regulations like California’s proposed SB1047 don’t stop such innovations,” said Deeplearning.ai founder Andrew Ng of the Llama 3.1 release.
Meanwhile, Meta has decided not to release its multimodal Llama AI model in the European Union due to regulatory concerns.
The company cited the lack of clarity and predictability in the EU’s regulatory framework for AI, which includes existing regulations like the General Data Protection Regulation and forthcoming ones such as the AI Act.