
Apple Springs a Surprise, Embraces Open-Source Training Method

Apple recently released the code for its internal software used for training its new LLM on GitHub


When news that Apple was working on its own generative AI tools and chatbot broke two months ago, positive market sentiment pushed Apple’s shares to a record high of $198.23, a gain of 2.3%. However, apart from Apple using Ajax for its LLM and employees internally calling it AppleGPT, no other details on the model were released. 

In a new development, as per a report by The Information, Apple is training Ajax GPT on more than 200 billion parameters, and the model is believed to be more powerful than GPT-3.5. And, remarkably, Apple did something it has never done before: it open-sourced the code on GitHub! 

An Unprecedented Move

In July, Apple discreetly uploaded the code for AXLearn to GitHub, making it publicly available so others can train their own large language models without starting from scratch. AXLearn, the internal machine-learning framework Apple has developed over the past year to train Ajax GPT, serves as a pre-built toolkit for rapidly training machine-learning models. The Ajax name derives from JAX, an open-source framework created by Google researchers, and some components of AXLearn are specifically optimised for Google TPUs. 
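To give a sense of the foundation AXLearn builds on, here is a minimal sketch of the kind of JAX training step that such frameworks wrap. This is a hypothetical illustration of JAX's core transformations, not AXLearn's actual API:

```python
# A minimal, hypothetical sketch of a JAX-style training step, the kind of
# primitive that frameworks like AXLearn build on. Not AXLearn's actual API.
import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Simple linear model: predictions = x @ w + b, mean-squared-error loss.
    w, b = params
    preds = x @ w + b
    return jnp.mean((preds - y) ** 2)

@jax.jit  # XLA-compiles the step; the same mechanism is used to target TPUs.
def train_step(params, x, y, lr=0.1):
    # jax.grad differentiates the loss with respect to params automatically.
    grads = jax.grad(loss_fn)(params, x, y)
    # Apply a gradient-descent update to every array in the params pytree.
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Toy data: learn y = 2x.
x = jnp.arange(4.0).reshape(4, 1)
y = 2.0 * x.squeeze()
params = (jnp.zeros((1,)), jnp.zeros(()))
for _ in range(200):
    params = train_step(params, x, y)
```

Training frameworks such as AXLearn layer configuration, checkpointing, and distributed execution on top of this functional core, which is what makes them attractive starting points for large-scale model training.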

While Apple may be far ahead in bringing innovative solutions to market, there is a rotten side that puts the company’s priorities before anything else. Apple is infamous for fostering a closed-source environment: little of its technology or code has been open to the public. While other big-tech players release capable open-source models such as Meta’s Llama 2, Falcon and Vicuna, Apple has always stuck to its conventional route of secrecy, something OpenAI has also been following. The tech community has criticised Apple’s closed-source approach, labelling it a company that benefits from research released by big tech but never gives anything back. 

Apple’s decision to open-source its training software, AXLearn, is a significant departure from that secrecy. The move could foster collaboration and innovation within the AI research community and reflects a broader trend towards openness in AI development. 

While the exact motive behind Apple’s decision to release the code on GitHub remains undisclosed, the company’s substantial investment, reportedly millions of dollars a day on AI development, reflects its determination to compete vigorously in the AI race.  

Interestingly, last month the company filed for the trademark “AXLearn” in Hong Kong. 

Emulating Google Culture

Apple’s head of AI, John Giannandrea, and Ruoming Pang, lead of its conversational AI team called ‘Foundational Model’, both bring extensive experience from previous roles at Google. Giannandrea brought a vision of making Apple more like Google, where employees had greater freedom to conduct diverse research, publish papers and explore innovative ideas; Apple’s prior restrictions in these areas had hindered talent growth and recruitment. 

Reportedly, Apple has also hired talent from Google’s and Meta’s AI platform teams. At least seven of the 18 contributors to AXLearn on GitHub worked at either Google or Meta in the past two years. Apple has likely tweaked its approach to cultivate talent through the research community, which makes open-sourcing the right way forward. 

Decoding The Clues

Piecing together the available information, it appears Apple has formed two new teams working on language and image models. Apple’s recent AI research hints at software capable of generating images, videos and 3D scenes, implying work on multimodal AI. 

However, uncertainty remains over how the LLM will be integrated into Apple’s products. Apple has always leaned towards shipping new software on its own devices, but running a 200-billion-parameter LLM, which demands far more storage and computing power than an iPhone offers, is not practical. The company might work on smaller models for on-device integration, or the model may be used for something else entirely; the details remain elusive. 



Vandana Nair

As a rare blend of engineering, MBA, and journalism degree, Vandana Nair brings a unique combination of technical know-how, business acumen, and storytelling skills to the table. Her insatiable curiosity for all things startups, businesses, and AI technologies ensures that there's always a fresh and insightful perspective to her reporting.