Two years ago, OpenAI released GPT-3, a model with 175 billion parameters. Since then, large language models (LLMs) have been all the rage.
On Wednesday, Meta AI and Papers with Code announced the release of Galactica, an open-source large language model with 120 billion parameters trained on scientific knowledge. The generative AI tool is intended to aid academic researchers by producing literature reviews, generating wiki articles on any topic, creating lecture notes on scientific topics, answering questions, solving complex mathematical problems, annotating molecules and proteins, and more.
Galactica is trained on a large corpus of scientific papers, research materials, knowledge bases and numerous other sources, spanning scientific text as well as modalities such as proteins and chemical compounds.
Output can be generated from Galactica’s vast knowledge base simply by entering a prompt at galactica.org.
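For researchers who would rather run the model locally than use the demo site, the released checkpoints can also be loaded with standard tooling. The snippet below is a minimal sketch, assuming the weights are published on the Hugging Face Hub under the facebook/galactica-* names (the 1.3B variant is chosen here only to keep the example lightweight); since the model follows an OPT-style decoder architecture, the usual transformers causal-LM classes apply.

```python
# Minimal sketch: prompting a Galactica checkpoint via Hugging Face transformers.
# Assumes the weights are available as "facebook/galactica-1.3b" (an assumed
# checkpoint name for the smaller variant; swap in a larger one if you have the hardware).
from transformers import AutoTokenizer, OPTForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
model = OPTForCausalLM.from_pretrained("facebook/galactica-1.3b")

# Galactica is prompted with plain scientific text; here we ask it to continue
# a sentence, much like typing a prompt into the demo at galactica.org.
prompt = "The Transformer architecture"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Greedy decoding of a short continuation.
outputs = model.generate(input_ids, max_new_tokens=60)
print(tokenizer.decode(outputs[0]))
```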
The new model is designed to tackle the information overload that comes with accessing scientific information through search engines, which do not organise scientific knowledge in any structured way. Galactica, by contrast, is built with the mission of organising science: a model that can store, combine and reason about scientific knowledge.
The published research shows that Galactica outperforms other models on several metrics:
(i) On technical knowledge probes such as LaTeX equations, it beats the latest GPT-3, scoring 68.2% versus 49.0%.
(ii) On reasoning, it surpasses Chinchilla on mathematical MMLU, scoring 41.3% to Chinchilla’s 35.7%, and PaLM 540B on MATH, with 20.4% versus 8.8%.
(iii) It also beats BLOOM and OPT-175B on BIG-bench, despite not being trained on a general corpus.
The paper can be accessed here.
However, the AI community has been quick to flag issues with the model. David Chapman took to Twitter to show, using examples from the Hacker News discussion forum, how poor some of the generated output was.
Issues with the model aside, the scientific community has also lauded Meta’s efforts in collating and indexing scientific works, databases, and code bases.
Large language model breakthroughs
Besides GPT-3 and Galactica, LLMs such as YaLM have 100 billion parameters, while BLOOM and PaLM have 176 billion and 540 billion parameters respectively. We have also seen the rise of protein language models tackling the decades-old protein folding problem and, most recently, the GenSLM model, which can predict Covid variants.
Moreover, we are in the midst of an age of ‘text-to-anything’ tools, built on massive language models and developed by companies such as OpenAI, Microsoft and Google. To that long list, we can now add ‘text-to-science-research’ as the newest AI tool disrupting existing processes of scientific research and publication.