
Galactica: Large Language Model for Scientific Knowledge


Meta AI and Papers with Code recently released Galactica, a 120-billion-parameter scientific-language model which can search and summarize academic literature, solve math problems, and write scientific code.

Galactica's architecture is based on the transformer, which uses an attention mechanism to draw global dependencies between input and output. The model is a decoder-only setup with some changes compared with the original transformer: GeLU as the activation function, learned position embeddings, a vocabulary built with byte pair encoding, and no bias parameters in the dense kernels or layer norms.
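
As a rough illustration of one of those changes, the GeLU activation can be written in a few lines of plain Python. This is a minimal sketch of the exact (erf-based) form of GeLU, not code from the Galactica codebase; the function name `gelu` is chosen here purely for illustration.

```python
import math


def gelu(x: float) -> float:
    """Exact GeLU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    # Unlike ReLU, GeLU is smooth and lets small negative values through.
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"gelu({value:+.1f}) = {gelu(value):+.4f}")
```

For large positive inputs the function approaches the identity, and for large negative inputs it approaches zero, which is why it is often described as a smooth alternative to ReLU.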

The researchers trained the model on several modalities (natural language, math formulas, molecular sequences, etc.) using a specialized tokenization scheme, which includes identifying math operation characters and marking the start and end of different types of sequences. The source material for the dataset included 48 million papers, textbooks, reference materials, compounds, proteins and other sources of scientific knowledge. They also introduced a special token to identify sections of step-by-step reasoning, which encourages Galactica to apply an internal working memory of sorts that it would otherwise not be able to use.
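
The idea of marking sequence boundaries and reasoning sections can be sketched as a small preprocessing step. The snippet below is only an illustration: the helper names are hypothetical, and the marker spellings (other than `<work>`, which appears in the example prompt later in this article) are assumptions modelled on the paper's description rather than a verified copy of Galactica's vocabulary.

```python
# Assumed marker spellings for illustration; check the Galactica paper
# before relying on them.
MARKERS = {
    "smiles": ("[START_SMILES]", "[END_SMILES]"),
    "amino": ("[START_AMINO]", "[END_AMINO]"),
    "dna": ("[START_DNA]", "[END_DNA]"),
}

WORK_TOKEN = "<work>"


def wrap_sequence(kind: str, sequence: str) -> str:
    """Surround a non-text sequence with start/end markers for its type."""
    start, end = MARKERS[kind]
    return f"{start}{sequence}{end}"


def with_work_token(prompt: str) -> str:
    """Append the token that asks the model for step-by-step reasoning."""
    return f"{prompt} {WORK_TOKEN}"


if __name__ == "__main__":
    print(wrap_sequence("smiles", "CCO"))
    print(with_work_token("What is the molar mass of ethanol?"))
```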

Multiple large language models (LLMs) with billions of parameters have been released in the last year, though they are not specialized in the science domain. Some of the models benchmarked in the Galactica paper are OPT, BLOOM, GPT-3, Chinchilla and PaLM. Galactica performs well on reasoning, outperforming Chinchilla on the mathematical MMLU dataset by 41.3% to 35.7% and PaLM-540B on MATH with a score of 20.4% versus 8.8%. Despite not being trained on a general corpus, Galactica outperforms BLOOM and OPT-175B on the BIG-bench benchmark. The Gradient offers a detailed look at evaluating natural-language-processing models.

Galactica is available as a Python package or through a web interface for providing prompts. The Python package can be installed as follows:

pip install galai

A small script for using the standard model (6.7 billion parameters) is:

import galai as gal

model = gal.load_model("standard")
model.generate("Scaled dot product attention:\n\n\\[")
# Scaled dot product attention:\n\n\\[ \\displaystyle\\text{Attention}(Q,K,V)=\\text{softmax}(\\frac{QK^{T}}{\\sqrt{d_{k}}}%\n)V \\]
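
The formula the model completes above, softmax(QK^T / sqrt(d_k)) V, can be checked by hand with a small pure-Python version. This is a minimal sketch for illustration, using nested lists instead of a tensor library; the function names are chosen here and are not part of `galai`.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for list-of-lists matrices."""
    scale = math.sqrt(len(K[0]))  # d_k is the key dimension
    # scores[i][j] = (Q[i] . K[j]) / sqrt(d_k)
    scores = [
        [sum(q * k for q, k in zip(q_row, k_row)) / scale for k_row in K]
        for q_row in Q
    ]
    weights = [softmax(row) for row in scores]
    # output[i] = sum over j of weights[i][j] * V[j]
    return [
        [sum(w * v_row[c] for w, v_row in zip(w_row, V)) for c in range(len(V[0]))]
        for w_row in weights
    ]


if __name__ == "__main__":
    Q = [[0.0, 0.0]]            # a zero query attends to all keys equally
    K = [[1.0, 0.0], [0.0, 1.0]]
    V = [[1.0, 2.0], [3.0, 4.0]]
    print(scaled_dot_product_attention(Q, K, V))  # the average of V's rows
```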

If you want to test the model's reasoning performance, you can use the following script:

model.generate("A force of 0.6N is applied to an object, which accelerates at 3m/s. What is its mass? <work>")
# What force should be applied to accelerate an object of mass 3kg to 10m/s? <work>\nWe can use Newton's second law: F = ma. We can substitute variables to get:\n\n\\[ F = \\left(66kg

Source: Galactica
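
When judging the model's answer to a prompt like the one above, it helps to know the expected result. A minimal sketch using Newton's second law, F = ma, follows; it assumes the prompt's "3m/s" is meant as an acceleration of 3 m/s², and the function name is illustrative.

```python
def mass_from_force(force_newtons: float, acceleration_ms2: float) -> float:
    """Rearrange F = m * a into m = F / a."""
    if acceleration_ms2 == 0:
        raise ValueError("acceleration must be non-zero")
    return force_newtons / acceleration_ms2


if __name__ == "__main__":
    mass = mass_from_force(0.6, 3.0)
    print(f"mass = {mass:.2f} kg")  # prints "mass = 0.20 kg"
```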

Like many other LLMs, Galactica has a few limitations. It can generate plausible-sounding but incorrect content, a behavior known as "hallucination", and it can trend toward using toxic language, although the team responsible mentioned that it is less toxic than other LLMs like OPT. Other limitations are frequency bias and overconfidence, especially about highly specialized scientific content. Frequency bias means the model tends to recommend highly cited scientific papers.

On social media, there has been quite a buzz around the topic:

For further insight into explorations of the Galactica demo, see, for instance, this tweet as well as this one by Patrick Mineault, a former AI researcher at Google. There is also a good Twitter thread by Michael Black discussing Galactica's performance. Finally, there is a YouTube review by Yannic Kilcher and a discussion about Galactica's language toxicity.


Community comments

  • Galactica has been heavily criticized and the demo had to be retired after a few days

    by Michele Mauro,

    Several authors who work in the AI ethics space (Abeba Birhane, Willie Agnew, Gary Marcus) have heavily criticized the Galactica model because the demo has demonstrated how easily it can "hallucinate" and produce very scientific-sounding text with extremely dangerous content, while certain "politically sensitive" queries return very suspicious results.
    These tests have highlighted at least a lack of testing on the model, and raised suspicions of bias in some of the input materials. The article should have devoted a few more lines to this criticism.

  • Re: Galactica has been heavily criticized and the demo had to be retired af

    by Daniel Bryant,

    Hi Michele,

    Many thanks for your comment. I definitely appreciate the issues/dangers that you mentioned, and although we did write a paragraph about the "hallucinations" and potential for bias, can I ask what else you would have liked to have seen covered, please? And point taken about the lack of testing.

    We generally aim to keep our news coverage to ~500 words, and so we're constantly refining what kind of info to include that is useful to our readers :)

    By the way, we're always looking for new part-time news writers to join our team -- would you be interested in this?

    Best wishes,

    InfoQ News Manager
