Stability AI Releases 1.6 Billion Parameter Language Model Stable LM 2

Stability AI released two sets of pre-trained model weights for Stable LM 2, a 1.6B-parameter language model. Stable LM 2 was trained on 2 trillion tokens of text data in seven languages and can be run on common laptop computers.

Stable LM 2 is available in two versions: the base model and an instruction-tuned version called Stable LM 2 Zephyr. The base model was trained on data in Dutch, French, German, Italian, Portuguese, Spanish, and English. Stability AI used "recent algorithmic advancements in language modeling" to give the small model capabilities rivaling larger models. Stable LM 2 is licensed for non-commercial use under Stability AI's Non-Commercial Research Community License, or for commercial use with a Stability AI Membership. According to Stability AI:

"By releasing one of the most powerful small language models to date and providing complete transparency on its training details, we aim to empower developers and model creators to experiment and iterate quickly. It is important to note that, due to the nature of small, low-capacity language models, Stable LM 2 1.6B may similarly exhibit common issues such as high hallucination rates or potential toxic language. We ask the community to keep this in mind when building their applications and take appropriate measures to ensure they are developing responsibly."

Research from OpenAI showed that language model capability scales with the number of model parameters, which led to the development of large language models (LLMs) with trillions of parameters. However, the challenges involved in training and hosting models of that size have led to a trend toward "small language models," such as Meta's Llama 2 and Microsoft's Phi-2. These models are typically small enough to run on a single machine, such as a laptop, and are often free to use for non-commercial purposes.
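To make the size difference concrete, the following is a rough back-of-the-envelope sketch, not a figure from Stability AI. It estimates only the memory needed to hold a model's weights, assuming the usual storage sizes of 4, 2, 1, and 0.5 bytes per parameter for float32, float16, int8, and int4. It ignores activations, the attention cache, and framework overhead, so real usage will be higher. The function name `weight_memory_gb` and the 1-trillion-parameter comparison model are illustrative assumptions.

```python
# Rough estimate of the memory needed to store model weights only.
# Assumption: storage cost per parameter for common numeric formats.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # 1.6e9 comes from the article; 1e12 is a hypothetical "trillion-parameter" LLM.
    models = [("Stable LM 2 1.6B", 1.6e9), ("hypothetical 1T model", 1e12)]
    for name, params in models:
        for dtype in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, dtype)
            print(f"{name:>22} {dtype:>8}: {gb:8.1f} GB")
```

Under these assumptions, the 1.6B-parameter weights take about 3.2 GB in float16, which fits in the memory of an ordinary laptop, while a trillion-parameter model would need roughly 2,000 GB for its weights alone.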

InfoQ covered the 2023 release of the first version of Stable LM, which came in two sizes: 3B and 7B parameters. The new model is smaller than either, but was trained on more data. The fine-tuned version, Stable LM 2 Zephyr, outperforms the original 3B parameter model on multilingual benchmarks. It also outperforms larger models, such as Falcon 40B, on the MT-Bench benchmark.

Stability AI's language model team lead Carlos Riquelme posted about the release on X, noting:

"Language models evaluation is so tricky and fragile -- even more so in the multilingual setup. Definitely need some progress on this front. Quantifying and taming hallucinations is especially challenging in small models but should unlock their widespread use. Any ideas?"

Stability AI CEO Emad Mostaque also posted about the model on X, highlighting the advantages of its size:

"Try it with [retrieval augmented generation], in your browser, on your phone, on a potato etc. Easy to fine tune on your MacBook. Moderate reasoning & knowledge but sometimes that’s all you need...Particularly when you can specialise & stack them."

Stability AI also recently released Stable Code 3B, a code-generation language model that is 60% smaller than Code Llama 7B but offers comparable performance. Stability AI claims the model is small enough to run "in real-time on modern laptops, even those without a dedicated GPU." Like Stable LM 2, this model is available for commercial use through the Stability AI Membership.

The Stable LM 2 base and Stable LM 2 Zephyr models can be downloaded from Hugging Face.

About the Author

Anthony Alford
