
Meta Open-Sources 175B Parameter Chatbot BlenderBot 3

Meta AI Research open-sourced BlenderBot 3, a 175B parameter chatbot that can learn from live interactions with users "in the wild." In evaluations by human judges, BlenderBot 3 achieves a 31% rating increase compared to the previous BlenderBot version.

The chatbot was described in a post on the Meta AI blog. BlenderBot 3 is based on the OPT-175B pre-trained language model. The bot can also access data from Internet searches and conversational long-term memory, reducing its propensity to "hallucinate" and improving the coherence of conversations. To help gather more training data, Meta has set up an interactive demo site, where users in the United States can chat with the bot; to prevent malicious users from engaging in toxic conversation, Meta has implemented several technologies for detecting offensive language and trolling. According to the Meta team:
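To make the grounding idea concrete, here is a minimal sketch (not Meta's implementation) of how a chatbot can condition its replies on fresh search results and a long-term memory of the conversation; `search_web` and `generate_reply` are hypothetical stand-ins for the real retrieval and language-model components:

```python
from typing import List

def search_web(query: str) -> List[str]:
    # Stand-in for an internet search call; returns snippet strings.
    return [f"snippet about {query}"]

class GroundedBot:
    def __init__(self):
        self.long_term_memory: List[str] = []  # facts persisted across turns

    def generate_reply(self, context: List[str]) -> str:
        # Stand-in for the language model; here we just echo the evidence,
        # but a real model would generate text conditioned on the context.
        return "Based on: " + "; ".join(context[-2:])

    def respond(self, user_message: str) -> str:
        evidence = search_web(user_message)   # ground the reply in search results
        context = self.long_term_memory + evidence + [user_message]
        reply = self.generate_reply(context)
        # Persist the turn so later responses stay consistent with it.
        self.long_term_memory.append(f"user said: {user_message}")
        return reply
```

Conditioning generation on retrieved evidence rather than on model weights alone is what reduces hallucination: the model can copy facts from the snippets instead of inventing them.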

We are committed to sharing organic conversational data collected from the interactive demo system as well as model snapshots in the future. We hope this work will help the wider AI community spur progress in building ever-improving intelligent AI systems that can interact with people in safe and helpful ways.

Large pre-trained language models such as GPT-3 have been shown to make good foundations for chatbots, especially when fine-tuned with dialog-oriented datasets. In 2021, Meta released BlenderBot 2.0, which augmented the language model with the ability to incorporate Internet search results and long-term conversation memory; these additions greatly improved the bot's factual consistency.

One shortcoming of the current approach is that the fine-tuning datasets are constructed by researchers collaborating with crowd-sourced workers, which naturally limits the size and scope of the data. Meta's goal is to collect "organic interactions" with users through a publicly available chat interface. Although this approach has risks, the Meta researchers have developed several technologies to reduce the influence of toxic users and improve the chatbot's ability to learn from these interactions.
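One simple way to limit a toxic user's influence, sketched below, is to aggregate an utterance-level toxicity score into a per-user training weight. This is an illustrative toy, not Meta's actual method; the keyword-based `toxicity_score` stands in for a learned classifier:

```python
from statistics import mean

def toxicity_score(text: str) -> float:
    # Stand-in for a learned toxicity classifier; flags a toy keyword list.
    bad_words = {"spam", "junk"}
    words = text.lower().split()
    return sum(w in bad_words for w in words) / max(len(words), 1)

def training_weight(user_messages: list, threshold: float = 0.3) -> float:
    # Users whose average toxicity exceeds the threshold contribute nothing
    # to training; everyone else is weighted by how clean their messages are.
    avg = mean(toxicity_score(m) for m in user_messages)
    return 0.0 if avg > threshold else 1.0 - avg
```

Aggregating at the user level, rather than filtering single messages, is what makes this a troll-mitigation strategy: a user who is consistently adversarial is excluded even if some individual messages look benign.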

BlenderBot 3 Safety Protocols. Image source:

First, the chatbot interface includes a "dislike" button that lets users flag a bad response. This data is later incorporated into the training of future bot versions using a new technique called Director. In this scheme, the language generation model includes a language classifier that steers generation away from undesirable word sequences. Meta also studied several techniques for detecting input from trolls, or adversarial users, and mitigating their influence on the training data. Along with this research, Meta developed the SafetyMix benchmark "to test troll identification methods."
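The classifier-guided decoding idea can be sketched in a few lines. This is a toy in the spirit of Director, not the actual implementation: at each step, the language model's next-token scores are combined with a classifier's probability that the continuation is acceptable, so a token the model slightly prefers can lose out if the classifier judges it undesirable:

```python
import math

def guided_next_token(lm_logprobs: dict, clf_ok_prob: dict, gamma: float = 1.0):
    # Score each candidate token by its LM log-probability plus a weighted
    # log-probability that the classifier judges the continuation acceptable.
    scores = {
        tok: lp + gamma * math.log(clf_ok_prob.get(tok, 1e-9))
        for tok, lp in lm_logprobs.items()
    }
    return max(scores, key=scores.get)
```

For example, if the language model assigns probability 0.6 to an offensive token and 0.4 to a benign one, but the classifier rates them 0.05 and 0.95 acceptable respectively, the combined score picks the benign token.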

In a Twitter thread discussing the work, research scientist Jason Weston explained why the interactive demo is only available in the United States:

What's unique in this project is that we will be releasing participating conversations & feedback with the (also released) model for the community to continue/improve this research. As you can imagine it's quite hard to get approvals for this. Still working on outside US.

The Meta researchers created three versions of BlenderBot 3, each with a different number of parameters in its language model. The 3B and 30B parameter versions, along with source code and training datasets, are available on the ParlAI chatbot framework site. Access to the full 175B parameter model is currently restricted and only granted to certain researchers on request.
