Facebook Open-Sources BlenderBot 2.0 Chatbot

Facebook AI Research (FAIR) has open-sourced BlenderBot 2.0, an AI chatbot that has long-term memory and can use internet searches for supplemental conversational context. The new model outperforms version 1.0, the previous state-of-the-art chatbot, achieving a 55% improvement in its use of previous conversations, according to human evaluators.

Researchers Jason Weston and Kurt Shuster described the system in a recent blog post. To combat the problem of "goldfish memory" exhibited by many natural language processing (NLP) AI models, BlenderBot incorporates a long-term memory that can track conversation context over several weeks or even months. The bot can also perform internet searches to discover new information to add to the conversation. In addition to the code and models, FAIR is releasing two of the datasets used for training the bot. According to Weston and Shuster,

"We think that these improvements in chatbots can advance the state of the art in applications such as virtual assistants and digital friends. We hope this release and the corresponding data sets will help the community collectively make further progress in these and many other directions."

Although large NLP models such as GPT-3 perform well on many NLP tasks, including answering questions and generating realistic stories, they lack a memory for conversational context; thus, any facts or data given by a user interacting with the model might be forgotten before the conversation is finished. Furthermore, although the models often "know" many general knowledge facts, this information is never updated after the model is trained. For example, a user asking GPT-3 "Who is Tom Brady?" would be told that he is quarterback of the New England Patriots, even though Brady has changed teams since GPT-3 was trained. Also, the models often "hallucinate" by stating as fact things that are verifiably false.

Like many NLP text generation models, BlenderBot 2.0 is based on an encoder/decoder or seq2seq neural architecture; however, instead of a single encoder that maps the text input into a context space, BlenderBot incorporates multiple encoders, which are used to map the text input as well as data from dialog history, internet searches, and the bot's memory. The combined context from the multiple encoders is decoded to produce text output. A separate decoder module generates data to be stored in memory. There is also a module that generates search queries to retrieve information from the internet and from memory.
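The data flow described above can be illustrated with a toy Python sketch. This is not FAIR's implementation or the ParlAI API: the ToyBot class and every method name are hypothetical, and each learned neural module (query generator, encoders, memory writer) is replaced by a trivial stand-in so that only the flow of context from history, search, and memory is visible.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ToyBot:
    """Hypothetical stand-in for the pipeline; no real model is involved."""

    # Assumed search backend: takes a query string, returns result snippets.
    search: Callable[[str], List[str]]
    memory: List[str] = field(default_factory=list)

    def make_query(self, history: List[str]) -> str:
        # Stand-in for the learned query generator: reuse the latest turn.
        return history[-1]

    def recall(self, query: str) -> List[str]:
        # Stand-in for memory retrieval: keep memories sharing a word with the query.
        words = set(query.lower().split())
        return [m for m in self.memory if words & set(m.lower().split())]

    def encode(self, source: str, texts: List[str]) -> List[Tuple[str, str]]:
        # Stand-in for a per-source encoder: tag each text with where it came from.
        return [(source, text) for text in texts]

    def build_context(self, history: List[str]) -> List[Tuple[str, str]]:
        # Combine the separately "encoded" sources into one context for decoding.
        query = self.make_query(history)
        context = self.encode("history", history)
        context += self.encode("search", self.search(query))
        context += self.encode("memory", self.recall(query))
        return context

    def remember(self, user_turn: str) -> None:
        # Stand-in for the memory-writing decoder: store the raw turn.
        self.memory.append(user_turn)
```

In this sketch, a caller would invoke remember() on earlier turns and then pass the current dialog history to build_context(); in the real model, the combined context would be consumed by a trained decoder rather than returned as a list of tagged strings.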

BlenderBot 2.0 was trained with the same blended skill task as version 1.0, plus two additional tasks: Wizard of the Internet, which trains the model to generate search queries based on conversational context, and Multi-Session Chat, which trains it both to identify what knowledge to store in memory and to generate responses based on memory. Because BlenderBot 1.0 already outperformed other existing chatbots, the FAIR team compared the performance of version 2.0 to version 1.0 by having human evaluators score the interactions of the bots. Besides outperforming 1.0 on remembering previous conversations, the new model also reduced hallucinations from 9.1% to 3.0% and is "factually consistent" 12% more often.
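To put the hallucination figures in perspective, the short Python sketch below computes the absolute and relative reduction implied by the two reported rates. The helper name relative_reduction is hypothetical, and the calculation assumes the 9.1% and 3.0% figures are rates measured on comparable sets of evaluated responses.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a rate dropped, relative to its starting value."""
    return (before - after) / before


before, after = 9.1, 3.0  # hallucination rates (%) reported for 1.0 and 2.0

# Absolute drop in percentage points, and the drop relative to the 1.0 rate.
print(f"absolute: {before - after:.1f} points")
print(f"relative: {relative_reduction(before, after):.0%}")
```

Under that assumption, the drop is 6.1 percentage points, or roughly two thirds of the version 1.0 rate.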

Augmenting neural NLP models with additional knowledge is an active research area. In 2020, a team from Tsinghua worked with researchers in Canada to produce KEPLER, which was trained on the text content of Wikipedia combined with the structured Wikidata knowledge base. More recently, a team at MIT combined a GPT-3 deep-learning model with a symbolic world state model to improve the coherence of GPT-3's text generation, and Baidu incorporated a knowledge graph into NLP training to create ERNIE 3.0, resulting in a new high score on the SuperGLUE language understanding benchmark.

BlenderBot 2.0 source code, models, and training datasets are available as part of FAIR's ParlAI open-source chatbot framework.
