Blender, Facebook State-of-the-Art Human-Like Chatbot, Now Open Source

Blender is an open-domain chatbot developed at Facebook AI Research (FAIR), Facebook’s AI and machine learning division. According to FAIR, it is the first chatbot that has learned to blend several conversation skills, including the ability to show empathy and discuss nearly any topic, beating Google's chatbot in tests with human evaluators.

Some of the best current systems have made progress by training high-capacity neural models with millions or billions of parameters using huge text corpora sourced from the web. Our new recipe incorporates not just large-scale neural models, with up to 9.4 billion parameters — or 3.6x more than the largest existing system — but also equally important techniques for blending skills and detailed generation.

Blender was trained on 1.5 billion examples drawn from previously available public-domain conversations. The resulting neural network was too large to fit on a single device, so Facebook engineers split it into smaller pieces, which also allows it to scale to even larger datasets.
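The article does not say how the network was split. One common way to spread a layer that is too large for one device is to partition its weight matrix by columns, let each device compute its own slice of the output, and then concatenate the slices. The sketch below illustrates that idea in plain Python under that assumption; the helper names (`split_columns`, `sharded_matmul`) are hypothetical, and the "devices" are simulated by a simple loop.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, w: Matrix) -> Matrix:
    """Plain matrix product x @ w."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w: Matrix, num_shards: int) -> List[Matrix]:
    """Split w into num_shards column blocks of near-equal width."""
    n_cols = len(w[0])
    base, extra = divmod(n_cols, num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        width = base + (1 if i < extra else 0)
        shards.append([row[start:start + width] for row in w])
        start += width
    return shards


def sharded_matmul(x: Matrix, w: Matrix, num_shards: int) -> Matrix:
    """Compute x @ w shard by shard, then stitch the partial outputs together."""
    # In a real system each shard would live on its own device (assumption);
    # here the shards are processed one after another.
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenate the partial outputs column-wise, row by row.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2], [0, 1, 3]]
print(sharded_matmul(x, w, num_shards=2))  # [[1, 2, 8], [3, 4, 18]]
```

The sharded result matches the unsharded product, which is the property that makes this kind of split safe: no single shard needs to hold the full weight matrix.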

Skill blending is, as mentioned, one of Blender's key features.

Rather than being specialized in one single quality, a good open-domain conversational agent should be able to seamlessly blend them all into one cohesive conversational flow.

Based on a new dataset called Blended Skill Talk (BST), Blender's skill blending capabilities include displaying a consistent personality to improve naturalness of dialogues, using knowledge to conduct discussions on an open range of topics, and displaying empathy.

Blending these skills is a difficult challenge because systems must be able to switch between different tasks when appropriate, like adjusting tone if a person changes from joking to serious.

Another distinctive trait of Blender is its approach to generation strategies. Generation strategies are used to ensure chatbots do not repeat themselves, give responses that are too long or too shallow, or display other shortcomings. Facebook engineers opted for a careful choice of search hyperparameters over sampling to strike a balance between lively and dull conversation.
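The article does not list the search hyperparameters involved. As an illustration of how a single one can change response length, the sketch below runs beam search over a tiny hand-written next-token table and adds a minimum-length constraint that forbids ending the reply too early. The table `TOY_MODEL`, the function `beam_search`, and the parameter names are all hypothetical; this is not Blender's decoder.

```python
import math
from typing import Dict, List, Tuple

BOS, EOS = "<bos>", "<eos>"

# Hypothetical toy "language model": next-token probabilities given the last token.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    BOS: {"ok": 0.6, "i": 0.4},
    "ok": {EOS: 0.9, "i": 0.1},
    "i": {"like": 1.0},
    "like": {"tea": 0.6, "jazz": 0.4},
    "tea": {EOS: 1.0},
    "jazz": {EOS: 1.0},
}


def beam_search(model: Dict[str, Dict[str, float]], beam_size: int = 2,
                min_length: int = 0, max_length: int = 6) -> List[str]:
    """Return the highest-scoring reply, refusing to end before min_length tokens."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [BOS])]
    finished: List[Tuple[float, List[str]]] = []
    for _ in range(max_length):
        candidates = []
        for score, tokens in beams:
            for tok, prob in model[tokens[-1]].items():
                # Minimum-length constraint: block the end token on short replies.
                if tok == EOS and len(tokens) - 1 < min_length:
                    continue
                candidates.append((score + math.log(prob), tokens + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for cand in candidates[:beam_size]:
            (finished if cand[1][-1] == EOS else beams).append(cand)
        if not beams:
            break
    pool = finished or beams
    if not pool:
        return []
    best_tokens = max(pool, key=lambda c: c[0])[1]
    return [t for t in best_tokens if t not in (BOS, EOS)]


print(beam_search(TOY_MODEL, min_length=0))  # ['ok']
print(beam_search(TOY_MODEL, min_length=3))  # ['i', 'like', 'tea']
```

Without the constraint, the search settles on the short, safe reply; with it, the same model produces a longer and more specific one. No sampling is involved, so the output is deterministic for a given setting.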

Facebook engineers evaluated Blender by letting human evaluators compare its performance to Google's latest Meena chatbot. The test was conducted comparing Blender's and Meena's chat logs.

When presented with chats showing Meena in action and chats showing Blender in action, 67 percent of the evaluators said that our model sounds more human, and 75 percent said that they would rather have a long conversation with Blender than with Meena.

According to Facebook, Blender's edge over Meena can be explained by Blender's skill blending and generation strategies. Strikingly, human evaluators preferred a conversation with Blender over a conversation with humans 49% of the time, while this figure decreases to 36% when using models unable to blend skills.

The evolution of human-like chatbots does not end with Blender, which still displays a number of shortcomings, like contradicting or repeating itself, or "hallucinating" knowledge, i.e., making up facts.

We’re currently exploring ways to further improve the conversational quality of our models in longer conversations with new architectures and different loss functions. We’re also focused on building stronger classifiers to filter out harmful language in dialogues. And we’ve seen preliminary success in studies to help mitigate gender bias in chatbots.

Major areas of research for future development are mitigating gender bias, filtering out harmful language, and others. Facebook hopes that Blender can help the AI research community to further advance the state of the art of conversational chatbots.
