Smart Replies for Member Messages at LinkedIn

by Andrew Morgan on Oct 30, 2017

LinkedIn has launched a new natural language processing (NLP) recommendation engine that suggests smart replies to members' messages. The engineering team has documented the models and the infrastructure behind it in detail in a recent blog post.

Whilst a traditional approach to generating replies for messages would have been a sequence-to-sequence model (where replies are calculated word by word), LinkedIn's approach is to choose a reply from a finite inventory. Their engineers explain that this leads to treating the problem as multinomial classification as opposed to text generation, leading to the following advantages:

  • Simpler and easier to train
  • Faster to compute at serving time, which is key in their use case of needing suggestions immediately
  • Less risk of an inappropriate reply
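
To make the classification framing concrete, here is a minimal sketch in Python. It assumes a trained model has already produced one raw score per candidate reply; the inventory, the scores, and the function names are all made up for illustration and are not taken from LinkedIn's system.

```python
import math

# Hypothetical reply inventory; the article does not list the real one.
REPLY_INVENTORY = ["Yes", "No", "Thanks!", "Sounds good"]


def softmax(scores):
    """Turn raw per-class scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_replies(scores, top_k=3):
    """Return the top_k (reply, probability) pairs, most likely first."""
    probs = softmax(scores)
    ranked = sorted(zip(REPLY_INVENTORY, probs),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Made-up scores standing in for the output of a trained classifier.
example_scores = [2.0, 0.1, 1.2, 0.5]
suggestions = rank_replies(example_scores)
```

Because every suggestion comes from a fixed list, the model can never produce text outside the inventory, which is where the reduced risk of an inappropriate reply comes from.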

To create the set of candidate replies, LinkedIn first anonymized a set of conversations, replacing personal details with placeholders. For example, the recipient's name in a personalized message becomes “RECIPIENT_FIRST_NAME”. They also put the messages through a standardization process which treats messages with the same meaning (such as “Yup; ok!!!” and “Yes, ok!”) as equivalent, allowing them to be grouped based on that meaning.
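
The two preprocessing steps can be sketched roughly as follows. This is an illustration only: the article does not describe LinkedIn's actual rules, so the regular expressions and the synonym table below are assumptions.

```python
import re


def anonymize(message, recipient_first_name):
    """Replace the recipient's first name with a placeholder token."""
    pattern = r"\b" + re.escape(recipient_first_name) + r"\b"
    return re.sub(pattern, "RECIPIENT_FIRST_NAME", message,
                  flags=re.IGNORECASE)


# Hypothetical synonym table; the real standardization rules are not public.
CANONICAL_WORDS = {"yup": "yes", "yeah": "yes", "yep": "yes"}


def standardize(message):
    """Lowercase, drop punctuation, and map synonyms to one canonical word."""
    words = re.findall(r"[a-z_']+", message.lower())
    return " ".join(CANONICAL_WORDS.get(word, word) for word in words)
```

With these rules, `standardize("Yup; ok!!!")` and `standardize("Yes, ok!")` both produce the same string, so the two messages end up in the same group.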

In order to build their multinomial classification model, LinkedIn uses their own machine learning framework, Dagli. It uses a Java API to represent a machine learning pipeline as a directed acyclic graph, and is likely to be open sourced in the future.
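
The idea of expressing a pipeline as a directed acyclic graph can be illustrated in a few lines. The sketch below is written in Python rather than Java and does not use or imitate Dagli's API; `run_pipeline` and the step names are hypothetical. Each step declares which earlier results it depends on, and the steps are executed in topological order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(steps, dependencies, inputs):
    """Run each step after its dependencies and return all results.

    steps:        name -> function taking its dependencies' results as args
    dependencies: name -> list of names the step depends on
    inputs:       name -> value for nodes supplied by the caller
    """
    results = dict(inputs)
    graph = {name: dependencies.get(name, []) for name in steps}
    for name in TopologicalSorter(graph).static_order():
        if name in results:
            continue  # value was supplied by the caller
        args = [results[dep] for dep in dependencies.get(name, [])]
        results[name] = steps[name](*args)
    return results


# Toy pipeline: text -> tokens -> length, then both feed "features".
steps = {
    "tokens": lambda text: text.lower().split(),
    "length": lambda tokens: len(tokens),
    "features": lambda tokens, length: {"tokens": tokens, "length": length},
}
dependencies = {
    "tokens": ["text"],
    "length": ["tokens"],
    "features": ["tokens", "length"],
}
results = run_pipeline(steps, dependencies, {"text": "Sounds good thanks"})
```

Declaring the dependencies explicitly means a node such as `tokens` is computed once and shared by every step that needs it.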

One of the requirements of smart replies is that only a single way of saying something is suggested. For example, “yes”, “yup” and “yeah” all mean “yes”, so there would be no point in suggesting all three. The engineers have solved this problem by returning only a single message from each semantic group. For example, all of the “yes”-like responses belong to the affirmative group, so only one of them would ever be suggested at once.
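
A minimal sketch of that filtering step, assuming the candidate replies arrive already ranked from most to least likely. The group table and the function name are hypothetical.

```python
# Hypothetical mapping from reply text to semantic group.
SEMANTIC_GROUP = {
    "yes": "affirmative",
    "yup": "affirmative",
    "yeah": "affirmative",
    "no": "negative",
    "thanks": "gratitude",
}


def pick_one_per_group(ranked_replies, max_suggestions=3):
    """Keep the best-ranked reply from each semantic group."""
    seen_groups = set()
    picked = []
    for reply in ranked_replies:
        # A reply with no known group is treated as its own group.
        group = SEMANTIC_GROUP.get(reply, reply)
        if group in seen_groups:
            continue
        seen_groups.add(group)
        picked.append(reply)
        if len(picked) == max_suggestions:
            break
    return picked


suggestions = pick_one_per_group(["yes", "yup", "thanks", "yeah", "no"])
```

Here “yup” and “yeah” are skipped because “yes” has already claimed the affirmative group.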

Another advantage of storing messages in semantic groups is evaluation. LinkedIn only needs to compare the semantic group of the predicted reply with the group of the reply that was actually sent to see how accurate the model is, focusing on the meaning rather than the specific text.
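
Group-level evaluation could look something like the sketch below. The metric shown is plain accuracy over groups, which is an assumption: the article does not say which metric LinkedIn reports.

```python
def group_accuracy(predicted_replies, actual_replies, group_of):
    """Fraction of predictions whose semantic group matches the actual reply's."""
    if len(predicted_replies) != len(actual_replies):
        raise ValueError("predicted and actual lists must have the same length")
    if not predicted_replies:
        return 0.0
    matches = sum(
        # A reply with no known group is treated as its own group.
        group_of.get(predicted, predicted) == group_of.get(actual, actual)
        for predicted, actual in zip(predicted_replies, actual_replies)
    )
    return matches / len(predicted_replies)


# Hypothetical group table and data.
group_of = {"yes": "affirmative", "yup": "affirmative", "no": "negative"}
accuracy = group_accuracy(["yes", "no", "yes"], ["yup", "no", "no"], group_of)
```

Predicting “yes” when the member actually wrote “yup” counts as correct, because both replies fall in the same group.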

LinkedIn also points out that, due to the number of members sending messages within their system, there are huge scalability challenges in producing smart replies as rapidly as possible. The team has got around this by computing the replies in advance (when a message is sent) and storing them in Espresso, their in-house NoSQL database. This avoids any expensive, on-the-fly computation, and allows smart replies to be served up more or less instantaneously.
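
The precompute-then-look-up pattern can be sketched as follows. A plain dictionary stands in for the database, and the class, its methods, and `toy_predict` are hypothetical; none of this reflects Espresso's actual API.

```python
class SmartReplyCache:
    """Compute suggestions when a message is sent; only look them up on read."""

    def __init__(self, predict):
        self._predict = predict  # the (expensive) model call
        self._store = {}         # in-memory stand-in for the key-value store

    def on_message_sent(self, message_id, text):
        """Run the model once, at send time, and persist the result."""
        self._store[message_id] = self._predict(text)

    def get_suggestions(self, message_id):
        """Serve precomputed suggestions without calling the model."""
        return self._store.get(message_id, [])


model_calls = []


def toy_predict(text):
    """Made-up stand-in for the real classifier; records each call."""
    model_calls.append(text)
    return ["Thanks!"] if "congrat" in text.lower() else ["Yes", "No"]


cache = SmartReplyCache(toy_predict)
cache.on_message_sent("msg-1", "Congratulations on the new job!")
first_read = cache.get_suggestions("msg-1")
second_read = cache.get_suggestions("msg-1")
```

However many times the recipient opens the conversation, the model runs only once per message, at the cost of storing suggestions that may never be viewed.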

LinkedIn also has mechanisms in place to try to make sure their members' messages remain private. First, because messages are anonymized, anything personal to the member should be gone before a message is used as training data. Second, there is an opt-out feature: if a member opts out, their message data is not used by the system at all.

The full architecture is documented online.


 
