How YouTube's Recommendation Algorithm Works

by Alex Giamas on Sep 23, 2016

In a recent paper published by Google, YouTube engineers describe in greater detail the inner workings of YouTube’s recommendation algorithm. The paper was presented at the 10th ACM Conference on Recommender Systems last week in Boston.

YouTube recommendations are driven by Google Brain, which was recently open sourced as TensorFlow. TensorFlow makes it possible to experiment with different deep neural network architectures using distributed training. The system consists of two neural networks. The first one, candidate generation, takes the user’s watch history as input and uses collaborative filtering to select a few hundred candidate videos. An important distinction between development and final deployment to production is that during development Google uses offline metrics to evaluate the algorithms, but the final decision comes from live A/B testing between the best performing ones.
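The two-stage funnel can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the embeddings are hand-made toy values rather than learned ones, the user vector is a plain average of watched-video embeddings, candidates are picked by dot product, and the names `VIDEO_EMBEDDINGS`, `generate_candidates` and `rank` are hypothetical. It is not YouTube's implementation.

```python
from typing import Callable, Dict, List

# Toy, hand-made embeddings (assumption); a real system would learn these.
VIDEO_EMBEDDINGS: Dict[str, List[float]] = {
    "cats_1": [1.0, 0.0],
    "cats_2": [0.9, 0.1],
    "cooking_1": [0.0, 1.0],
    "cooking_2": [0.1, 0.9],
    "news_1": [0.5, 0.5],
}


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def user_vector(watch_history: List[str]) -> List[float]:
    """Average the embeddings of the watched videos into one user vector."""
    dims = len(next(iter(VIDEO_EMBEDDINGS.values())))
    total = [0.0] * dims
    for video in watch_history:
        for i, value in enumerate(VIDEO_EMBEDDINGS[video]):
            total[i] += value
    return [t / len(watch_history) for t in total]


def generate_candidates(watch_history: List[str], n: int) -> List[str]:
    """Stage 1: keep the n unwatched videos closest to the user vector."""
    user = user_vector(watch_history)
    watched = set(watch_history)
    unwatched = [v for v in VIDEO_EMBEDDINGS if v not in watched]
    unwatched.sort(key=lambda v: dot(user, VIDEO_EMBEDDINGS[v]), reverse=True)
    return unwatched[:n]


def rank(candidates: List[str], score: Callable[[str], float]) -> List[str]:
    """Stage 2: order the short candidate list by a richer per-video score."""
    return sorted(candidates, key=score, reverse=True)


# Hypothetical expected watch times (seconds) standing in for the ranking model.
expected_watch = {"cats_2": 30.0, "news_1": 120.0}
candidates = generate_candidates(["cats_1"], n=2)
ranked = rank(candidates, lambda v: expected_watch[v])
```

The point of the split is cost: the first stage is cheap enough to run over the whole corpus, while the second stage can afford a more expensive score because it only sees the short candidate list.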

Candidate generation uses the implicit feedback of video watches by users to train the model. Explicit feedback, such as a thumbs up or a thumbs down on a video, is in general rare compared to implicit feedback, and this is an even bigger issue with long-tail videos that are not popular. To keep the model from being biased against newly uploaded videos, the age of each training example is fed in as a feature. Another key aspect for discovering and surfacing new content is to train the algorithm on all YouTube videos watched, even on partner sites. This way collaborative filtering can pick up viral videos right away. Finally, by adding more features, such as searches and video age in addition to the actual watches, and more depth to the network, YouTube was able to improve offline holdout precision.
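As a rough sketch of the example-age idea, the snippet below appends the age of a logged watch to its feature vector at training time. The function names, the choice of days as the unit, and the feature layout are all assumptions made for illustration. Setting the feature to zero at serving time follows the approach described in the paper, so that predictions are made as if at the very end of the training window.

```python
from typing import List

SECONDS_PER_DAY = 86400


def training_features(
    watch_timestamp: float, training_end: float, base_features: List[float]
) -> List[float]:
    """Append the example age in days (time between the watch and the end
    of the training window) to the other features of a logged watch."""
    age_days = (training_end - watch_timestamp) / SECONDS_PER_DAY
    return base_features + [age_days]


def serving_features(base_features: List[float]) -> List[float]:
    """At serving time the age feature is fixed to zero."""
    return base_features + [0.0]


# A watch logged two days before the end of the training window.
train_row = training_features(0.0, 2 * SECONDS_PER_DAY, [1.0, 0.5])
serve_row = serving_features([1.0, 0.5])
```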

The second neural network ranks the few hundred candidate videos. This is a much simpler problem than candidate generation, as the number of videos is smaller and more information is available about each video and its relationship with the user. This system uses logistic regression to score each video, and A/B testing is then used continuously for further improvement. The metric used here is expected watch time, as optimizing for expected clicks can promote clickbait. To train on watch time rather than click-through rate, the system uses a weighted variation of logistic regression, with watch time as the weight for positive interactions and a unit weight for negative ones. This works partly because the fraction of positive impressions is small compared to the total, so the odds learned by the model are close to the expected watch time.
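The weighting scheme can be tried out on a toy problem. The sketch below is a plain gradient-descent logistic regression written for illustration, not the production ranking network. The function names, learning rate, step count and data are all assumptions. Clicked impressions are weighted by their watch time in seconds, and unclicked impressions get a weight of one.

```python
import math
from typing import List, Tuple

Example = Tuple[List[float], float]  # (features, watch seconds; 0 = not clicked)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_weighted_logreg(
    examples: List[Example], lr: float = 1.0, steps: int = 2000
) -> List[float]:
    """Minimise weighted log loss: positives weighted by watch time,
    negatives by 1."""
    dims = len(examples[0][0])
    theta = [0.0] * dims
    weights = [t if t > 0 else 1.0 for _, t in examples]
    labels = [1.0 if t > 0 else 0.0 for _, t in examples]
    total_weight = sum(weights)
    for _ in range(steps):
        grad = [0.0] * dims
        for (x, _), w, y in zip(examples, weights, labels):
            p = sigmoid(sum(a * b for a, b in zip(theta, x)))
            for i, xi in enumerate(x):
                grad[i] += w * (p - y) * xi
        theta = [t - lr * g / total_weight for t, g in zip(theta, grad)]
    return theta


def predicted_odds(theta: List[float], x: List[float]) -> float:
    """exp(theta . x) is the odds p / (1 - p) of the logistic model."""
    return math.exp(sum(a * b for a, b in zip(theta, x)))


# Ten impressions of one video: two watched (30 s and 90 s), eight ignored.
# A single constant feature keeps the optimum easy to check by hand.
data: List[Example] = [([1.0], 30.0), ([1.0], 90.0)] + [([1.0], 0.0)] * 8
theta = train_weighted_logreg(data)
odds = predicted_odds(theta, [1.0])
```

In this toy case the learned odds converge to the total watch time divided by the number of unclicked impressions, 120 / 8 = 15, while the average watch time per impression is 120 / 10 = 12. The two values move closer together as the fraction of clicked impressions shrinks, which is why the approximation relies on positive impressions being rare.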

YouTube’s recommendation system is one of the most sophisticated and heavily used recommendation systems in industry. The paper just scratches the surface but nonetheless gives several useful insights regarding engineering deep learning systems.
