Real-Time Notifications at Twitter

Saurabh Pathak, engineering manager at Twitter, discussed the site's notification architecture at QCon London 2017. This included highlighting key challenges which are unique to Twitter, such as the bimodal nature of the social network, dealing with spikes, and the requirement to serve notifications in real-time.

Pathak explained that unlike a typical social network, Twitter is asymmetric. Some users can have millions of followers, where others can have less than a hundred. This makes notifications bimodal in nature, and also creates challenges in being able to serve them in real-time. For example, a popular celebrity Tweeting is going to generate a far greater load than a typical user.

Pathak explains that it’s these different types of users, combined with stringent performance requirements, that present the following architectural challenges:

Latency: People must receive notifications as soon as possible, as the aim of Twitter is to inform users about what is happening in real-time.
Fanout: A single tweet from somebody with millions of followers could trigger millions of notifications. The system must be able to deal with these large spikes.
Heterogeneous calls: Some internal calls, such as calls to caches, may only take a few milliseconds. However, calls to external services such as Google may take in excess of a half a second. How these types of calls are coupled must be considered when scaling.
Multi-datacentre: Twitter must be as resilient as possible, with users still receiving notifications even in the event of failover.

Before addressing these challenges, notifications are first divided into either a push or pull based model. The notification timeline that people see when visiting their Twitter feed adopts pull, whereas SMS and email are push.

For the pull based model, Pathak explained that notifications are usually be served from a cache, as timeline generation is an expensive operation. For this, Twitter makes use of Manhattan, Twitter's real-time distributed backing store, and a fork of Redis. By using caching, latency is kept to a minimum, providing a better user experience. The notifications timeline is also asynchronously replicated across data centres, the goal being that the user will see the same timeline even in the event of failover.

With push notifications, in order to deal with latency and spikes, Pathak explains that whilst they leverage horizontal scaling, they also make use of short lived caches. This helps in situations where there are millions of events for the same user.

In order to deal with heterogeneous calls, Twitter makes use of priority queues to stop important calls being blocked; a celebrity tweeting would never block somebody's login call. Also, different types of calls are profiled and grouped together, meaning if an external dependency is down then failures tend to be isolated.

Finally, Pathak concludes some key takeaways of the architecture to be:

Focus on as many asynchronous operations as possible, as they scale easier than synchronous operations.
When trading off between read and write time, consider writing only data which is unlikely to become stale, such as ID’s.
From the ground up, make sure the application will be able to support multiple data centres.

The full talk can be watched online, and is also followed by a talk on personalised notifications from Gary Lam, staff engineer at Twitter

InfoQ Software Architects' Newsletter

Follow us on

Rate this Article

This content is in the Performance & Scalability topic

Related Topics:

Related Editorial

Related Sponsors

Popular across InfoQ

The InfoQ Newsletter