Beyond Polling? Consider PubSub, Push and MOM
At OSCON '08, Evan 'Rabble' Henshaw-Plath and Kellan Elliott-McCrea presented "Beyond REST? Building Data Services with XMPP PubSub". Robert Kaye reported on the presentation:
Kellan talked about FriendFeed, a site that lets its users know when their friends share new items. In this example, Kellan pointed out that FriendFeed polls Flickr 2.9 million times in order to check on updates for 45 thousand users. And of those 45 thousand users, only 6.7 thousand are logged in at any one time. This, of course, is a poor way of checking for changed content. Kellan says: "Polling sucks!"
To solve this problem, it's key to leave standard REST web services behind and find a way to use message passing, which notifies users of changed content directly.
Polling, in this context, means using a RESTful webservice to GET updates for each user. In contrast, PubSub (Publish/Subscribe) is an architectural approach that uses an asynchronous message passing protocol where publishers are decoupled from any subscribers. These characteristics make PubSub a scalable choice for scenarios where update notifications need to be sent to a large number of clients.
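The decoupling described above can be illustrated with a minimal in-process sketch. The `Broker` class, the topic names and the inbox lists are all hypothetical, and delivery here is synchronous for brevity; a real deployment would push messages asynchronously over a protocol such as XMPP PubSub.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Broker:
    """Hypothetical in-process broker; it stands in for a real PubSub service."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Subscribers register interest once instead of polling repeatedly.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, item: str) -> int:
        # The publisher only knows the topic, not who is listening.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(item)
        return len(handlers)


broker = Broker()
inbox_a: List[str] = []
inbox_b: List[str] = []
broker.subscribe("flickr/alice", inbox_a.append)
broker.subscribe("flickr/alice", inbox_b.append)

delivered = broker.publish("flickr/alice", "new photo")  # two subscribers notified
idle = broker.publish("flickr/bob", "new photo")         # nobody subscribed, no work done
```

Work is done only when something is published, and only for topics that have subscribers, which is the property polling lacks.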
- XMPP works over persistent connections
- It is stateful (SSL becomes cheap)
- Designed as an event stream protocol
- Natively federated and asynchronous
- Identity, security and presence are built in.
- Jabber servers are built and deployed to do this stuff.
The presentation generated a lot of discussion. Kirk Wylie suggested that a MOM (Message Oriented Middleware) based system such as AMQP is really what is needed here, while Joshua Schachter (del.icio.us founder) added his voice to the debate by pointing out that a simpler approach based on HTTP Callbacks could be used:
Simply described, instead of polling frequently, a client would send a normal HTTP request with the resource to be subscribed to and an endpoint to deliver updates to:
Presumably the endpoint would then receive RSS item fragments when and only when that resource updated. For security, the exchange should include some kind of token, borrowing from the appropriate protocols. The subscription would lapse after, say, 24 hours, or that could be passed in as a parameter.
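A rough sketch of the bookkeeping such a callback scheme implies might look like the following. The class name, field names, URLs and token value are assumptions made for illustration and are not taken from the proposal; the sketch only computes which endpoints should be notified, where a real server would POST the payload to each callback URL.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the "say, 24 hours" lapse mentioned in the text


@dataclass
class Subscription:
    resource: str       # the feed or resource being watched
    callback_url: str   # endpoint that should receive updates
    token: str          # shared secret echoed back with each delivery
    expires_at: float   # time (in seconds) after which the subscription lapses


class CallbackRegistry:
    """Hypothetical server-side registry of callback subscriptions."""

    def __init__(self) -> None:
        self._subs: List[Subscription] = []

    def subscribe(self, resource: str, callback_url: str, token: str,
                  now: float, ttl: float = DEFAULT_TTL_SECONDS) -> Subscription:
        sub = Subscription(resource, callback_url, token, now + ttl)
        self._subs.append(sub)
        return sub

    def deliveries(self, resource: str, item: str,
                   now: float) -> List[Tuple[str, Dict[str, str]]]:
        # Drop lapsed subscriptions, then list (callback_url, payload) pairs
        # for every live subscription to the updated resource.
        self._subs = [s for s in self._subs if s.expires_at > now]
        return [(s.callback_url, {"token": s.token, "item": item})
                for s in self._subs if s.resource == resource]


registry = CallbackRegistry()
registry.subscribe("http://example.com/feed.rss",
                   "http://client.example.org/updates",
                   "secret-token", now=0.0)

fresh = registry.deliveries("http://example.com/feed.rss", "<item/>", now=3600.0)
stale = registry.deliveries("http://example.com/feed.rss", "<item/>",
                            now=DEFAULT_TTL_SECONDS + 1.0)
```

One hour after subscribing the endpoint is still notified; once the 24 hours have passed the subscription has lapsed and nothing is sent unless the client renews it.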
Rabble provided his thoughts on using Callbacks:
So there are a couple things, the ping back system is something which we had thought about putting a slide in for. Clearly being able to do ping backs / web hooks seems like a good pattern, maybe anti-pattern?
It works out that when you create a large-scale web hooks system, you end up with aggregator / crawler problems. Definitely doable. With XMPP we get that functionality with federation and a nicer interface. We can also potentially add some delegated auth over it.
An approach such as Webhooks is potentially a lot simpler than using Jabber/XMPP; however, Blaine Cook thought that the complexity is warranted:
If we're arguing that this system needs to be usable by 10-line PHP scripts, then a poorly implemented outgoing message queue is par for the course. At a moderate scale (10000 users, 50 contacts each, 1/5 off-site, 2 posts per day), you're looking at 2.3 remote HTTP requests per second, which isn't nothing.
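The 2.3 requests per second figure can be reproduced with a short calculation. It assumes that every post triggers one remote HTTP request per off-site contact, which is one reading of the numbers quoted above; the variable names are illustrative.

```python
users = 10_000
contacts_per_user = 50
off_site_contacts_per_user = contacts_per_user // 5  # "1/5 off-site"
posts_per_user_per_day = 2
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: each post causes one remote request per off-site contact.
remote_requests_per_day = users * off_site_contacts_per_user * posts_per_user_per_day
remote_requests_per_second = remote_requests_per_day / SECONDS_PER_DAY

print(f"{remote_requests_per_day:,} requests/day, "
      f"{remote_requests_per_second:.1f} requests/second")
```

That comes to 200,000 remote requests per day, or roughly 2.3 per second when averaged evenly over the day; real traffic would be burstier than the average suggests.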
While using PubSub for notifications is a good architectural approach, many took issue with the title of the presentation. Dare Obasanjo summed it up best, pointing out that REST is not a Golden Hammer:
[Thus] this isn't a case of REST not scaling as implied by Evan and Kellan's talk. This is a case of using the wrong tool to solve your problem because it happens to work well in a different scenario.
If nothing else, the discussion has highlighted the fact that consideration of the usage patterns of API consumers plays a large part in determining an appropriate design.
No polling, no callbacks, but persistent HTTP connections
The disadvantages of polling are already covered in the original article.
But callbacks are not applicable to a browser environment at all, because the browser would have to accept incoming HTTP requests (unless you put an applet with an embedded HTTP server on your page, which is not impossible but probably not very useful, and would require a signed applet).