Beyond Polling? Consider PubSub, Push and MOM


At OSCON '08, Evan 'Rabble' Henshaw-Plath and Kellan Elliott-McCrea presented "Beyond REST? Building Data Services with XMPP PubSub". Robert Kaye reported on the presentation:

Kellan talked about FriendFeed, a site that lets its users know when their friends share new items. In this example, Kellan pointed out that FriendFeed polls Flickr 2.9 million times in order to check on updates for 45 thousand users. And of those 45 thousand users, only 6.7 thousand are logged in at any one time. This, of course, is a poor way of checking for changed content. Kellan says: "Polling sucks!"

To solve this problem it's key to leave standard REST web services behind and find a way to use message passing, which is a direct way of notifying users of changed content.
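To put the quoted figures in proportion, the short calculation below derives two ratios from them. The report does not say what period the 2.9 million polls cover, so the sketch only compares the numbers with each other; the variable names are illustrative.

```python
# Figures quoted in the report; the time period they cover is not stated.
polls = 2_900_000
tracked_users = 45_000
logged_in_users = 6_700

# Average number of polls issued per tracked user over that period.
polls_per_user = polls / tracked_users

# Share of tracked users who are logged in at any one time.
logged_in_share = logged_in_users / tracked_users

print(f"{polls_per_user:.1f} polls per tracked user")
print(f"{logged_in_share:.1%} of tracked users logged in")
```

In other words, each tracked user is polled for dozens of times, while only about one in seven of them is around to see the result at any given moment.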

Polling, in this context, means using a RESTful web service to GET updates for each user. In contrast, PubSub (Publish/Subscribe) is an architectural approach that uses an asynchronous message-passing protocol in which publishers are decoupled from subscribers. These characteristics make PubSub a scalable choice for scenarios where update notifications need to be sent to a large number of clients.
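The decoupling can be illustrated with a toy, in-process broker. This is only a sketch of the pattern, not of XMPP PubSub or any real service; the `Broker` class, the topic names and the sample items are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Toy in-process broker; a real system would sit behind a network protocol."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        # The subscriber registers interest once instead of asking repeatedly.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, item: str) -> int:
        """Push item to every subscriber of topic and return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(item)
        return len(callbacks)


broker = Broker()
inbox: List[str] = []

# The publisher never learns who is listening; it only names a topic.
broker.subscribe("photos/alice", inbox.append)
delivered = broker.publish("photos/alice", "new photo: sunset.jpg")
ignored = broker.publish("photos/bob", "new photo: beach.jpg")

print(delivered, ignored, inbox)
```

Work is done only when something is published, and a topic with no subscribers costs nothing, which is the contrast with issuing a GET per user on a timer.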

In the presentation, Evan and Kellan described the advantages of Jabber - a PubSub service based on XMPP (Extensible Messaging and Presence Protocol):

  1. XMPP works over persistent connections
  2. It is stateful (SSL becomes cheap)
  3. Designed as an event stream protocol
  4. Natively federated and asynchronous
  5. Identity, security and presence are built in.
  6. Jabber servers are built and deployed to do this stuff.

The presentation generated a lot of discussion. Kirk Wylie suggested that a MOM (Message Oriented Middleware) based system, for example one built on AMQP, is really what is needed here, while Joshua Schachter ( founder) added his voice to the debate by pointing out that a simpler approach based on HTTP Callbacks could be used:

Simply described, instead of polling frequently, a client would send a normal HTTP request with the resource to be subscribed to and an endpoint to deliver updates to:

Presumably the endpoint would then receive RSS item fragments when and only when that resource updated. For security, the exchange should include some kind of token, borrowing from the appropriate protocols. The subscription would lapse after, say, 24 hours, or that could be passed in as a parameter.
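The server side of such a callback scheme could look roughly like the sketch below. It is an assumption-laden illustration, not the proposal's actual wire format: the `SubscriptionRegistry` class, its method names and the example URLs are hypothetical, and the HTTP POST that would deliver each update is left out. It only shows the three ingredients mentioned above: a resource plus endpoint, a token, and a subscription that lapses.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Tuple

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # "say, 24 hours"


@dataclass
class Subscription:
    resource: str       # what the client wants updates for
    callback_url: str   # where updates should be delivered
    token: str          # shared secret sent along with each delivery
    expires_at: float   # time after which the subscription lapses


class SubscriptionRegistry:
    """Hypothetical in-memory store of callback subscriptions."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Subscription]] = {}

    def subscribe(self, resource: str, callback_url: str, now: float,
                  ttl: float = DEFAULT_TTL_SECONDS) -> Subscription:
        sub = Subscription(resource, callback_url, secrets.token_hex(16), now + ttl)
        self._subs.setdefault(resource, []).append(sub)
        return sub

    def pending_deliveries(self, resource: str, now: float) -> List[Tuple[str, str]]:
        """Return (callback_url, token) pairs for subscriptions that have not lapsed."""
        live = [s for s in self._subs.get(resource, []) if s.expires_at > now]
        self._subs[resource] = live  # forget lapsed subscriptions
        return [(s.callback_url, s.token) for s in live]


registry = SubscriptionRegistry()
sub = registry.subscribe("http://example.com/feeds/alice.rss",
                         "http://client.example.net/updates", now=0.0)

within_a_day = registry.pending_deliveries(sub.resource, now=3600.0)
after_a_day = registry.pending_deliveries(sub.resource, now=DEFAULT_TTL_SECONDS + 1.0)

print(len(within_a_day), len(after_a_day))
```

When the resource changes, the publisher would send the update to each pair returned by `pending_deliveries`; once the lifetime has passed, the client has to subscribe again.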

Commenters pointed out that such a system already exists: Webhooks.

Rabble provided his thoughts on using Callbacks:

So there are a couple things, the ping back system is something which we had thought about putting a slide in for. Clearly being able to do ping backs / web hooks seems like a good pattern, maybe anti-pattern?

It works out that creating a big scale web hooks system you end up with aggregator / crawler problems. Definitely doable. With XMPP we get that functionality with federation and a nicer interface. We also can potentially add some delegated auth over it.

An approach such as Webhooks is potentially a lot simpler than using Jabber/XMPP; however, Blaine Cook thought that the complexity is warranted:

If we're arguing that this system needs to be usable by 10-line PHP scripts, then a poorly implemented outgoing message queue is par for the course. At a moderate scale (10000 users, 50 contacts each, 1/5 off-site, 2 posts per day), you're looking at 2.3 remote HTTP requests per second, which isn't nothing.
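The 2.3 figure can be reproduced with a few lines of arithmetic. The sketch assumes one reading of the quote: every post triggers one remote HTTP request per off-site contact, and posts are spread evenly over the day. The variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

users = 10_000
contacts_per_user = 50
off_site_contacts_per_user = contacts_per_user // 5  # "1/5 off-site"
posts_per_user_per_day = 2

# Assumption: each post costs one remote request per off-site contact.
remote_requests_per_day = users * off_site_contacts_per_user * posts_per_user_per_day
remote_requests_per_second = remote_requests_per_day / SECONDS_PER_DAY

print(f"{remote_requests_per_day:,} requests/day")
print(f"{remote_requests_per_second:.1f} requests/second")
```

That works out to 200,000 remote requests a day, or about 2.3 per second on average, before allowing for any bursts or retries.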

While using PubSub for notifications is a good architectural approach, many took issue with the title of the presentation. Dare Obasanjo summed it up best, pointing out that REST is not a Golden Hammer:

[Thus] this isn't a case of REST not scaling as implied by Evan and Kellan's talk. This is a case of using the wrong tool to solve your problem because it happens to work well in a different scenario.

If nothing else, the discussion has highlighted the fact that consideration of the usage patterns of API consumers plays a large part in determining an appropriate design.

Community comments

  • No polling, no callbacks, but persistent HTTP connections

    by Mileta Cekovic,

    Neither polling nor callbacks are good patterns for updating web pages in a browser.
    The disadvantages of polling are already presented in the original article.
    But callbacks are not applicable to the browser environment at all, as that would mean the browser has to accept HTTP requests (unless you put an applet with an embedded HTTP server on your page - not impossible but probably not very useful, and you need a signed applet).

    The answer is in persistent HTTP connections, where JavaScript chunks are written to the HTTP response and flushed from the server. Upon arrival at the browser, the JavaScript chunks are executed, updating the DOM tree of the page. This is a 'mumbo-jumbo' technique that stinks, but it is the only scalable option for near real-time content in web pages. The drawback is that you can have only about 60000 concurrent connections per web server, because of the limit on the number of active persistent connections to a single TCP/IP port.

  • Re: No polling, no callbacks, but persistent HTTP connections

    by Gavin Terrill,

    I think the scenario they were describing was for server->server interaction; however, for browser->server HTTP connections using long-held requests, take a look at Comet.
