InfoQ

Beyond Polling? Consider PubSub, Push and MOM

Posted by Gavin Terrill on Jul 28, 2008

Sections: Operations & Infrastructure, Enterprise Architecture, Development, Architecture & Design
Topics: Messaging, Architecture, Data Access, SOA, Performance & Scalability
Tags: AMQP, XMPP

At OSCON '08, Evan 'Rabble' Henshaw-Plath and Kellan Elliott-McCrea presented "Beyond REST? Building Data Services with XMPP PubSub". Robert Kaye reported on the presentation:

Kellan talked about FriendFeed, a site that lets its users know when their friends share new items. In this example, Kellan pointed out that FriendFeed polls Flickr 2.9 million times in order to check for updates for 45 thousand users. And of those 45 thousand users, only 6.7 thousand are logged in at any one time. This, of course, is a poor way of checking for changed content. Kellan says: "Polling sucks!"

To solve this problem, it's key to leave standard REST web services behind and find a way to use message passing, which notifies users of changed content directly.

Polling, in this context, means using a RESTful web service to GET updates for each user. In contrast, PubSub (Publish/Subscribe) is an architectural approach that uses an asynchronous message passing protocol in which publishers are decoupled from their subscribers. These characteristics make PubSub a scalable choice for scenarios where update notifications need to be sent to a large number of clients.
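To make the contrast concrete, here is a minimal in-process sketch, not anything shown in the presentation. The `Broker` class, the `poll_all` helper, and the user names are hypothetical; a real deployment would sit behind a network protocol such as XMPP rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


class Broker:
    """Toy publish/subscribe broker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, item: str) -> int:
        """Deliver item to every handler on topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(item)
        return len(handlers)


def poll_all(users: Iterable[str],
             fetch_updates: Callable[[str], List[str]]) -> Tuple[int, List[str]]:
    """Polling: one request per user per cycle, whether or not anything changed."""
    requests = 0
    updates: List[str] = []
    for user in users:
        requests += 1
        updates.extend(fetch_updates(user))
    return requests, updates


users = [f"user{i}" for i in range(1000)]
changed = {"user7": ["photo-1"]}  # only one user has anything new

# Polling pays for every user on every cycle.
requests, updates = poll_all(users, lambda u: changed.get(u, []))

# PubSub pays only when something is actually published.
broker = Broker()
received: List[str] = []
for user in users:
    broker.subscribe(user, received.append)
deliveries = broker.publish("user7", "photo-1")

print(requests, deliveries)  # 1000 1
```

In this toy run the poller issues 1000 requests to find a single update, while the broker performs one delivery for the same update.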

In the presentation, Evan and Kellan described the advantages of Jabber - a PubSub service based on XMPP (Extensible Messaging and Presence Protocol):

  1. XMPP works over persistent connections
  2. It is stateful (SSL becomes cheap)
  3. Designed as an event stream protocol
  4. Natively federated and asynchronous
  5. Identity, security and presence are built in
  6. Jabber servers are built and deployed to do this stuff

The presentation generated a lot of discussion. Kirk Wylie suggested that a MOM (Message-Oriented Middleware) system, for example one based on AMQP, is really what is needed here, while Joshua Schachter (del.icio.us founder) added his voice to the debate by pointing out that a simpler approach based on HTTP callbacks could be used:

Simply described, instead of polling frequently, a client would send a normal HTTP request with the resource to be subscribed to and an endpoint to deliver updates to: http://your.app/subscribe?resource=/some/user&callback=http://my.app/endpoint

Presumably the endpoint would then receive RSS item fragments when and only when that resource updated. For security, the exchange should include some kind of token, borrowing from the appropriate protocols. The subscription would lapse after, say, 24 hours, or that could be passed in as a parameter.
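The proposal above could be sketched roughly as follows. This is an illustration under assumptions, not Schachter's design: the `SubscriptionRegistry` class, its method names, and the use of a random hex token are all hypothetical, and the sketch percent-encodes the query parameters, which the example URL in the quote does not.

```python
import secrets
import time
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import parse_qs, urlencode, urlsplit

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the "say, 24 hours" lapse from the proposal


def build_subscribe_url(base: str, resource: str, callback: str) -> str:
    """Client side: build the subscription request URL."""
    return f"{base}?{urlencode({'resource': resource, 'callback': callback})}"


@dataclass
class Subscription:
    resource: str
    callback: str
    token: str         # assumed shared secret, sent along with each delivery
    expires_at: float  # seconds since the epoch


class SubscriptionRegistry:
    """Server side: remember who wants updates for which resource (hypothetical)."""

    def __init__(self) -> None:
        self._subs: List[Subscription] = []

    def subscribe(self, url: str, now: Optional[float] = None,
                  ttl: float = DEFAULT_TTL_SECONDS) -> Subscription:
        now = time.time() if now is None else now
        params = parse_qs(urlsplit(url).query)
        sub = Subscription(
            resource=params["resource"][0],
            callback=params["callback"][0],
            token=secrets.token_hex(16),
            expires_at=now + ttl,
        )
        self._subs.append(sub)
        return sub

    def targets(self, resource: str, now: Optional[float] = None) -> List[Subscription]:
        """Drop lapsed subscriptions, then return the live ones for resource."""
        now = time.time() if now is None else now
        self._subs = [s for s in self._subs if s.expires_at > now]
        return [s for s in self._subs if s.resource == resource]


url = build_subscribe_url("http://your.app/subscribe", "/some/user",
                          "http://my.app/endpoint")
registry = SubscriptionRegistry()
sub = registry.subscribe(url, now=0.0)
live = registry.targets("/some/user", now=3600.0)                       # still subscribed
lapsed = registry.targets("/some/user", now=DEFAULT_TTL_SECONDS + 1.0)  # expired
```

When the resource changes, the publisher would POST the update to each `callback` returned by `targets`, including the token so the receiver can check that the delivery is genuine. The delivery step is omitted here because the quote does not specify it.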

Commenters pointed out that such a system already exists: Webhooks.

Rabble provided his thoughts on using Callbacks:

So there are a couple things, the ping back system is something which we had thought about putting a slide in for. Clearly being able to do ping backs / web hooks seems like a good pattern, maybe anti-pattern?

It works out that creating a big scale web hooks system you end up with aggregator / crawler problems. Definitely doable. With XMPP we get that functionality with federation and a nicer interface. We also can potentially add some delegated auth over it.

An approach such as Webhooks is potentially a lot simpler than using Jabber/XMPP; however, Blaine Cook thought that the complexity is warranted:

If we're arguing that this system needs to be usable by 10-line PHP scripts, then a poorly implemented outgoing message queue is par for the course. At a moderate scale (10000 users, 50 contacts each, 1/5 off-site, 2 posts per day), you're looking at 2.3 remote HTTP requests per second, which isn't nothing.
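The figure in the quote can be reproduced with a few lines of arithmetic. The sketch assumes, as the quote appears to, that every post triggers one remote HTTP request per off-site contact and that load is spread evenly over the day; the variable names are illustrative.

```python
users = 10_000
contacts_per_user = 50
offsite_contacts_per_user = contacts_per_user // 5  # "1/5 off-site" -> 10
posts_per_user_per_day = 2
seconds_per_day = 24 * 60 * 60

# One remote request per off-site contact for every post (assumption).
remote_requests_per_day = users * posts_per_user_per_day * offsite_contacts_per_user
remote_requests_per_second = remote_requests_per_day / seconds_per_day

print(remote_requests_per_day, round(remote_requests_per_second, 1))  # 200000 2.3
```

This is an average; real traffic is bursty, so peak rates would be higher than 2.3 requests per second.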

While using PubSub for notifications is a good architectural approach, many took issue with the title of the presentation. Dare Obasanjo summed it up best, pointing out that REST is not a Golden Hammer:

[Thus] this isn't a case of REST not scaling as implied by Evan and Kellan's talk. This is a case of using the wrong tool to solve your problem because it happens to work well in a different scenario.

If nothing else, the discussion has highlighted the fact that consideration of the usage patterns of API consumers plays a large part in determining an appropriate design.

  • This article is part of a featured topic series on SOA
  1. No polling, no callbacks, but persistent HTTP connections

    by Mileta Cekovic

    Neither polling nor callbacks are good patterns for updating web pages in the browser.
    The disadvantages of polling are already presented in the original article.
    Callbacks, however, are not applicable to the browser environment at all, as that would mean the browser accepting HTTP requests (unless you put an applet with an embedded HTTP server on your page: not impossible, but probably not very useful, and you need a signed applet).

    The answer is persistent HTTP connections, where JavaScript chunks are written to the HTTP response and flushed from the server. Upon arrival at the browser, the JavaScript chunks are executed, updating the DOM tree of the page. This is a 'mumbo-jumbo' technique that stinks, but it is the only scalable option for near real-time content in web pages. The drawback is that you can have only about 60000 concurrent connections per web server, because of the limit on the number of active persistent connections to a single TCP/IP port.
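As a rough sketch of what writing JavaScript chunks to a held-open response might look like on the wire: the comment does not say how the chunks are framed, so this assumes HTTP/1.1 chunked transfer encoding and script fragments wrapped in `<script>` tags. The function names and the `updateFeed` call are hypothetical.

```python
from typing import Iterable, Iterator


def encode_chunk(payload: bytes) -> bytes:
    """Frame one HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    return f"{len(payload):X}".encode("ascii") + b"\r\n" + payload + b"\r\n"


def script_stream(updates: Iterable[str]) -> Iterator[bytes]:
    """Yield one chunk per JavaScript snippet, then the terminating chunk."""
    for js in updates:
        fragment = f"<script>{js}</script>".encode("utf-8")
        yield encode_chunk(fragment)  # a server would flush after each of these
    yield b"0\r\n\r\n"                # zero-length chunk ends the response


chunks = list(script_stream(["updateFeed(1);"]))
print(chunks[0])  # b'1F\r\n<script>updateFeed(1);</script>\r\n'
```

A real server would keep the connection open and emit a chunk whenever new content arrives, rather than iterating over a fixed list.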

  2. Re: No polling, no callbacks, but persistent HTTP connections

    by Gavin Terrill

    I think the scenario they were describing was server-to-server interaction. For browser-to-server HTTP connections using long-held requests, however, take a look at Comet.
