A Comparison of Push vs Pull Ajax
The advent of AJAX has made it possible to develop browser-based web applications with high user interactivity and low user-perceived latency. Real-time dynamic web data such as news headlines, stock tickers, and auction updates need to be propagated to the users as soon as possible. However, AJAX still suffers from the limitations of the web’s request/response architecture, which prevents servers from pushing real-time dynamic web data.
Engin Bozdag, Ali Mesbah and Arie van Deursen of the Delft University of Technology have discussed the following approaches to achieve web-based real-time event notification:
1. HTTP Pull: In this traditional approach, the client checks with the server for the latest data at regular, user-definable intervals. The pulling frequency needs to be high to ensure high data accuracy, but a high pulling frequency may induce redundant checks, leading to high network traffic. A low pulling frequency, on the other hand, may lead to missed updates. Ideally, the pulling interval should match the interval at which the server state changes.
2. HTTP Streaming: This method consists of streaming server data in the response of a long-lived HTTP connection (Page Streaming) or an XMLHttpRequest connection (Service Streaming).
3. Reverse AJAX: Service Streaming, as applied to AJAX, is known as Reverse AJAX or COMET. It enables the server to send a message to the client when an event occurs, without the client having to request it explicitly. The goal is to achieve a real-time update of the state changes. COMET uses the persistent connection feature in HTTP/1.1. With HTTP/1.1, unless specified otherwise, the TCP connection between the server and the browser is kept alive until an explicit ‘close connection’ message is sent by one of the parties, or a timeout/network error occurs.
4. Long Polling: Also known as Asynchronous Polling, this method is a hybrid of pure server push and client pull. It is based on the Bayeux protocol, which follows a topic-based publish-subscribe scheme. After a subscription to a channel, the connection between the client and the server is kept open for a defined amount of time. If no event occurs on the server side, a timeout occurs, and the server asks the client to reconnect asynchronously. If an event occurs, the server sends the data to the client, and the client reconnects.
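The long-polling cycle described in item 4 can be sketched in a few lines. This is a minimal in-process illustration, not the Bayeux protocol or the Cometd library: the names `LongPollChannel` and `long_poll_client` are hypothetical, and a thread-safe queue stands in for the held-open HTTP connection.

```python
import queue
import threading
from typing import List, Optional, Tuple


class LongPollChannel:
    """Toy stand-in for a server-side channel (hypothetical, not Bayeux)."""

    def __init__(self) -> None:
        self._events = queue.Queue()

    def publish(self, data: str) -> None:
        """Server-side event: hand the data to whichever request is waiting."""
        self._events.put(data)

    def wait_for_event(self, timeout: float) -> Optional[str]:
        """Hold the 'request' open until an event arrives or the timeout expires."""
        try:
            return self._events.get(timeout=timeout)
        except queue.Empty:
            return None  # timeout: the client is expected to reconnect


def long_poll_client(
    channel: LongPollChannel, max_requests: int, timeout: float
) -> Tuple[List[str], int]:
    """Issue up to max_requests long polls; return (received events, timeout count)."""
    received: List[str] = []
    timeouts = 0
    for _ in range(max_requests):
        event = channel.wait_for_event(timeout)
        if event is None:
            timeouts += 1           # nothing happened: reconnect and wait again
        else:
            received.append(event)  # data delivered: reconnect for the next event
    return received, timeouts


if __name__ == "__main__":
    channel = LongPollChannel()
    # Simulate a server-side event that fires while the request is being held.
    threading.Timer(0.05, channel.publish, args=("tick",)).start()
    print(long_poll_client(channel, max_requests=1, timeout=2.0))
```

In both branches the client immediately issues a new request, which is what distinguishes long polling from a fixed-interval pull: the waiting happens on the server side rather than in a client-side timer.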
In their experimental study, the authors compared Data Coherence, Server Performance, Network Performance and Data Misses of an AJAX application using a COMET push implementation (Dojo’s Cometd library), as opposed to a pure pull approach.
The authors concluded that:
"...If we want high data coherence and high network performance, we should choose the push approach. However, push brings some scalability issues; the server application CPU usage is 7 times higher as in pull. According to our results, the server starts to saturate at 350-500 users. For larger number of users, load balancing and server clustering techniques are unavoidable.
With the pull approach, achieving total data coherence with high network performance is very difficult. If the pull interval is higher than the publish interval, some data miss will occur. If it is lower, network performance will suffer. Pull performs well only if the pull interval equals to publish interval. However, in order to achieve that, we need to know the exact publish interval beforehand. However, the publish interval is rarely static and predictable. This makes pull useful only in situations where the data is published frequently according to some pattern..."
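The trade-off between the pull interval and the publish interval can be illustrated with a toy discrete-time model. This is only a sketch under stated assumptions and not the authors' experimental setup: the function name `simulate_pull` is hypothetical, the server is assumed to keep only the latest value, and both publishing and polling are assumed to happen at fixed integer intervals.

```python
from typing import Dict


def simulate_pull(publish_interval: int, pull_interval: int, duration: int) -> Dict[str, int]:
    """Count delivered, missed, and redundant events on a discrete timeline.

    Assumptions (illustrative only):
      - the server publishes at t = publish_interval, 2 * publish_interval, ...
      - the client polls at t = pull_interval, 2 * pull_interval, ...
      - the server holds only the latest value, so older unseen values are lost
      - a publish at tick t is visible to a poll at the same tick t
    """
    published = 0   # number of updates published so far
    last_seen = 0   # update number the client saw at its last successful poll
    delivered = 0   # polls that returned a new value
    missed = 0      # updates overwritten before the client saw them
    redundant = 0   # polls that returned nothing new

    for t in range(1, duration + 1):
        if t % publish_interval == 0:
            published += 1
        if t % pull_interval == 0:
            if published == last_seen:
                redundant += 1
            else:
                missed += published - last_seen - 1
                delivered += 1
                last_seen = published

    return {"delivered": delivered, "missed": missed, "redundant": redundant}


if __name__ == "__main__":
    print(simulate_pull(publish_interval=2, pull_interval=2, duration=10))
    print(simulate_pull(publish_interval=1, pull_interval=2, duration=10))
    print(simulate_pull(publish_interval=4, pull_interval=1, duration=12))
```

Under these assumptions, matching intervals give no misses and no redundant polls, a pull interval longer than the publish interval loses updates, and a shorter one wastes requests. Real publish intervals are rarely this regular, which is the point the quoted conclusion makes.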
Some other implementations of the Comet Ajax server-push model are:
- Orbited: An Open Source Distributed Comet Server.
- AjaxMessaging: Comet plugin for Ruby on Rails.
- Lightstreamer: Commercial implementation offering HTTP streaming based on the AJAX-COMET paradigm.
- Pjax: Push technology for Ajax.
"study" or speculation?
Huh? I'd like to hear how one justifies that a pull model can't perform "well" (for any reasonable definition of that term) just because it makes a few more network connections than would strictly be needed to serve all the data. Reading the paper, it looks like the authors just wanted a convenient way to dismiss something they aren't interested in. If that's the case, it would be more straight-forward to say "we aren't much interested in that" than to make up statements that are, at best, applicable only to a tiny subset of cases.