InfoQ

Comet: Sub-Second Latency with 10K+ Concurrent Users

Posted by Alexander Olaru on Jan 29, 2008

Sections: Development, Architecture & Design
Topics: Performance & Scalability, Java, Rich Internet Apps
Tags: Dojo, Lightstreamer, Comet, Jetty

Comet, also known as Reverse AJAX, aims to deliver state changes occurring on the server to the client in real time by leveraging the persistent connection feature of HTTP 1.1. As described in the past on InfoQ.com, there are other "push technologies" besides Comet that try to achieve the same goal.

Greg Wilkins and his team at Webtide, a company formed by the lead developers of the open-source web server Jetty, have run a number of performance tests aimed at gauging Comet's scalability and written about their findings. More specifically, the tests involved running the Dojo Cometd implementation of the Bayeux protocol on Jetty. The server running Cometd and the client machines (between one and three, together generating a load equivalent to up to 20,000 users) were each Large Instances of Amazon EC2 virtual servers. The test results are graphically summarized below:


Following are a few highlights from these tests:

  • Sub-second latency was achievable even with 20,000 users. There is a tradeoff between latency and throughput: with 5,000 users, latency rose from 100ms at 2,000 messages/sec to over 250ms at 3,000 messages/sec.
  • The tested application was a simple chat room with up to 200 users/room. "The load was a 50 byte payload sent in a burst to 10 randomly selected chat rooms at an interval fixed for each test. The interval was selected so that a steady state was obtained with the server CPU at approximately 10% and 50% idle."
  • Greg acknowledged that "1 machine just can’t generate/handle the same load as 20K users each with their own computer and network infrastructure". To partially compensate for this limitation, a subset of the tests (see green circles above) simulated users running on 3 different machines.
  • For the tests with 3 client machines, the latency measurements were taken from the machine that simulated 1,000 users. Although it was not specifically measured, Greg mentioned that the latency seen by the other 2 clients, which handled the rest of the 20K users, would have been at most the latency observed in the test with one client machine.
  • A few modifications to the Cometd demo bundled with Jetty 6.1.7 were needed. Some alleviated lock starvation in the server's thread pool, while others changed the setup steps.
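The load figures in the highlights above can be turned into a rough back-of-the-envelope calculation. The sketch below is only an illustration: the function names are made up for this example, it assumes every room is full (200 users) and that each message published to a room is delivered to every member, and it counts payload bytes only, ignoring HTTP and Bayeux overhead. The source does not say whether the "messages/sec" figures count published or delivered messages, so the numbers are upper bounds on fan-out rather than a reconstruction of the measured throughput.

```python
def rooms_needed(users: int, users_per_room: int) -> int:
    """Number of chat rooms required to hold all users (ceiling division)."""
    return -(-users // users_per_room)


def burst_fanout(rooms_per_burst: int, users_per_room: int, payload_bytes: int):
    """Upper bound on deliveries and payload bytes caused by one burst.

    Assumes full rooms and one delivery per room member (an assumption,
    not something stated in the test write-up).
    """
    deliveries = rooms_per_burst * users_per_room
    return deliveries, deliveries * payload_bytes


if __name__ == "__main__":
    print(rooms_needed(20_000, 200))   # 100 rooms for 20,000 users
    print(burst_fanout(10, 200, 50))   # (2000, 100000)
```

Under the same reading, the 5,000-user data point shows the shape of the tradeoff: throughput grows by a factor of 1.5 (2,000 to 3,000 messages/sec) while latency grows by a factor of more than 2.5 (100ms to over 250ms).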

As mentioned in a comment and in one of Greg's prior posts, Jetty can flush messages to clients asynchronously, and so requires fewer resources to service the same number of users. The thread pool code changes applied for these tests are available for download, and Greg told InfoQ that they will be part of the next Jetty release. He also added that Webtide is running similar tests through load balancers and will make more results available soon.

Another interesting approach to address Comet scalability is that taken by Lightstreamer. Its implementation is based on a stand-alone server which does not rely on an underlying application or web server. Some web/application servers, extended to act like streaming engines, are based on a "one-thread-per-connection model". In comparison, Lightstreamer decouples the number of connections that the server can sustain from the number of threads that are employed, thus allowing it to scale to a very large number of clients.
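The difference between a one-thread-per-connection model and a design where connections are decoupled from threads can be illustrated with a small sketch. This is not Lightstreamer's or Jetty's implementation; it uses Python's standard `asyncio` library as a stand-in, and the names `held_connection` and `serve` are hypothetical. Each simulated connection is parked until the server has an update, and the sketch records which operating-system threads actually serviced the connections.

```python
import asyncio
import threading


async def held_connection(conn_id, update, seen_threads):
    """Simulate one client whose request stays open until an update exists."""
    await update.wait()                        # parked without holding a thread
    seen_threads.add(threading.get_ident())    # record the servicing thread
    return conn_id


async def serve(n_connections):
    """Hold n simulated connections open, then push one update to all."""
    update = asyncio.Event()
    seen_threads = set()
    tasks = [
        asyncio.create_task(held_connection(i, update, seen_threads))
        for i in range(n_connections)
    ]
    await asyncio.sleep(0)   # give the tasks a chance to start and park
    update.set()             # the "server-side state change"
    delivered = await asyncio.gather(*tasks)
    return len(delivered), len(seen_threads)


if __name__ == "__main__":
    delivered, threads_used = asyncio.run(serve(10_000))
    print(delivered, threads_used)
```

In this sketch all simulated connections are serviced by the single event-loop thread, whereas a one-thread-per-connection design would need one thread for each open connection. Real servers face additional costs (sockets, buffers, serialization) that the sketch does not model.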

In a conversation with InfoQ, Alessandro Alinone, Lightstreamer's CTO, shared that they have customers in the financial industry that achieve in production "an average of 10,000 concurrent users with an average update frequency of 3-5 updates per second per user." He added that "Lightstreamer is also employed as the core engine within TIBCO Ajax Message Service, through an OEM agreement. Therefore, interesting production scenarios are progressively arising on the TIBCO front too."

Along with the Server, Lightstreamer's back-end architecture includes:

  • A Data Adapter - a plugin module that interfaces Lightstreamer with the data source to be integrated. It can use any technology to integrate with the source, but an asynchronous data feed (e.g. JMS, TIB/RV, MQ) avoids breaking the asynchronous chain that runs all the way to the client.
  • A Metadata Adapter - a plugin module that provides the Lightstreamer Server with the metadata of the push scenarios.
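The point about keeping the chain asynchronous can be sketched as follows. This is a hypothetical illustration only: the class `HypotheticalDataAdapter`, its methods, and the `stock1` item are invented for this example and are not the Lightstreamer adapter API. An `asyncio.Queue` stands in for an asynchronous feed such as JMS, and the adapter forwards each update to its listeners as it arrives instead of polling the source.

```python
import asyncio


class HypotheticalDataAdapter:
    """Illustrative adapter; NOT the real Lightstreamer interface."""

    def __init__(self, feed):
        self._feed = feed          # stand-in for an asynchronous data feed
        self._listeners = []

    def subscribe(self, listener):
        """Register a callback that receives every update."""
        self._listeners.append(listener)

    async def run(self):
        """Forward updates as they arrive; None signals the end of the feed."""
        while True:
            update = await self._feed.get()   # waits without polling
            if update is None:
                break
            for listener in self._listeners:
                listener(update)


async def demo():
    feed = asyncio.Queue()
    received = []
    adapter = HypotheticalDataAdapter(feed)
    adapter.subscribe(received.append)
    runner = asyncio.create_task(adapter.run())
    for price in (101.5, 101.7):
        await feed.put({"item": "stock1", "price": price})
    await feed.put(None)                      # end-of-feed marker
    await runner
    return received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A polling adapter would instead have to query the source on a timer, which adds delay between the moment the data changes and the moment it is pushed to the client.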

On the client side, the browser gets the static web pages from the web server as usual, but it receives the real-time updates pushed from the Lightstreamer Server. These updates can be consumed by a set of Lightstreamer JavaScript libraries, which are compatible with most browsers and coexist and integrate with most third-party AJAX frameworks and toolkits. Lightstreamer can also push real-time updates to Flash/Flex applications or to desktop applications developed in Java or .NET.

  1. Complete Computation Conglomerate (CCC)

    by David Pratten

    Thanks for this informative post. The potential of HTTP to form the backbone of a heterogeneous and integrated computing resource is powerfully demonstrated by this development.

    The current situation is that while the WWW allows a programmer to be agnostic about the location, technology and network path to an information resource, the programmer can't be agnostic about where the computations involved will be done. The programmer’s choice of technology (framework, language, etc.) carries with it an implicit choice about the locus of computation (server or client).

    I would be interested in your feedback on my rough sketch of how HTTP can be extended so that programmers can work with a unified programming model and delay decisions about where computation is done until run-time, based on issues like the client's available computing power, intellectual property and security.

    www.davidpratten.com/2008/01/07/request-based-d...

    David

  2. Comet and static content

    by Jurgen Huls

    I think the approach of pushing updates from the same server that serves static content (i.e. the main application server) is flawed. The two aims are quite different, and the load of pushing thousands of updates to clients takes its toll on page-load times and hurts the user experience. It is much better to move the fast, low-latency Comet work to a separate server. This is the approach taken by many products, including Lightstreamer and StreamHub Comet, so I am surprised Webtide have gone this way. Although it may work for small loads, it just won't scale for any large application. You wouldn't want pages of your website to be slow to load or unresponsive just because there is a sudden burst of chatter across the Comet links.
