InfoQ

News

Comet: Sub-Second Latency with 10K+ Concurrent Users

Posted by Alexander Olaru on Jan 29, 2008 02:05 PM

Community: Java
Topics: Performance & Scalability, Rich Internet Apps
Tags: Dojo, Comet, Jetty, Lightstreamer

Also known as Reverse AJAX, Comet aims to deliver real-time updates to the client whenever state changes on the server, by leveraging the persistent connection feature of HTTP/1.1. As described in the past on InfoQ.com, Comet is not alone: other "push technologies" try to achieve the same goals.
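
The idea of holding a request open until the server has something to report can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions: the `Channel` class, its `long_poll` and `publish` methods, and the event string are hypothetical names invented for illustration, and asyncio futures stand in for parked HTTP requests. It is not how Cometd or any particular server implements the technique.

```python
import asyncio


class Channel:
    """Hypothetical stand-in for a server that parks client requests."""

    def __init__(self):
        self._waiters = []  # one future per request currently held open

    async def long_poll(self, timeout):
        """Simulate a client request that stays open until an event arrives."""
        waiter = asyncio.get_running_loop().create_future()
        self._waiters.append(waiter)
        try:
            return await asyncio.wait_for(waiter, timeout)
        except asyncio.TimeoutError:
            return None  # empty response; a real client would simply reconnect
        finally:
            if waiter in self._waiters:
                self._waiters.remove(waiter)

    def publish(self, event):
        """Server-side state change: answer every parked request at once."""
        waiters, self._waiters = self._waiters, []
        for waiter in waiters:
            if not waiter.done():
                waiter.set_result(event)
        return len(waiters)


async def demo():
    channel = Channel()
    clients = [asyncio.create_task(channel.long_poll(timeout=1.0)) for _ in range(3)]
    await asyncio.sleep(0)  # yield once so the three requests get parked
    delivered = channel.publish("price=42")
    results = await asyncio.gather(*clients)
    return delivered, results
```

Running `asyncio.run(demo())` parks three simulated requests and completes all of them with a single `publish` call, which is the essence of the server-initiated update.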

Greg Wilkins and his team at Webtide, a company formed by the lead developers of the open-source web server Jetty, have run a number of performance tests aimed at gauging Comet's scalability and wrote about their findings. More specifically, the tests involved running the Dojo Cometd implementation of the Bayeux protocol on Jetty. The server running Cometd and the client machines (between 1 and 3), which together generated a load equivalent to up to 20,000 users, were each Large Instances of Amazon EC2 virtual servers. The test results are graphically summarized below:


Following are a few highlights from these tests:

  • Sub-second latency was achievable even for 20,000 users. A tradeoff exists between latency and throughput: with 5,000 users, latency rises from 100ms at 2,000 messages/sec to over 250ms at 3,000 messages/sec.
  • The tested application was a simple chat room with up to 200 users/room. "The load was a 50 byte payload sent in a burst to 10 randomly selected chat rooms at an interval fixed for each test. The interval was selected so that a steady state was obtained with the server CPU at approximately 10% and 50% idle."
  • Greg acknowledged that "1 machine just can’t generate/handle the same load as 20K users each with their own computer and network infrastructure". To partially compensate for this limitation, a subset of the tests (see green circles above) simulated users running on 3 different machines.
  • For the tests with 3 client machines, the latency measurements were taken from the machine that simulated 1,000 users. Although it was not specifically measured, Greg mentioned that the latency for the other 2 clients, which handled the rest of the 20K users, would have been bounded above by the latency observed when running the test with one client machine.
  • A few modifications to the Cometd demo bundled with Jetty 6.1.7 were needed. Some alleviated lock starvation in the server's thread pool, while others changed the setup steps.
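
The load description in the highlights above lends itself to a rough back-of-envelope calculation. The sketch below assumes that every room is full (200 users), that each burst carries one message per selected room, and that every room member receives it; the `interval_s` parameter is hypothetical, since the article does not state the intervals used. Whether the reported messages/sec figures count published or delivered messages is not stated either, so the numbers here are illustrative only.

```python
USERS_PER_ROOM = 200   # upper bound quoted in the article
ROOMS_PER_BURST = 10   # rooms selected at random for each burst
PAYLOAD_BYTES = 50     # payload size quoted in the article


def burst_fanout(total_users, interval_s):
    """Estimate the fan-out of one burst, assuming full rooms (an assumption)."""
    deliveries = ROOMS_PER_BURST * USERS_PER_ROOM
    return {
        "rooms": total_users // USERS_PER_ROOM,
        "deliveries_per_burst": deliveries,
        "payload_bytes_per_burst": deliveries * PAYLOAD_BYTES,
        "deliveries_per_sec": deliveries / interval_s,  # interval_s is hypothetical
    }
```

Under these assumptions, 20,000 users fill 100 rooms, and a single burst fans out to 2,000 deliveries carrying 100,000 bytes of payload before any protocol overhead.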

As mentioned in a comment and in one of Greg's prior posts, Jetty is able to flush messages to the clients asynchronously, and so requires fewer resources to service the same number of users. The thread pool code changes applied for these tests are available for download, and Greg told InfoQ that they will be part of the next Jetty release. He also added that Webtide is in the process of running similar tests via load balancers, with more results to be made available soon.

Another interesting approach to Comet scalability is the one taken by Lightstreamer. Its implementation is based on a stand-alone server that does not rely on an underlying application or web server. Some web/application servers, extended to act like streaming engines, are based on a "one-thread-per-connection model". By contrast, Lightstreamer decouples the number of connections that the server can sustain from the number of threads that are employed, thus allowing it to scale to a very large number of clients.
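
Decoupling connections from threads is commonly done with I/O multiplexing, where one thread watches many sockets and acts only on those that are ready. The sketch below shows that general technique using Python's standard `selectors` module; it is not Lightstreamer's or Jetty's implementation. The function name `push_to_all` is hypothetical, and `socket.socketpair()` stands in for accepted client connections so the example stays self-contained.

```python
import selectors
import socket


def push_to_all(n_clients, message):
    """Deliver `message` to n_clients connections using one thread and one selector."""
    # Each socketpair simulates one accepted connection (server side, client side).
    pairs = [socket.socketpair() for _ in range(n_clients)]
    selector = selectors.DefaultSelector()
    for server_side, _ in pairs:
        server_side.setblocking(False)
        selector.register(server_side, selectors.EVENT_WRITE)

    pending = n_clients
    while pending:
        ready = selector.select(timeout=1.0)
        if not ready:
            break  # nothing became writable; give up rather than spin
        for key, _events in ready:
            key.fileobj.send(message)         # small payload, fits the socket buffer
            selector.unregister(key.fileobj)  # this connection has been served
            pending -= 1
    selector.close()

    # Read back what each simulated client received, then clean up.
    received = [client_side.recv(len(message)) for _, client_side in pairs]
    for server_side, client_side in pairs:
        server_side.close()
        client_side.close()
    return received
```

Calling `push_to_all(50, b"tick")` serves fifty connections from the calling thread alone; the number of connections grows without any matching growth in threads.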

In a conversation with InfoQ, Alessandro Alinone, Lightstreamer's CTO, shared that they have customers in the financial industry that achieve in production "an average of 10,000 concurrent users with an average update frequency of 3-5 updates per second per user." He added "that Lightstreamer is also employed as the core engine within TIBCO Ajax Message Service, through an OEM agreement. Therefore, interesting production scenarios are progressively arising on the TIBCO front too."

Along with the Server, Lightstreamer's back-end architecture includes:

  • A Data Adapter: a plugin module that interfaces Lightstreamer with the data source to be integrated. It can use any technology to integrate with the source, but an asynchronous data feed (e.g. JMS, TIB/RV, MQ) avoids breaking the asynchronous chain that goes to the client.
  • A Metadata Adapter: a plugin module that provides the Lightstreamer Server with the metadata of the push scenarios.
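
To make the point about the asynchronous chain concrete, here is a minimal sketch of a callback-driven adapter. Everything in it is hypothetical: the class `SketchDataAdapter`, its methods, and the item and field names are invented for illustration and are not the Lightstreamer adapter API. The key property is that the feed calls into the adapter when data arrives, so nothing along the path has to poll.

```python
from typing import Callable, Dict, List

# A listener receives the item name and a dict of updated fields.
Listener = Callable[[str, dict], None]


class SketchDataAdapter:
    """Hypothetical adapter bridging an asynchronous feed to a push server."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = {}

    def subscribe(self, item: str, listener: Listener) -> None:
        """The push server registers interest in an item."""
        self._listeners.setdefault(item, []).append(listener)

    def unsubscribe(self, item: str, listener: Listener) -> None:
        """The push server drops its interest in an item."""
        listeners = self._listeners.get(item, [])
        if listener in listeners:
            listeners.remove(listener)

    def on_feed_message(self, item: str, fields: dict) -> int:
        """Invoked by the feed (e.g. a message consumer) whenever data arrives."""
        listeners = list(self._listeners.get(item, []))
        for listener in listeners:
            listener(item, fields)  # forwarded immediately, no polling step
        return len(listeners)
```

A feed consumer would call `on_feed_message` from its own message handler, and each subscribed listener would see the update as soon as it is published.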

On the client-side, the browser gets the static web pages as usual from the web server, but it receives the real-time updates pushed from the Lightstreamer Server. The consumer of these updates can be a set of Lightstreamer JavaScript libraries which are compatible with most browsers and coexist/integrate with most third-party AJAX frameworks and toolkits. Real-time updates can also be pushed via Lightstreamer to Flash/Flex applications or to desktop applications developed in Java or .NET.

Complete Computation Conglomerate (CCC) by David Pratten Posted Jan 31, 2008 2:00 AM
    Thanks for this informative post. The potential of HTTP to form the backbone of a heterogeneous and integrated computing resource is powerfully demonstrated by this development. The current situation is that while the WWW allows a programmer to be agnostic about the location, technology and network path to an information resource, the programmer can't be agnostic about where the computations involved will be done. The programmer’s choice of technology (framework, language, etc.) carries with it the implicit choice about the locus of computation (server or client). I would be interested in your feedback on my rough sketch of how HTTP can be extended so that programmers can work with a unified programming model and delay decisions about where computation is done until run-time, based on issues like the client's available computing power, intellectual property and security. http://www.davidpratten.com/2008/01/07/request-based-distributed-computing-a-rough-sketch/ David
