Blaze Data Services or LiveCycle Data Services?

Posted by Ryan Knight on Feb 16, 2009

Summary

There have been a number of articles about the different versions of data services, yet none seems to clarify how to choose between them. None has gone into much detail on how the endpoints and channels affect application performance either.

Although Adobe refers to four different versions of Data Services, there are two primary versions. One is the open source Blaze Data Services and the other is the proprietary version called LiveCycle Data Services (LCDS). Both products offer the most important feature, which is connectivity between Flex and Java via a Message Broker Servlet. Both can communicate with the server using remote procedure calls and messaging through the binary protocol, Action Message Format (AMF). On top of these two core products, Adobe offers supported versions and more extensive product suites.

Our experience at Gorilla Logic is that the primary differences between these two products are support options and data management. The differences in performance and scalability are more debatable. LCDS offers additional endpoints and channels for client communication, and Adobe states that their primary advantage is improved scalability. But the fundamental means of communication is always AMF over HTTP, which has the same performance irrespective of the configuration of the server or client.

The other additional feature that LCDS offers is data management. This provides data synchronization between Flex clients and Java server applications with real-time conflict resolution. It also provides data assemblers and adapters for connecting data services to a persistent store via JDBC, Hibernate, or another custom adapter.

So what are the four different versions?

  1. Blaze Data Services - Free and Open Source edition
  2. LiveCycle Data Services Community Edition - A supported version of Blaze DS
  3. LiveCycle Data Services Single-CPU License - A free version of the commercial edition with the additional features but limited to a single CPU
  4. LiveCycle Data Services - The paid version of the commercial edition with support

There is also a product suite that Adobe calls LiveCycle Data Services Enterprise Suite. It adds further products to the core data services to provide content services and document output, using tools such as PDF generation, forms, and digital signatures.

Theoretically then the decision can be made based on three primary points.

  1. Is support needed? This depends on the application; mission-critical applications, for example, usually need it.
  2. Do you need data management services? This depends on whether the application requires data synchronization and management.
  3. Do you need the additional endpoints and channels of LCDS? According to Adobe, if more than several hundred concurrent connections need to be made, then the additional LCDS channels or endpoints are necessary. However, this point is debatable. The number of concurrent connections a server can handle depends on a number of factors, such as threads and I/O throughput. A larger number of concurrent connections can also be handled by load balancing across multiple servers, as you would scale any Java application server.

A comparison chart has been made at the end of the article that gives an overview of the different versions.

Overview of the Different End-Points - Servlet and NIO

In Data Services an end-point is how the server listens for connections from the client. The standard endpoint for both Blaze DS and LCDS is a servlet based end-point that runs inside the application server. It uses the Message Broker Servlet, configured in web.xml, to participate in the standard servlet filter chain. This allows Blaze DS or LCDS to be deployed into an existing Java Application. Similar to any Java Servlet end-point, each client connection requires a separate thread on the server.

The NIO endpoint works entirely differently. NIO stands for Java New Input/Output. The NIO endpoint creates a standalone NIO-based socket server. The advantage of NIO is that a single thread can manage multiple I/O streams, so it requires fewer threads and can handle a larger number of clients.
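To make the "one thread, many I/O streams" idea concrete, here is a minimal sketch using the standard `java.nio` selector API. It is not how LCDS implements its NIO endpoint; the class and method names (`NioSketch`, `serveOnOneThread`) are made up for illustration, and the clients are loopback stand-ins that each send a single byte.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration: one thread multiplexes every connection through a Selector.
class NioSketch {
    // Accepts `clients` loopback connections and reads one byte from each,
    // all on the calling thread. Returns the total number of bytes received.
    static int serveOnOneThread(int clients) {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            // Stand-in clients: each connects and sends a single byte.
            List<SocketChannel> clientSide = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                SocketChannel c = SocketChannel.open(server.getLocalAddress());
                c.write(ByteBuffer.wrap(new byte[] {(byte) i}));
                clientSide.add(c);
            }

            int received = 0;
            ByteBuffer buf = ByteBuffer.allocate(16);
            while (received < clients) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel conn = server.accept();
                        if (conn != null) {
                            conn.configureBlocking(false);
                            conn.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        buf.clear();
                        int n = ((SocketChannel) key.channel()).read(buf);
                        if (n > 0) {
                            received += n;
                        } else if (n < 0) {
                            key.channel().close(); // peer closed the connection
                        }
                    }
                }
            }

            // Clean up both ends of every connection.
            for (SocketChannel c : clientSide) {
                c.close();
            }
            for (SelectionKey k : new ArrayList<>(selector.keys())) {
                k.channel().close();
            }
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `NioSketch.serveOnOneThread(5)` handles five connections without creating a single extra thread; the loop only touches a connection when the selector reports it as ready.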

There are several challenges to using NIO. These are:

  • The client cannot access an NIO endpoint through a client-side proxy.
  • The connection does not go through the standard servlet chain, which can break any part of the system that relies on servlet processing. For example, in one project we used a servlet to handle file uploading to the server.
  • With NIO you typically have to use custom authentication. This is because the socket server is running in a separate process from the servlet container.

Although the NIO server can be configured to listen on port 80, it typically resides on a separate port. This can make network configuration challenging, because the network has to be configured to allow incoming connections on a new port. This could be problematic depending on the internal network of the server and the different client networks. A potential workaround is to use a load balancer with sticky sessions.

However, the advantages of Java NIO are debatable. Java NIO was developed to allow a single thread to handle multiple connections. It does this by having a single thread iterate over a pool of connections and check whether data needs to be read or written. Since NIO was introduced in 2002 with Java 1.4, threading in both the JVM and Linux has improved dramatically, and with it the ability to handle a large number of threads.

For example, the Linux kernel introduced a new threading library in 2.6: the Native POSIX Thread Library (NPTL). With NPTL, tests have shown the ability to start 100,000 threads on an IA-32 machine in two seconds [1]. Without NPTL, it takes the Linux kernel 15 minutes to create the same number of threads.

Most interesting is a blog post by Paul Tyma and others arguing that Java NIO can actually be a disadvantage [2, 3, 4]. Through a series of benchmarks, Paul makes the following arguments against NIO:

  • Java NIO loses throughput by 20% to 30%
  • Thread context switching is NOT expensive
  • Synchronization is NOT expensive

Based on these tests, Paul shows that one thread per connection can actually scale. Under JDK 1.6 the JVM can handle between 15K and 30K threads. This would mean the limit of the servlet endpoint is not several hundred connections. Instead it is much higher, possibly beyond 15K connections. The actual limit of course depends on the hardware configuration, such as memory and CPU.
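For contrast with the selector approach, the thread-per-connection model can be sketched with plain blocking sockets. This is an illustrative toy, not the servlet container's or Blaze DS's actual code; `ThreadPerConnectionSketch` and `echoAll` are hypothetical names, and the sketch only works for fewer than 256 clients because each one sends a single byte.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: every accepted connection gets its own blocking thread.
class ThreadPerConnectionSketch {
    // Starts a loopback echo server, opens `clients` connections at once,
    // and returns how many of them got their byte echoed back intact.
    static int echoAll(int clients) {
        try (ServerSocket server =
                 new ServerSocket(0, clients, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                for (int i = 0; i < clients; i++) {
                    try {
                        Socket conn = server.accept();
                        // One dedicated thread per connection; it blocks in read().
                        new Thread(() -> echoOnce(conn)).start();
                    } catch (IOException e) {
                        return; // server socket closed or failed
                    }
                }
            });
            acceptor.start();

            // Open every connection first so all handler threads are parked at once.
            List<Socket> sockets = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                sockets.add(new Socket(server.getInetAddress(), server.getLocalPort()));
            }
            for (int i = 0; i < clients; i++) {
                sockets.get(i).getOutputStream().write(i);
            }
            int echoed = 0;
            for (int i = 0; i < clients; i++) {
                if (sockets.get(i).getInputStream().read() == i) {
                    echoed++;
                }
            }
            for (Socket s : sockets) {
                s.close();
            }
            acceptor.join();
            return echoed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Reads one byte from the connection and writes it straight back.
    static void echoOnce(Socket conn) {
        try (Socket c = conn) {
            int b = c.getInputStream().read();
            if (b >= 0) {
                c.getOutputStream().write(b);
            }
        } catch (IOException ignored) {
            // A failed client simply does not get its echo.
        }
    }
}
```

The code is simpler than the selector loop, at the cost of one parked thread per open connection. Whether that cost matters is exactly the question the benchmarks above address.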

Overview of the Different Types of Channels

Over the basic network connection it is possible to use a number of different types of channels for communication between the client and server. For basic remote procedure calls a standard AMF channel is used.

The other type of communication is messaging. This can be used to create applications that push messages from the server and perform near real time communication. Example applications are chat servers, auction clients and collaborative services.

The primary way that data services do messaging is through polling. Because standard communication over HTTP does not keep the communication channel open, the client has to ask the server for new messages. With a polling channel, the client's request can wait on the server side until data becomes available. The wait time is adjustable from a few milliseconds to several minutes. This simulates the data being pushed from the server.

There are two basic types of polling channels: short polling and long polling. The primary difference is how long the server holds a client's request while waiting for data to become available.
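The difference between short and long polling comes down to a single wait parameter, which the following sketch tries to show. It is a simplified model under assumed names (`PollingSketch`, `publish`, `poll`), not the Blaze DS polling implementation: a queue stands in for the messages pending for one client, and `poll` stands in for the server handling one poll request.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the server side of a polling channel for one client.
class PollingSketch {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();

    // Server-side code queues a message for the client.
    void publish(String message) {
        pending.add(message);
    }

    // Handles one poll request.
    // waitMillis == 0 behaves like a short poll: answer immediately.
    // A larger value behaves like a long poll: hold the request until a
    // message arrives or the wait expires.
    // Returns null when there is nothing to deliver.
    String poll(long waitMillis) {
        try {
            return pending.poll(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

With a short poll the client learns about a new message only on its next request, so latency is bounded by the polling interval. With a long poll the held request returns as soon as `publish` is called, which is why it feels like server push.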

A more advanced channel type is streaming AMF. This opens an HTTP connection to the server and allows the server to stream an unbounded series of messages over that connection. It has the advantage of no polling overhead from the client, and it also uses standard networking configuration. This is the closest option to near-real-time streaming. The challenge with streaming AMF is that it uses HTTP 1.1 persistent connections, which are implemented differently by different browsers.

The final channel type is the RTMP (Real Time Messaging Protocol) channel, which is currently only available in LiveCycle DS. Adobe has recently announced that it will publish the specification for RTMP, so our guess is that it will eventually work its way into other products.

RTMP was designed for streaming large amounts of multimedia and data over a duplex channel. One of its primary benefits is that the connection with the client stays open, so it can be used to push data from the server. This allows RTMP to be used for Comet-style communication and real-time data push.

There are three flavors of RTMP. One works over TCP and uses port 1935. The downside to this version of RTMP is that the connection has to be initiated from the client browser. Also it uses a non-standard port, so it is often blocked by client firewalls.

The other two flavors of RTMP encapsulate the RTMP message within HTTP requests. This allows the protocol to traverse firewalls and use standard ports. The two flavors are RTMPT which goes over standard HTTP and RTMPS which goes over secure HTTPS.

In Flex all calls to the server are performed asynchronously, so none of these channels affect the performance of the client. However they can have an impact on server performance, especially if a large number of client connections are open at the same time. For example streaming AMF could cause a large number of concurrent client connections to be open on the server and thus a large number of threads. As discussed earlier though, the impact of a large number of threads can be minimal.

All client connections can be configured with a default channel and alternative channels to use if that channel fails. Depending on the type of communication the server is doing, a different chain of channels can be specified. For example, an RTMP channel could be specified, but if that connection fails it could fall back to a long polling channel.
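The fallback behaviour amounts to walking an ordered list and using the first channel that connects. The sketch below models only that logic; the `Channel` interface, the `of` helper, and the channel ids are hypothetical and are not the Flex or Data Services API.

```java
import java.util.List;

// Hypothetical model of an ordered channel chain with fallback.
class ChannelFallbackSketch {
    // Assumed abstraction: connect() returns false when the channel cannot be used.
    interface Channel {
        String id();
        boolean connect();
    }

    // Builds a stand-in channel whose connect() result is fixed up front.
    static Channel of(final String id, final boolean reachable) {
        return new Channel() {
            public String id() { return id; }
            public boolean connect() { return reachable; }
        };
    }

    // Tries each channel in order and returns the id of the first one that
    // connects, or null if every channel in the chain fails.
    static String connectFirstAvailable(List<Channel> chain) {
        for (Channel channel : chain) {
            if (channel.connect()) {
                return channel.id();
            }
        }
        return null;
    }
}
```

For example, a chain of `of("my-rtmp", false)` followed by `of("my-long-polling-amf", true)` models a client behind a firewall that blocks RTMP: the first attempt fails and the client ends up on the long polling channel.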

Conclusion

It seems, then, that the real benefit of LiveCycle DS over Blaze DS is primarily the availability of support and data management. The benefits of the additional endpoints and channels are debatable. In the projects we have done at Gorilla Logic, we have not seen a need for NIO endpoints or RTMP. As with all technology, however, nothing is definitive. I would be interested in hearing others' experiences in the comments.

Feature Comparison Chart

[Table: feature comparison of the Data Services versions; not available in this copy]

About the Author

 

Ryan Knight is a Senior Software Architect at Gorilla Logic, where he does Flex and Java consulting. He is also the primary contributor to Anvil Flex, an open source project to help enterprises jump-start their Flex development. He has worked with Java for over 12 years in a variety of roles, from development to consulting.
 
Resources

 

1 http://www.linuxjournal.com/article/6530

2 http://paultyma.blogspot.com/2008/03/writing-java-multithreaded-servers.html

3 http://www.theserverside.com/discussions/thread.tss?thread_id=26700

4 http://cometdaily.com/2008/11/21/are-raining-comets-and-threads/

Links from Adobe

LiveCycle Homepage

LiveCycle Data Services ES FAQ

Comparison of the different LiveCycle Data Services solutions

Links from other sources

LiveCycle ES vs LiveCycle DS vs BlazeDS - clearing up the confusion

Why are you NOT using LiveCycle DS?

  1. ALDS

    by Christopher Brind

    The abbreviation commonly used is actually LCDS, which is used in the main article also.

  2. i like the rails

    by withyou gakaki

    rubyamf and phpamf is well
    not java platform

  3. Free course on BlazeDS and Flex/AIR

    by Duane Nickull

    If anyone is interested in learning how to build 5-6 projects in Flex or AIR to talk to a custom build of BlazeDS with an Apache Axis SOAP stack (v 1.4), you can download it from www.web2open.org/courses.html. This course is self paced and you are free to take the materials, code samples and teach them in your own 'hood. Ciao!

  4. Theoretically it's right, but...

    by Yakov Fain

    Theoretically your conclusions regarding the "useless NIO" may be right if you live in the world of servers with unlimited power CPU/Memory and when each Java thread won't take a specific amount of resources in a JVM with a limited heap size.

    Our company, Farata Systems, did some real performance tests hitting BlazeDS hard emulating thousands user requests with a PROFESSIONAL stress test software. This test put the Tomcat/BlazeDS down reaching 800 users. After that, we've created our own solution that works with Jetty server and can be stable with at least 5K users hitting Jetty/BlazeDS. Here's a video recording of this stress test:myflex.org//demos/JettyBlazeDS/JettyBlazeDSload...

    Jetty's suspend/resume thread architecture was supposed to be used as a base for Servlet 3.0 spec, but because of some weird reason it didn't happen, so our solution works with Jetty only at this time.
