Rob Windsor on WCF with REST, JSON and RSS
WCF is not just for SOAP based services and can be used with popular protocols like RSS, REST and JSON. Join Rob Windsor as he introduces WCF 3.5 and its new native support for non-SOAP services.
Tracking change and innovation in the enterprise software development community

Posted by Stefan Tilkov on Apr 05, 2007 08:00 AM
InfoQ today publishes a one-chapter excerpt from Frank Cohen's book "FastSOA". On this occasion, InfoQ had a chance to talk to Frank Cohen, creator of the FastSOA methodology, about the issues when trying to process XML messages, scalability, using XQuery in the middle tier, and document-object-relational-mapping.
InfoQ: Can you briefly explain the ideas behind "FastSOA"?
Real World REST & Document-Oriented Distributed Databases Tracks @ QCon SF Nov 19-21
Introducing application infrastructure virtualization and WebSphere Virtual Enterprise
The Agile Business Analyst: Skills and Techniques needed for Agile
Scaling a Massively Multi-player Server Casestudy: Terracotta on SmartFoxServer
Rainmaking - IBM's software virtualization strategy (Jerry Cuomo CTO blog)
Frank Cohen: For the past 5-6 years I have been investigating the impact an average Java developer's choice of technology, protocols, and patterns for building services has on the scalability and performance of the resulting application. For example, Java developers today have a choice of 21 different XML parsers! Each one has its own scalability, performance, and developer productivity profile. So a developer's choice on technology makes a big impact at runtime.
I looked at distributed systems that used message oriented middleware to make remote procedure calls. Then I looked at SOAP-based Web Services. And most recently at REST and AJAX. These experiences led me to look at SOA scalability and performance built using application server, enterprise service bus (ESB,) business process execution (BPEL,) and business integration (BI) tools. Across all of these technologies I found a consistent theme: At the intersection of XML and SOA are significant scalability and performance problems.
FastSOA is a test methodology and set of architectural patterns to find and solve scalability and performance problems. The patterns teach Java developers that there are native XML technologies, such as XQuery and native XML persistence engines, that should be considered in addition to Java-only solutions.
InfoQ: What's "Fast" about it? ;-)
FC: First off, let me describe the extent of the problem. Java developers building Web enabled software today have a lot of choices. We've all heard about Service Oriented Architecture (SOA), Web Services, REST, and AJAX techniques. While there are a LOT of different and competing definitions for these, most Java developers I speak to expect that they will be working with objects that message to other objects - locally or on some remote server - using encoded data, and often the encoded data is in XML format.
The nature of these interconnected services we're building means our software needs to handle messages that can be small to large and simple to complex. Consider the performance penalty of using a SOAP interface and streams XML parser (StAX) to handle a simple message schema where the message size grows. A modern and expensive multi-processor server that easily serves 40 to 80 Web pages per second serves as little as 1.5 to 2 XML requests per second.

Without some sort of remediation Java software often slows to a crawl when handling XML data because of a mismatch between the XML schema and the XML parser. For instance, we checked one SOAP stack that instantiated 14,385 Java objects to handle a request message of 7000 bytes that contains 200 XML elements.
Of course, titling my work SlowSOA didn't sound as good. FastSOA offers a way to solve many of the scalability and performance problems. FastSOA uses native XML technology to provide service acceleration, transformation, and federation services in the mid-tier. For instance, an XQuery engine provides a SOAP interface for a service to handle decoding the request, transform the request data into something more useful, and routes the request to a Java object or another service.
InfoQ: One alternative to XML databinding in Java is the use of XML technologies, such as XPath or XQuery. Why muddy the water with XQuery? Why not just use Java technology?
FC:We're all after the same basic goals:
In SOA, Web Service, and XML domains I find the usual Java choices don't get me to all three goals.
Chris Richardson explains the Domain Model Pattern in his book POJOs in Action. The Domain Model is a popular pattern to build Web applications and is being used by many developers to build SOA composite applications and data services.

The Domain Model divides into three portions: A presentation tier, an application tier, and a data tier. The presentation tier uses a Web browser with AJAX and RSS capabilities to create a rich user interface. The browser makes a combination of HTML and XML requests to the application tier. Also at the presentation tier is a SOAP-based Web Service interface to allow a customer system to access functions directly, such as a parts ordering function for a manufacturer's service.
At the application tier, an Enterprise Java Bean (EJB) or plain-old Java object (Pojo) implements the business logic to respond to the request. The EJB uses a model, view, controller (MVC) framework - for instance, Spring MVC, Struts or Tapestry - to respond to the request by generating a response Web page. The MVC framework uses an object/relational (O/R) mapping framework - for instance Hibernate or Spring - to store and retrieve data in a relational database.
I see problem areas that cause scalability and performance problems when using the Domain Model in XML environments:
In no way am I advocating a move away from your existing Java tools and systems. There is a lot we can do to resolve these problems without throwing anything out. For instance, we could introduce a mid-tier service cache using XQuery and a native XML database to mitigate and accelerate many of the XML domain specific requests.

The advantage to using the FastSOA architecture as a mid-tier service cache is in its ability to store any general type of data, and its strength in quickly matching services with sets of complex parameters to efficiently determine when a service request can be serviced from the cache. The FastSOA mid-tier service cache architectures accomplishes this by maintaining two databases:
FastSOA uses the XQuery data model to implement policies. The XQuery data model supports any general type of document and any general dynamic parameter used to fetch and construct the document. Used to implement policies the XQuery engine allows FastSOA to efficiently assess common criteria of the data in the service cache and the flexibility of XQuery allows for user-driven fuzzy pattern matches to efficiently represent the cache.
FastSOA uses native XML database technology for the service and policy databases for performance and scalability reasons. Relational database technology delivers satisfactory performance to persist policy and service data in a mid-tier cache provided the XML message schemas being stored are consistent and the message sizes are small.
InfoQ: What kinds of performance advantages does this deliver?
FC: I implemented a scalability test to contrast native XML technology and Java technology to implement a service that receives SOAP requests.

The test varies the size of the request message among three levels: 68 K, 202 K, 403 K bytes. The test measures the roundtrip time to respond to the request at the consumer. The test results are from a server with dual CPU Intel Xeon 3.0 Ghz processors running on a gigabit switched Ethernet network. I implemented the code in two ways:
The results show a 2 to 2.5 times performance improvement when using the FastSOA technique to expose service interfaces. The FastSOA method is faster because it avoids many of the mappings and transformations that are performed in the Java binding approach to work with XML data. The greater the complexity and size of the XML data the greater will be the performance improvement.
InfoQ: Won't these problems get easier with newer Java tools?
FC: I remember hearing Tim Bray, co-inventor of XML, extolling a large group of software developers in 2005 to go out and write whatever XML formats they needed for their applications. Look at all of the different REST and AJAX related schemas that exist today. They are all different and many of them are moving targets over time. Consequently, when working with Java and XML the average application or service needs to contend with three facts of life:
What's needed is an easy way to consume any size and complexity of XML data and to easily maintain it over time as the XML changes. This kind of changing landscape is what XQuery was created to address.
InfoQ: Is FastSOA only about improving service interface performance?
FC: FastSOA addresses these problems:

FastSOA is an architecture that provides a mid-tier service binding, XQuery processor, and native XML database. The binding is a native and streams-based XML data processor. The XQuery processor is the actual mid-tier that parses incoming documents, determines the transaction, communicates with the ?local? service to obtain the stored data, serializes the data to XML and stores the data into a cache while recording a time-to-live duration. While this is an XML oriented design XQuery and native XML databases handle non-XML data, including images, binary files, and attachments. An equally important benefit to the XQuery processor is the ability to define policies that operate on the data at runtime in the mid-tier.

FastSOA provides mid-tier transformation between a consumer that requires one schema and a service that only provides responses using a different and incompatible schema. The XQuery in the FastSOA tier transforms the requests and responses between incompatible schema types.

Lastly, when a service commonly needs to aggregate the responses from multiple services into one response, FastSOA provides service federation. For instance, many content publishers such as the New York Times provide new articles using the Rich Site Syndication (RSS) protocol. FastSOA may federate news analysis articles published on a Web site with late breaking news stories from several RSS feeds. This can be done in your application but is better done in FastSOA because the content (news stores and RSS feeds) usually include time-to-live values that are ideal for FastSOA's mid-tier caching.
InfoQ: Can you elaborate on the problems you see in combining XML with objects and relational databases?
FC: While I recommend using a native XML database for XML persistence it is possible to be successful using a relational database. Careful attention to the quality and nature of your application's XML is needed. For instance, XML is already widely used to express documents, document formats, interoperability standards, and service orchestrations. There are even arguments put forward in the software development community to represent service governance in XML form and operated upon with XQuery methods. In a world full of XML, we software developers have to ask if it makes sense to use relational persistence engines for XML data. Consider these common questions:
Your answers to these questions forms a criteria by which it will make sense to use a relational database, or perhaps not. The alternative to relational engines are native XML persistence engines such as eXist, Mark Logic, IBM DB2 V9, TigerLogic, and others.
InfoQ: What are the core ideas behind the PushToTest methodology, and what is its relation to SOA?
FC: It frequently surprises me how few enterprises, institutions, and organizations have a method to test services for scalability and performance. One fortune 50 company asked a summer intern they wound up hiring to run a few performance tests when he had time between other assignments to check and identify scalability problems in their SOA application. That was their entire approach to scalability and performance testing.
The business value of running scalability and performance tests comes once a business formalizes a test method that includes the following:
All of this requires much more than an ad-hoc approach to reach useful and actionable knowledge. So I built and published the PushToTest SOA test methodology to help software architects, developers, and testers. The method is described on the PushToTest.com Web site and I maintain an open-source test automation tool called PushToTest TestMaker to automate and operate SOA tests.
PushToTest provides Global Services to its customers to use our method and tools to deliver SOA scalability knowledge. Often we are successful convincing an enterprise or vendor that contracts with PushToTest for primary research to let us publish the research under an open source license. For example, the SOA Performance kit comes with the encoding style, XML parser, and use cases. The kit is available for free download at: http://www.pushtotest.com/Downloads/kits/soakit.html and older kits are at http://www.pushtotest.com/Downloads/kits.
InfoQ: Thanks a lot for your time.
FastSOA appears to be more of a software solution to many of the problem domains solved by DataPower.
Hi Jason: The patterns are platform agnostic, but they do require a persistence engine. As far as I know IBM's DataPower appliances give you the transformation and security decription capability. I did not see a persistence engine to let you do things like intelligent caching, transformation, and federation in the appliance. Did I get that wrong? -Frank
Are you using (in your metrics) the XML database? I would expect that for a fixed data set (where the message itself had a high correlation to the physical storage) this would be more efficient. For a message that has customer, order, payment, and product to represent and "purchase order service" I would expect each entity to be persisted in a relational way so that another service might be “customer details” and another to be “product details”. I would expect the performance to go down as you had more needs to aggregate data from the XML DB. I am guessing that as the more aggregation was needed, the less a hierarchical db would be more efficient than an object relational solution. XML is a hierarchical, so I presume that Native XML DB would also be persist hierarchical, but most apps have a need to store things in a relational database because of the relational nature of the information. While the transformations are a cost, shouldn’t this cost be compared on an aggregate level? I have not worked with any XML databases so I was just wondering about the efficiencies of using them.
Hi Jeff: Native XML databases are typically hierarchical in the way they store data and multi-dimensional in the each stored XML document have a different hierarchy. This is different from object databases where you truly didn't know exactly what you would be storing and querying ahead of time. Native XML databases now have the XQuery and XPath query syntax to work with the stored documents. Most of the XML DBs also have an indexing engine that uses one or more schemas to determine the correct approach to indexing the contents of a collection (the XML DB version of a row-set.) In my performance and scalability testing I find that native XML databases usually achieve 5x to 30x performance increase over using similar XML techniques on a relational DB. -Frank
Thanks Frank, that's a pretty profound improvement. Looking forward to to reading FAST SOA.
Hi Frank: It seems that NetKernel - netkernel.org - already addresses the problems FastSOA addresses. NetKernel also provides an uniform programming model using the resource oriented paradigm. How would you compare your approach to NetKernel?
WCF is not just for SOAP based services and can be used with popular protocols like RSS, REST and JSON. Join Rob Windsor as he introduces WCF 3.5 and its new native support for non-SOAP services.
Christophe Coenraets discusses Flex 3, Flex Builder, AIR, BlazeDS, Adobe and open source, integrating Flex with existing applications, and integrating RIAs with search engines and browsers.
Danijel Arsenovski attempts to dispel some of the myths around refactoring and how it applies to .NET developers.
In this presentation, recorded at QCon San Francisco, CORBA guru Steve Vinoski explains REST from the view of someone who comes to SOA from a traditional, RPC-oriented background.
Feature teams are key to scaling agility for large teams. In an excerpt from "Scaling Lean and Agile Development," Larman & Vodde show how feature teams resolve traditional problems & raise new issues
Billy Newport talks about virtualization, eXtreme Transaction Processing (XTP) and WebSphere Virtual Enterprise. He discusses hardware, hypervisor, JVM, application and data virtualization.
While virtualization provides many benefits, security can not be a forgotten concept in its application.
This session is specifically aimed at traditionally trained project managers who are new to Agile, and who would like to be able to relate the PMI's best practices to their Agile equivalents.
6 comments
Reply