
"Code First" Web Services Reconsidered


Are you getting started on developing SOAP web services? If you are, you have two development styles to choose between. The first is called “start-from-WSDL”, or “contract first”, and involves building a WSDL service description and associated XML schema for data exchange directly. The second is called “start-from-code”, or “code first”, and involves plugging sample service code into your framework of choice and generating the WSDL+schema from that code.

With either development style, the end goal is the same – you want a stable WSDL+schema definition of your service. This goal is especially important when you’re working in a SOA environment. SOA demands loose coupling of services, where the interface is fixed and separate from the implementation. XML web services make a great basis for implementing SOA largely because WSDL and schema allow you to specify the XML message exchange used by a web service in a platform-neutral manner. If the WSDL and schema aren’t stable the service can only be used by clients which are directly or indirectly under the control of the service provider – and that’s not SOA.

Start-from-code issues

The idea of developing web services starting from code is frowned upon by many authorities in the web services and SOA fields. They feel that starting from code ties the XML message structures to a particular implementation, which defeats the whole purpose of using WSDL and schema. This was certainly true of the original form of start-from-code, the SOAP encoding scheme widely used for rpc/encoded support. With SOAP encoding, the XML schema was generated directly from the service provider application data structures and the client code worked with a generated duplicate of those data structures. This automatic conversion of data models to and from XML was the feature that made rpc/encoded popular in the early days of SOAP – but was also one of the big reasons the style has since been deprecated. It meant that every time your service data structures changed the schema would also change, and clients would need to regenerate their code using the new schema.


Figure 1. SOAP encoding approach to start-from-code

Besides creating tight coupling, SOAP encoding also had drawbacks in XML data representation and schema definitions. SOAP encoding is an XML serialization algorithm for object graphs, defined in programming language-independent terms. Since it’s a serialization algorithm, there’s no flexibility in terms of the resulting XML structure: you apply the algorithm to your data structures, and what you get out is the SOAP encoding for those structures. Unfortunately, the serialization rules used resulted in XML schemas which were practically unusable for anything other than rpc/encoded message exchange; they could not even be used for document validation. The serialization format also provided relatively poor performance, due to the overhead of reference structures, excessive runtime typing, and the use of child elements for all components (rather than attributes, where appropriate).

Most of these problems will apply to any technique which just serializes data structures to and from XML. But start-from-code doesn’t have to mean exposing a data model by direct serialization. Current web services stacks generally support flexible conversions between data models and XML, using some form of data binding. With data binding you maintain control over the XML representation of data. That control means your schema definition can be at least somewhat isolated from the actual data model, and you can choose XML representations that are suited to your data. With data binding approaches, many of the issues associated with SOAP encoding are no longer relevant.
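The coupling that comes with pure serialization can be sketched in a few lines of Java. This is not SOAP encoding itself, just a simplified stand-in that writes every field as a child element; the class names (AddressV1, AddressV2, NaiveSerializer) are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical data class, first version. */
class AddressV1 {
    private String city = "Seattle";
    private String street1 = "1 Main St";
}

/** The same class after a developer adds one field. */
class AddressV2 {
    private String city = "Seattle";
    private String street1 = "1 Main St";
    private String zip = "98101";
}

class NaiveSerializer {
    /** Writes every instance field as a child element named after the field. */
    static String toXml(String rootName, Object obj) {
        Field[] fields = obj.getClass().getDeclaredFields();
        // Sort by name so the output does not depend on reflection order.
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringBuilder xml = new StringBuilder("<" + rootName + ">");
        for (Field field : fields) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            field.setAccessible(true);
            try {
                // No XML escaping: values are assumed to be plain text.
                xml.append('<').append(field.getName()).append('>')
                   .append(field.get(obj))
                   .append("</").append(field.getName()).append('>');
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return xml.append("</").append(rootName).append('>').toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml("address", new AddressV1()));
        System.out.println(toXml("address", new AddressV2()));
    }
}
```

The second call emits an extra zip element that no client was told about. The XML format changed purely because the class changed, which is the kind of coupling data binding is meant to loosen.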

Start-from-WSDL issues

The biggest problem with start-from-WSDL is just the cumbersome nature of working with WSDL and schema definitions, as compared to working with code. Modern IDEs come equipped with “intelligent” editors and powerful refactoring toolkits that make code changes easy. Equivalent tools for WSDL and schema are simply not available. Even the most basic schema refactorings, such as converting a local definition to a global definition, are not supported by the dominant WSDL and schema tools.

Because the tools are weak, start-from-WSDL also requires a solid understanding of both WSDL and schema in order to obtain good results. If the available tools are used by developers without a grounding in the standards, the resulting WSDLs and schemas are often an ugly mess that does more to obscure the structure of the service and data than to reveal it. For the WSDL part obtaining an effective understanding is not too difficult, but the schema side is a different story. The W3C XML Schema recommendation (the full name for “schema”) is at least as complex as most programming languages, and requires just as much effort to master. Large organizations with dedicated architecture teams can afford to hire or train schema experts, but for smaller organizations the complexity of schema is a real barrier to start-from-WSDL service specification.

Even after an initial set of WSDL and schema definitions has been developed, these ease-of-use issues still apply. The development of a complex set of services is always going to be an iterative process, with repeated cycles of specification, prototyping, and testing. The inconvenience of working with poorly functioning tools will hinder this development cycle at each stage.

Making start-from-code work

The tools now available for start-from-code approaches to service specification are far superior to the SOAP encoding model that caused this type of development to fall into disrepute. They offer flexibility and extensibility, making it possible to work with even complex data structures. Most importantly, they add a layer of decoupling between the data structures defined in code and the corresponding XML representations.

Microsoft’s .NET framework and Sun’s JAX-WS 2.0/JAXB 2.0 are two popular examples. Both stacks use configuration information embedded in source code (in the .NET case as attributes, in the JAX-WS/JAXB case as annotations) to control the conversion to and from XML. The control provided by this embedded configuration is limited, and generally amounts to just detailing differences from the default serialization choices. That means the XML is not necessarily isolated from the details of the data structure – for instance, if you add a new field to an object it will automatically become part of the XML representation unless you explicitly list the fields to be included – but it’s still much better than a pure serialization approach.


Figure 2. .NET and JAX-WS 2.0/JAXB 2.0 approach to start-from-code
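The embedded-configuration style can be sketched without any particular stack. The example below does not use the real .NET attributes or JAXB annotations; AsAttribute and Omit are hypothetical annotations defined only for this illustration, and AnnotationBinder is a toy stand-in for the framework. It shows the pattern described above: every field is included by default, and annotations only record differences from that default.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical marker: write the field as an XML attribute. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface AsAttribute {}

/** Hypothetical marker: leave the field out of the XML. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Omit {}

class AnnotatedItem {
    @AsAttribute private String id = "A1";
    @AsAttribute private int quantity = 2;
    private String description = "widget";        // no annotation: default child element
    @Omit private String warehouseNote = "aisle 7";
}

class AnnotationBinder {
    static String toXml(String rootName, Object obj) {
        Field[] fields = obj.getClass().getDeclaredFields();
        // Sort by name so the output does not depend on reflection order.
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringBuilder attributes = new StringBuilder();
        StringBuilder children = new StringBuilder();
        for (Field field : fields) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())
                    || field.isAnnotationPresent(Omit.class)) {
                continue;
            }
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(obj);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (field.isAnnotationPresent(AsAttribute.class)) {
                attributes.append(' ').append(field.getName())
                          .append("=\"").append(value).append('"');
            } else {
                children.append('<').append(field.getName()).append('>')
                        .append(value)
                        .append("</").append(field.getName()).append('>');
            }
        }
        return "<" + rootName + attributes + ">" + children + "</" + rootName + ">";
    }
}
```

Calling AnnotationBinder.toXml("item", new AnnotatedItem()) leaves out warehouseNote only because it was explicitly marked. Any new unannotated field added to AnnotatedItem would show up as a child element automatically, which is the residual coupling noted above.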

Decoupling with JiBX

The author’s own JiBX (http://www.jibx.org) data binding framework for Java (which can be used for web services with Apache Axis2, XFire, and JiBX/WS stacks) goes even further in decoupling the XML representation from the application data model. JiBX uses binding definitions which are separate from the source code, and requires every item included in an XML representation to be named explicitly in the binding. Structural differences between the data model and the XML representation can be handled in the binding, so the XML representation can generally be preserved even as the data model changes over time. JiBX also allows multiple bindings to be applied to the same code, permitting many types of schema versioning changes to be supported with a single data model.
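The idea of an external binding that names every item explicitly, with more than one binding applied to the same class, can be sketched as follows. This is not the JiBX API or its binding definition format; ExternalBinding, Customer, and CustomerBindings are hypothetical names, and the binding here is plain Java rather than an XML document.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical data model class; it carries no XML configuration at all. */
class Customer {
    private final String id;
    private final String givenName;
    private final String familyName;

    Customer(String id, String givenName, String familyName) {
        this.id = id;
        this.givenName = givenName;
        this.familyName = familyName;
    }

    String getId() { return id; }
    String getGivenName() { return givenName; }
    String getFamilyName() { return familyName; }
}

/** A binding kept outside the data class: only named items reach the XML. */
class ExternalBinding<T> {
    private final String rootName;
    private final Map<String, Function<T, String>> elements = new LinkedHashMap<>();

    ExternalBinding(String rootName) {
        this.rootName = rootName;
    }

    /** Binds one XML element name to a value taken from the object. */
    ExternalBinding<T> element(String xmlName, Function<T, String> source) {
        elements.put(xmlName, source);
        return this;
    }

    String marshal(T obj) {
        StringBuilder xml = new StringBuilder("<" + rootName + ">");
        for (Map.Entry<String, Function<T, String>> e : elements.entrySet()) {
            // No XML escaping: values are assumed to be plain text.
            xml.append('<').append(e.getKey()).append('>')
               .append(e.getValue().apply(obj))
               .append("</").append(e.getKey()).append('>');
        }
        return xml.append("</").append(rootName).append('>').toString();
    }
}

class CustomerBindings {
    /** Older XML format: a single combined name element. */
    static final ExternalBinding<Customer> V1 = new ExternalBinding<Customer>("customer")
            .element("id", Customer::getId)
            .element("name", c -> c.getGivenName() + " " + c.getFamilyName());

    /** Newer XML format: separate name elements, same data class. */
    static final ExternalBinding<Customer> V2 = new ExternalBinding<Customer>("customer")
            .element("id", Customer::getId)
            .element("given-name", Customer::getGivenName)
            .element("family-name", Customer::getFamilyName);
}
```

Adding a field to Customer changes neither output, since nothing is serialized unless a binding names it, and the two bindings let one data model serve two versions of the XML format.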

The associated Jibx2Wsdl tool (http://www.sosnoski.com/jibx-wiki/space/axis2-jibx/jibx2wsdl) demonstrates the potential benefits of using a start-from-code approach. It generates WSDL and schema along with the corresponding JiBX binding definition, assuring that all the artifacts match. It also exports JavaDoc documentation from Java source code into the generated WSDL and schema, so that the service description is fully documented without the need for any manual editing. Jibx2Wsdl uses a reasonable default algorithm to create the binding used for data model classes, but the algorithm can be modified at any level by supplying customizations in the form of an XML document. These customizations have the same effect as .NET attributes and JAX-WS/JAXB annotations without needing to be embedded in the source code.


Figure 3. JiBX/Jibx2Wsdl approach to start-from-code

In effect, Jibx2Wsdl splits the start-from-code approach into two separate steps. In the generation step, you use Jibx2Wsdl to create the actual WSDL+schema definitions, and the corresponding JiBX binding definition. In the deployment step, you use JiBX to apply the generated binding definition to your Java classes.

During initial development these two steps can be combined to allow easy creation and refinement of prototype services. Once a stable service definition has been finalized, the generation step is no longer necessary. The JiBX binding definition can be treated as a stable artifact and used directly for deployment, as long as there are no changes to the data model which affect the bound data (such as missing fields, or changed class structures). If there are such changes, the JiBX binding compiler will report an error and the deployment step will fail. At that point, you can either restore the data model expected by the binding or modify the binding to match the modified data model (while preserving the XML format defined by the schema, though that part is not currently enforced by JiBX).
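The fail-fast behavior described above can be illustrated with a small reflection check. This is only a sketch of the idea, not how the JiBX binding compiler is implemented; BindingCheck and the Shipment classes are hypothetical, and the "binding" is reduced to a list of field names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical data class as it looked when the binding was frozen. */
class ShipmentV1 {
    private String orderId;
    private String carrier;
}

/** The same class after a refactoring renamed carrier to shipper. */
class ShipmentV2 {
    private String orderId;
    private String shipper;
}

class BindingCheck {
    /** Returns the bound field names that the class no longer declares. */
    static List<String> missingFields(Class<?> type, List<String> boundFields) {
        List<String> missing = new ArrayList<>();
        for (String name : boundFields) {
            try {
                type.getDeclaredField(name);
            } catch (NoSuchFieldException e) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Fails the deployment step when the binding and the class disagree. */
    static void verify(Class<?> type, List<String> boundFields) {
        List<String> missing = missingFields(type, boundFields);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "Binding refers to missing fields in " + type.getName() + ": " + missing);
        }
    }

    public static void main(String[] args) {
        List<String> binding = Arrays.asList("orderId", "carrier");
        verify(ShipmentV1.class, binding);   // passes silently
        verify(ShipmentV2.class, binding);   // throws IllegalStateException
    }
}
```

A check like this catches a data model that has drifted away from the binding. It says nothing about whether a modified binding still produces the XML the schema promises, which matches the caveat above.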

package com.sosnoski.infoq.ex1;

/**
 * Interface for placing orders and checking status.
 */
public interface StoreService
{
    /**
     * Submit a new order.
     *
     * @param order
     * @return id
     */
    public String placeOrder(Order order);

    /**
     * Retrieve order information.
     *
     * @param id order identifier
     * @return order information
     */
    public Order retrieveOrder(String id);

    /**
     * Cancel order. This can only be used for orders which have not been shipped.
     *
     * @param id order identifier
     * @return <code>true</code> if order cancelled, <code>false</code> if already shipped
     */
    public boolean cancelOrder(String id);
}

// Order.java (separate source file in the same package)
import java.util.Date;
import java.util.List;

/**
 * Order information.
 */
public class Order
{
    /** Unique identifier for this order. This is added to the order information by the service. */
    private String orderId;

    /** Customer identifier code. */
    private String customerId;

    /** Customer name. */
    private String customerName;

    /** Billing address information. */
    private Address billTo;

    /** Shipping address information. If missing, the billing address is also used as the shipping address. */
    private Address shipTo;

    /** Line items in order. */
    private List items;

    /** Date order was placed with server. This is added to the order information by the service. */
    private Date orderDate;

    /** Date order was shipped. This is added to the order information by the service. */
    private Date shipDate;
    ...
}

Listing 1. Sample service code and data model code (partial)

Listing 1 gives a simple example of a service interface, and a root data model class. Listing 2 shows a customization file for Jibx2Wsdl that adds additional information beyond what’s present in the Listing 1 source code. In this case the added information includes specifying the namespaces to be used in the WSDL and schemas, listing which values are required in each data class and which should be represented using attributes rather than child elements (the leading ‘@’ on the value names), and specifying the type of items contained in the collection.

<custom force-classes="true" namespace="http://ws.sosnoski.com/order/data"
    namespace-style="fixed">
  <wsdl namespace="http://ws.sosnoski.com/order/wsdl"
      wsdl-namespace="http://ws.sosnoski.com/order/wsdl"/>
  <package name="com.sosnoski.infoq.ex1">
    <class name="Order" requireds="@customerId customerName billTo items"
        optionals="orderId orderDate shipDate">
      <collection-field field="items" item-type="com.sosnoski.infoq.ex1.Item"/>
    </class>
    <class name="Address" requireds="street1 city @state @zip"/>
    <class name="Item" requireds="@id @quantity @price"/>
  </package>
</custom>

Listing 2. Jibx2Wsdl customizations
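The value-list notation used in Listing 2 is simple enough to decode by hand. The sketch below reads a requireds list and an optionals list and records, for each value, whether it is required and whether the leading ‘@’ marks it as an attribute. ValueSpec and ValueListParser are hypothetical names for this illustration; this is not Jibx2Wsdl’s own parser.

```java
import java.util.ArrayList;
import java.util.List;

/** One entry from a requireds or optionals list. */
class ValueSpec {
    final String name;
    final boolean attribute;   // true when the name was written with a leading '@'
    final boolean required;

    ValueSpec(String name, boolean attribute, boolean required) {
        this.name = name;
        this.attribute = attribute;
        this.required = required;
    }
}

class ValueListParser {
    /** Decodes the two space-separated lists into a single list of specs. */
    static List<ValueSpec> parse(String requireds, String optionals) {
        List<ValueSpec> specs = new ArrayList<>();
        addAll(specs, requireds, true);
        addAll(specs, optionals, false);
        return specs;
    }

    private static void addAll(List<ValueSpec> specs, String names, boolean required) {
        for (String token : names.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;   // an empty list yields one empty token
            }
            boolean attribute = token.startsWith("@");
            String name = attribute ? token.substring(1) : token;
            specs.add(new ValueSpec(name, attribute, required));
        }
    }

    public static void main(String[] args) {
        // Values taken from the Order class entry in Listing 2.
        for (ValueSpec spec : parse("@customerId customerName billTo items",
                                    "orderId orderDate shipDate")) {
            System.out.println(spec.name
                + (spec.attribute ? " attribute" : " element")
                + (spec.required ? " required" : " optional"));
        }
    }
}
```

For the Order class this reports customerId as a required attribute, customerName, billTo, and items as required elements, and the three service-supplied values as optional elements.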

Listing 3 shows selected portions of the WSDL and schemas generated by Jibx2Wsdl. You can see the JavaDocs extracted from the source code in the form of schema <xsd:annotation>/<xsd:documentation> components, and as WSDL <wsdl:documentation> elements. These generated artifacts may or may not be ready for final deployment – some added whitespace and formatting might help make the documents more human readable, for one thing – but they’re certainly at least a very good start on the final versions.

<wsdl:definitions ... targetNamespace="http://ws.sosnoski.com/order/wsdl/StoreService">
  <wsdl:types>
    <xsd:schema ... targetNamespace="http://ws.sosnoski.com/order/wsdl/StoreService">
      <xsd:import namespace="http://ws.sosnoski.com/order/data"
          schemaLocation="data.xsd"/>
      <xsd:element name="placeOrder">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element type="ns1:order" name="order" minOccurs="0"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="placeOrderResponse">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element type="xsd:string" name="string" minOccurs="0">
              <xsd:annotation>
                <xsd:documentation>assigned order identifier</xsd:documentation>
              </xsd:annotation>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      ...
    </xsd:schema>
  </wsdl:types>
  ...
  <wsdl:portType name="StoreServicePortType">
    <wsdl:documentation>Interface for placing orders and checking status.</wsdl:documentation>
    <wsdl:operation name="placeOrder">
      <wsdl:documentation>Submit a new order.</wsdl:documentation>
      <wsdl:input message="tns:placeOrderMessage"/>
      <wsdl:output message="tns:placeOrderResponseMessage"/>
    </wsdl:operation>
    ...
  </wsdl:portType>
</wsdl:definitions>

<xsd:schema ... targetNamespace="http://ws.sosnoski.com/order/data">
  <xsd:complexType name="order">
    <xsd:annotation>
      <xsd:documentation>Order information.</xsd:documentation>
    </xsd:annotation>
    <xsd:sequence>
      <xsd:element type="xsd:string" name="orderId" minOccurs="0">
        <xsd:annotation>
          <xsd:documentation>Unique identifier for this order. This is added to the order information by the service.</xsd:documentation>
        </xsd:annotation>
      </xsd:element>
      ...
      <xsd:element ref="tns:address" minOccurs="0">
        <xsd:annotation>
          <xsd:documentation>Shipping address information. If missing, the billing address is also used as the shipping address.</xsd:documentation>
        </xsd:annotation>
      </xsd:element>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence/>
          <xsd:attribute type="xsd:string" use="required" name="id"/>
          <xsd:attribute type="xsd:int" use="required" name="quantity"/>
          <xsd:attribute type="xsd:float" use="required" name="price"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
    ...
  </xsd:complexType>
</xsd:schema>

Listing 3. WSDL and schema (partial)
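A schema produced this way is an ordinary W3C XML Schema, so it can be used for document validation with the standard javax.xml.validation API. The sketch below validates a single line item against a cut-down schema modeled on the item definition in Listing 3. Two assumptions are made to keep it standalone: the item element is declared globally rather than nested inside the order type, and the schema is held in a string rather than read from data.xsd. ItemSchemaCheck is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ItemSchemaCheck {
    // Simplified schema: the three required attributes from Listing 3.
    static final String ITEM_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='http://ws.sosnoski.com/order/data'"
        + " elementFormDefault='qualified'>"
        + "<xsd:element name='item'>"
        + "<xsd:complexType>"
        + "<xsd:attribute type='xsd:string' use='required' name='id'/>"
        + "<xsd:attribute type='xsd:int' use='required' name='quantity'/>"
        + "<xsd:attribute type='xsd:float' use='required' name='price'/>"
        + "</xsd:complexType>"
        + "</xsd:element>"
        + "</xsd:schema>";

    private static Schema compileSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(ITEM_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema itself is invalid", e);
        }
    }

    /** Returns true when the document is well formed and valid against the schema. */
    static boolean isValid(String xml) {
        try {
            compileSchema().newValidator()
                           .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;   // malformed or invalid document
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ns = "http://ws.sosnoski.com/order/data";
        System.out.println(isValid(
            "<item xmlns='" + ns + "' id='A1' quantity='2' price='9.5'/>"));
        System.out.println(isValid(
            "<item xmlns='" + ns + "' id='A1' quantity='2'/>"));
    }
}
```

The first document passes; the second fails because the required price attribute is missing. Clients on any platform can apply the same kind of check, since it depends only on the schema and not on the Java data model.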

Once you have the basic sample code for a service constructed, it’s very easy to fill in more details and transform the data model to fit the needs of different stakeholders. The generated bindings can be used in combination with the WSDL and schema to deploy your services using the Apache Axis2 web services framework (or with XFire, or the soon to be released JiBX/WS), and the same data model can be used directly for Java clients. Naturally you can also use other frameworks and other clients – since the generated WSDL and schema define the service interface – but for initial development it’s a real convenience to be able to work with a single version of the code, and Jibx2Wsdl gives you a very easy way of doing just that.

Conclusion

The SOA community has embraced the idea that start-from-WSDL is always the right approach, but real world choices are more complex than this simple judgment would indicate. Starting from WSDL requires a high level of investment, both in terms of learning WSDL and schema and in working with the often cumbersome tools that support these formats. There’s a lot of up-front effort, and no guarantees that the results will even suit your needs, let alone be clean and well structured.

Start-from-code also has its own potential downfalls, including the possibility that you’ll unwittingly tie your service description to a particular implementation. But modern data binding frameworks allow you to isolate the details of the data model from the actual XML representation, and from a practical standpoint developers are always going to be more productive working in code than in WSDL and schema. In many cases, web service development is actually starting from existing code anyway, in the form of services implemented using some older technology. So no matter what opinions are given by experts, start-from-code is likely to remain an important part of web service development for a long time to come.

Regardless of the type of data binding and web services framework used, it’s possible to use start-from-code as a fast track to a working service. Once you have your service functioning properly and tested against your use cases, you can always choose to break the tie completely – just take your generated WSDL and schema definitions as a new starting point, and if necessary modify them to clean up any portions of the XML which don’t fit your organization’s needs. Then use the “final” WSDL and schema to generate new service provider code in your framework of choice, and convert your server application over to working with that code.

About the author

Dennis Sosnoski is a consultant and training facilitator specializing in Java-based SOA and web services. His professional software development experience spans over 30 years, with the last nine years focused on server-side XML and Java technologies. Dennis is the lead developer of the open source JiBX XML data binding framework and the associated Jibx2Wsdl tool, as well as a committer on the Apache Axis2 web services framework. He was also one of the expert group members for the JAX-WS 2.0 and JAXB 2.0 specifications. For more information, check his website or email him at enquiry@sosnoski.com
