InfoQ

W3C Efficient XML Interchange format draft published

Posted by Arnon Rotem-Gal-Oz on Jul 23, 2007 12:03 AM

The W3C has recently announced the first public draft of the Efficient XML Interchange (EXI) Format, a proposal for compressing XML so that it is cheaper to send over the wire and cheaper for CPUs to process. As might be expected, it didn't take long before criticism of the proposed format started to appear. Yes, another debate on binary XML is on its way.

The draft describes the proposed format as follows:
"The EXI format uses a hybrid approach drawn from the information and formal language theories, plus practical techniques verified by measurements, for entropy encoding XML information. Using a relatively simple algorithm, which is amenable to fast and compact implementation, and a small set of data types, it reliably produces efficient encodings of XML event streams"
Or, in plain English, a compression algorithm for XML. As expected, criticism soon followed. The first to respond was Elliotte Harold, who simply said that
The Efficient XML Interchange Format is neither efficient nor XML nor interchangeable
Joe Gregorio said that they can call it what they want, but it is still binary XML, and on the XML-dev mailing list Michael Champion asked "is it time for the binary XML permathread to start up again?". In the thread that ensued, a few people raised the question of how EXI differs from previous attempts at a binary interchange format, such as the Fast Infoset (FI) format.

Santiago Pericas-Geertsen (an editor in the W3C XML Binary Characterization Working Group) responded to that question and said that EXI is better than FI because it "knows" that it is dealing with XML and not some general infoset. This prior knowledge allows EXI to produce more compact results. He added that EXI works in whole bytes, whereas FI works at the bit level, which makes EXI less CPU intensive. Santiago also said that internal tests of EXI performance showed promising results.
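
To give a feel for where compactness can come from when an encoder knows it is handling a typed value rather than a character string, here is a minimal Python sketch. It is not taken from the draft: the function names are hypothetical, and the layout (7-bit groups, least significant group first, high bit set when more octets follow) is a common variable-length integer scheme that is assumed here purely for illustration.

```python
def encode_unsigned(value: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least significant first.

    Illustrative scheme only (an assumption, not quoted from the EXI draft):
    the high bit of each octet is set when another octet follows.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        group = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(group | 0x80)  # more octets follow
        else:
            out.append(group)         # last octet, high bit clear
            return bytes(out)


def decode_unsigned(data: bytes) -> int:
    """Decode the octets produced by encode_unsigned back into an integer."""
    value = 0
    shift = 0
    for octet in data:
        value |= (octet & 0x7F) << shift
        shift += 7
        if not octet & 0x80:          # high bit clear marks the last octet
            return value
    raise ValueError("truncated input")


# The same number as XML character data versus as a typed binary value.
text_form = "1000000".encode("utf-8")   # 7 bytes, one per digit
binary_form = encode_unsigned(1000000)  # 3 bytes
```

The saving shown is for a single value; real documents also carry element and attribute names, which this sketch does not attempt to model.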

In any event, it is interesting to note that the W3C Technical Architecture Group (TAG) was wary of the need for a binary XML format in a report it issued back in May 2005:
We therefore believe that the benefits of a binary XML must be
predictable and compelling in order to justify development of a
Recommendation.
...
if XML 1.x is inherently capable of meeting the needs of users, then our
efforts should go into tuning our XML implementations, not designing new
formats. Benchmark environments should be as representative as possible
of fully optimized implementations, not just of the XML parser, but of
the surrounding application or middleware stack.
Will binary XML catch on this time around? Only time will tell.

4 comments

  1. Why not ASN.1?

    Jul 23, 2007 4:02 AM by - -

    I have always wondered why ASN.1 isn't used for this class of communication (whether it is BER, PER, DER, ...). Can anyone please help explain this to me?

  2. Re: Why not ASN.1?

    Jul 23, 2007 7:54 AM by Paul Fremantle

    The Fast Infoset model discussed above is based on ASN.1.

  3. Elliotte's problem and why he is wrong

    Jul 23, 2007 8:00 AM by Paul Fremantle

    I agree with Elliotte Harold that Efficient XML shies away from the true XML model. His concern is that EXI defines specific datatypes such as unsigned integer, Date/Time etc. True XML is just strings. However, *any* binary XML format has to move away from some aspect of XML. The point of XML is that it puts generality ahead of most other concerns. In order to gain efficiency, you have to make a compromise or two, for example on readability. In this case EXI gains efficiency by encoding certain datatypes more compactly than pure character strings. Is that justified? Well, in many cases, yes, it will be. The real point, though, is that EXI is just *one* potential infoset serialization, and there just isn't going to be "one size fits all".

  4. Re: Why not ASN.1?

    Jul 23, 2007 11:20 AM by - -

    Thanks Paul. I am glad to read we have several silver bullets.
