
W3C Efficient XML Interchange format draft published

by Arnon Rotem-Gal-Oz on Jul 23, 2007
The W3C has recently announced the first public draft of the Efficient XML Interchange (EXI) Format, a proposal for compressing XML to make it more efficient both on the wire and on the CPU. As might be expected, it didn't take long for criticism of the new format to appear. Yes, another debate on binary XML is on its way.

The draft describes the proposed format as follows:
"The EXI format uses a hybrid approach drawn from the information and formal language theories, plus practical techniques verified by measurements, for entropy encoding XML information. Using a relatively simple algorithm, which is amenable to fast and compact implementation, and a small set of data types, it reliably produces efficient encodings of XML event streams"
Or, in plain English, a compression algorithm for XML. As expected, criticism soon followed. The first critic was Elliotte Harold, who simply said that
The Efficient XML Interchange Format is neither efficient nor XML nor interchangeable
Joe Gregorio said that they can call it what they want, but it is still binary XML, and on the XML-dev mailing list Michael Champion asked, "is it time for the binary XML permathread to start up again?" In the thread that ensued, a few people raised the question of how EXI differs from previous attempts at a binary interchange format, such as the Fast Infoset format (FI).

Santiago Pericas-Geertsen (an editor in the W3C XML Binary Characterization Working Group) responded to that question, saying that EXI is better than FI because it "knows" it is dealing with XML and not some general infoset. This prior knowledge allows EXI to produce more compact results. He added that EXI works in whole bytes, whereas FI works at the bit level, which makes EXI less CPU intensive. Santiago also said that internal tests of EXI performance showed promising results.

In any event, it is also interesting to note that the W3C Technical Architecture Group (TAG) was wary of the need for a binary XML format in a report it issued back in May 2005:
We therefore believe that the benefits of a binary XML must be
predictable and compelling in order to justify development of a
if XML 1.x is inherently capable of meeting the needs of users, then our
efforts should go into tuning our XML implementations, not designing new
formats. Benchmark environments should be as representative as possible
of fully optimized implementations, not just of the XML parser, but of
the surrounding application or middleware stack.
Will binary XML catch on this time around? Only time will tell.


Why not ASN.1 ? by - -

I have always wondered whether anybody considers ASN.1 for this class of communications (whether it is BER, PER, DER, ...). Can someone please help explain this to me?

Re: Why not ASN.1 ? by Paul Fremantle

The FastInfoset model discussed above is based on ASN.1

Elliotte's problem and why he is wrong by Paul Fremantle

I agree with Elliotte Harold that Efficient XML shies away from the true XML model. His concern is that EXI defines specific datatypes such as unsigned integer, date/time, etc. True XML is just strings. However, *any* binary XML format has to move away from some aspect of XML. The point of XML is that it puts generality ahead of most other concerns. To gain efficiency, you have to make a compromise or two, for example on readability.

In this case EXI is gaining efficiency by treating certain datatypes more efficiently than pure character strings. Is that justified? Well, in many cases, yes, it will be.

The real point, though, is that EXI is just *one* potential infoset serialization, and there just isn't going to be "one size fits all".
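
To illustrate the point about typed values versus character strings, here is a rough sketch in Python. It is not taken from the EXI draft and is not the EXI wire format: it only shows a generic variable-length unsigned integer encoding (7 value bits per byte, with the high bit marking that more bytes follow) and compares its size with the same number written as text. The function names `encode_unsigned` and `decode_unsigned` are made up for this example.

```python
def encode_unsigned(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant group first.

    The high bit of each byte is set when another byte follows.
    Illustrative only; not the EXI specification's encoder.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        group = value & 0x7F      # lowest 7 bits
        value >>= 7
        if value:
            out.append(group | 0x80)  # more bytes follow
        else:
            out.append(group)         # last byte, high bit clear
            return bytes(out)


def decode_unsigned(data: bytes) -> int:
    """Decode a value produced by encode_unsigned."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value
        shift += 7
    raise ValueError("truncated input")


number = 1234567
as_text = str(number).encode("utf-8")   # the "XML is just strings" form
as_typed = encode_unsigned(number)      # the typed, binary form

print(len(as_text), "bytes as text;", len(as_typed), "bytes as a typed integer")
```

The saving comes at exactly the cost the comment above describes: the typed form is no longer human-readable, and both sides have to agree that the value is an unsigned integer.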

Re: Why not ASN.1 ? by - -

Thanks Paul. I am glad to read we have several silver bullets.
