InfoQ

W3C Efficient XML Interchange format draft published

Posted by Arnon Rotem-Gal-Oz on Jul 23, 2007

Sections: Enterprise Architecture
Topics: SOA, WS Standards
Tags: XML, W3C
The W3C has recently announced the first public draft of the Efficient XML Interchange (EXI) format, a proposal for compressing XML to improve efficiency both on the wire and on the CPU. As might be expected, it did not take long for criticism of the new format to appear. Yes, another debate on binary XML is on its way.

The draft describes the proposed format as follows:
"The EXI format uses a hybrid approach drawn from the information and formal language theories, plus practical techniques verified by measurements, for entropy encoding XML information. Using a relatively simple algorithm, which is amenable to fast and compact implementation, and a small set of data types, it reliably produces efficient encodings of XML event streams"
Or, in plain English, a compression algorithm for XML. As expected, criticism soon followed. The first to respond was Elliotte Harold, who simply said that
The Efficient XML Interchange Format is neither efficient nor XML nor interchangeable
Joe Gregorio said that they can call it what they want, but it is still binary XML, and on the XML-dev mailing list Michael Champion asked "is it time for the binary XML permathread to start up again?". In the thread that ensued, a few people raised the question of how EXI differs from previous attempts at a binary interchange format, such as Fast Infoset (FI).

Santiago Pericas-Geertsen (an editor in the W3C XML Binary Characterization Working Group) responded to that question, saying that EXI is better than FI because it "knows" it is dealing with XML rather than a general infoset. This prior knowledge allows EXI to produce more compact results. He added that EXI works in whole bytes, whereas FI works at the bit level, which makes EXI less CPU intensive, and that internal tests of EXI performance showed promising results.
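
To make the idea of an "XML event stream" more concrete, here is a minimal Python sketch of one generic technique that binary XML encoders can exploit: replacing repeated element names with small integer indexes. This is an illustration only, not the EXI algorithm or the Fast Infoset encoding; the function names `xml_events` and `encode_names` and the `SE`/`CH`/`EE` event labels are assumptions made for this example, and attributes and mixed content are ignored.

```python
import xml.etree.ElementTree as ET

def xml_events(xml_text):
    """Yield (kind, value) events: SE = start element, CH = characters, EE = end element.

    Simplification for this sketch: attributes and text that follows a child
    element are ignored.
    """
    def walk(elem):
        yield ("SE", elem.tag)
        if elem.text and elem.text.strip():
            yield ("CH", elem.text.strip())
        for child in elem:
            yield from walk(child)
        yield ("EE", None)
    yield from walk(ET.fromstring(xml_text))

def encode_names(events):
    """Replace each element name with its index in a table built on first sight."""
    table = {}
    encoded = []
    for kind, value in events:
        if kind == "SE":
            # A name is added to the table the first time it appears;
            # later occurrences reuse the same small integer.
            index = table.setdefault(value, len(table))
            encoded.append((kind, index))
        else:
            encoded.append((kind, value))
    return encoded, list(table)

doc = "<orders><order><id>1</id></order><order><id>2</id></order></orders>"
encoded, names = encode_names(xml_events(doc))
print(names)
print(encoded)
```

Each repeated `order` and `id` tag shrinks to a single small number, which is where much of the saving comes from in documents with many similar records.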

In any event, it is also interesting to note that the W3C Technical Architecture Group (TAG) was wary of the need for a binary XML format in a report it issued back in May 2005:
We therefore believe that the benefits of a binary XML must be
predictable and compelling in order to justify development of a
Recommendation.
...
if XML 1.x is inherently capable of meeting the needs of users, then our
efforts should go into tuning our XML implementations, not designing new
formats. Benchmark environments should be as representative as possible
of fully optimized implementations, not just of the XML parser, but of
the surrounding application or middleware stack.
Will binary XML catch on this time around? Only time will tell.
  1. Why not ASN.1?

    by - -

    I have always wondered why nobody considers ASN.1 for this class of communications (whether with BER, PER, DER, ...). Can anyone help explain this?

  2. Re: Why not ASN.1?

    by Paul Fremantle

    The Fast Infoset model discussed above is based on ASN.1.

  3. Elliotte's problem and why he is wrong

    by Paul Fremantle

    I agree with Elliotte Harold that Efficient XML shies away from the true XML model. His concern is that EXI defines specific datatypes such as unsigned integer, Date/Time, etc. True XML is just strings. However, *any* binary XML format has to move away from some aspect of XML. The point of XML is that it puts generality ahead of most other concerns. In order to gain efficiency, you have to make a compromise or two, for example on readability.

    In this case EXI gains efficiency by handling certain datatypes more compactly than pure character strings. Is that justified? In many cases, yes, it will be.

    The real point, though, is that EXI is just *one* potential infoset serialization, and there just isn't going to be "one size fits all".

  4. Re: Why not ASN.1?

    by - -

    Thanks Paul. I am glad to read we have several silver bullets.
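
The trade-off raised in the comments, typed values versus plain character strings, can be illustrated with a short Python sketch. It compares the size of an unsigned integer written as text with the same value in a variable-length binary form that uses 7-bit groups, a scheme common in binary formats. The article does not say whether the EXI draft uses exactly this layout, so treat it as an assumption; the function names `encode_unsigned` and `decode_unsigned` are hypothetical.

```python
def encode_unsigned(n):
    """Encode a non-negative integer in 7-bit groups, least significant group first.

    The high bit of each byte is set when another byte follows.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7
        if n:
            out.append(group | 0x80)  # more groups follow
        else:
            out.append(group)         # final group, high bit clear
            return bytes(out)

def decode_unsigned(data):
    """Decode the first value produced by encode_unsigned from a byte string."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value
    raise ValueError("truncated input")

number = 1234567
as_text = str(number).encode("utf-8")   # how plain XML would carry the value
as_binary = encode_unsigned(number)     # typed, variable-length form
print(len(as_text), len(as_binary))
```

The saving comes at the cost the commenters describe: the binary form is no longer readable as text, and the decoder has to know in advance that the bytes represent an unsigned integer.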
