Debate: JSON vs. XML as a data interchange format

The debate about JSON vs. XML as a data interchange format has begun in blogspace, following the talk "JSON, the fat-free alternative to XML" (ppt) that Douglas Crockford, JSON's inventor and an architect at Yahoo, gave at XML 2006. JSON is a data interchange format whose design goals were to be textual, minimal, and a subset of JavaScript. It supports two structures, objects (unordered collections of name/value pairs) and arrays (ordered sequences of values), as well as four simple types: strings, numbers, booleans, and null. Some of the main points Douglas raised for why JSON is well suited as a data interchange format were:
  • It is simultaneously human- and machine-readable;
  • It supports Unicode, allowing almost any information in any human language to be communicated;
  • It is self-documenting, describing structure and field names as well as specific values;
  • Its strict syntax and parsing requirements allow the necessary parsing algorithms to remain simple, efficient, and consistent;
  • It can represent the most general computer science data structures: records, lists and trees.
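As a rough illustration of the two structures and four simple types described above, the following sketch parses a small JSON text with Python's standard `json` module. The record and its field names are invented for this example and do not come from Douglas's talk.

```python
import json

# Hypothetical record, invented for illustration only.
text = """
{
  "name": "Ada",
  "age": 36,
  "active": true,
  "nickname": null,
  "languages": ["English", "Français"],
  "address": {"city": "London", "postcode": "N1"}
}
"""

record = json.loads(text)

# Objects become dicts, arrays become lists, and the simple types map to
# str, int/float, bool, and None.
kinds = {key: type(value).__name__ for key, value in record.items()}
print(kinds)

# Serializing and parsing again yields an equal structure.
assert json.loads(json.dumps(record)) == record
```

Nested objects and arrays are how records, lists and trees are expressed; the Unicode string in the `languages` array survives the round trip unchanged.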
Douglas also listed and refuted common arguments against JSON:
  • JSON Doesn't Have Namespaces.  But every object is a namespace. Its set of keys is independent of all other objects, even exclusive of nesting. Also, JSON uses context to avoid ambiguity, just as programming languages do.
  • JSON Has No Validator. Ultimately, every application is responsible for validating its inputs. This cannot be delegated. A YAML validator could be used however.
  • JSON Is Not Extensible. It doesn't need to be. JSON is flexible. It can represent any non-recurrent data structure as is. New fields can be added to existing structures without obsoleting existing programs.
  • JSON Is Not XML. Douglas argued that JSON is much simpler than XML.
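Two of these points, that each application validates its own input and that new fields need not break existing programs, can be sketched in a few lines of Python. The `read_user` function, the `REQUIRED` table, and the message contents are all hypothetical names made up for this illustration.

```python
import json

# Hypothetical contract of an "existing program": the fields it needs
# and the Python type each one must have after parsing.
REQUIRED = {"name": str, "email": str}


def read_user(payload):
    """Parse a JSON message, validate it, and return (name, email)."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in REQUIRED.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"missing or invalid field: {field}")
    # Fields the program does not know about are simply ignored.
    return data["name"], data["email"]


old_message = '{"name": "Ada", "email": "ada@example.org"}'
new_message = (
    '{"name": "Ada", "email": "ada@example.org", '
    '"timezone": "UTC", "tags": ["admin"]}'
)

# The reader written against the old shape still works on the new one.
assert read_user(old_message) == read_user(new_message)
```

The validation here lives in the application itself, which is the position Douglas took; a schema-based validator would be a different design choice.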
Mike Champion, Program Manager for XML Standards at Microsoft, attended the talk and summarised his takeaways:
  • JSON took a very different approach than XML in a couple of areas: there is no version number because the spec is declared stable forever; it takes a non-draconian “be liberal in what you accept and conservative in what you produce” philosophy to be friendly to supersets (such as YAML www.yaml.org, which is big in the Ruby world)
  • The JSON folks are agitating for the infrastructure vendors to support a JSONRequest API http://www.json.org/JSONRequest.html  that addresses some limitations in XmlHttpRequest for AJAX environments. 
Mike also offered his own opinion: he sees JSON as a good solution for browser-server communication, "but not a serious contender for interoperability purposes...will we see JSON substituting for XML in SOAP messages, RSS feeds, etc.? Will people store JSON persistently? That will require JSON-flavored APIs, schema/'data contract' languages, query/transformation languages, apps, etc."

Dave Winer kicked off the debate, saying that JSON "proposes to solve a problem that was neatly solved by XML-RPC in 1998" and remarking on the complexity of a real-world JSON example: "look at how deep they went to re-invent, XML itself wasn't good enough for them..." Dave's post started a long discussion thread, in which Douglas Crockford responded, "The good thing about reinventing the wheel is that you can get a round one." Mike Champion summarized some of the key takeaways from the thread:
  • There seemed to be a lot more JSON fans than XML fans in that thread (maybe because the original post was just a wee bit inflammatory)
  • JSON may be something like 100x faster to parse than XML in today's browsers (but I doubt very much if the best JSON parsers are anywhere near that much faster than the best XML parsers ... it would be interesting to know!),
  • JSON parsing ends up with something akin to a  typed "business object" rather than an untyped DOM tree
  • To do that in XML requires yet another layer or two of cruft (a schema and a databinding tool)
  • The bottom line argument for JSON comes down to elegance -- it does what it does simply and cleanly, and simply refuses to worry about many of the things that complicate XML such as metadata (attributes), comments, processing instructions, a schema language, and namespaces.
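The point about typed objects versus an untyped tree can be illustrated with Python's standard `json` and `xml.etree.ElementTree` modules. This is only a sketch under assumed data: the item record and its element names are invented, and the hand-written conversion stands in for the schema and databinding layer mentioned in the thread.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in both formats.
json_text = '{"id": 42, "price": 9.5, "in_stock": true, "tags": ["a", "b"]}'
xml_text = """
<item>
  <id>42</id>
  <price>9.5</price>
  <in_stock>true</in_stock>
  <tags><tag>a</tag><tag>b</tag></tags>
</item>
""".strip()

# JSON: values arrive already typed as int, float, bool and list.
item = json.loads(json_text)
assert item["id"] + 1 == 43
assert item["in_stock"] is True

# XML: every leaf is text, so the caller converts each value by hand.
root = ET.fromstring(xml_text)
raw_id = root.findtext("id")
assert raw_id == "42"  # a string, not a number

item_from_xml = {
    "id": int(root.findtext("id")),
    "price": float(root.findtext("price")),
    "in_stock": root.findtext("in_stock") == "true",
    "tags": [tag.text for tag in root.iter("tag")],
}
assert item_from_xml == item
```

The explicit `int`, `float` and boolean conversions are the part a schema-driven databinding tool would normally generate.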
Mike also offered his own opinion that while XML isn't as elegant, it is sturdier and more flexible. JSON may look simpler now, but its own limitations will become apparent once people start testing it in new scenarios. The argument that JSON is better because it is more elegant "doesn't bode well for its ultimate success" if you consider historical examples of more elegant technologies that lost to heavier, more widely adopted ones, such as LISP vs. C.

Simon Willison added his own thoughts:
The sweet spot for JSON is serializing simple data structures for transfer between programming languages. If you need more complex data structures (maybe with some kind of schema for validation), use XML. If you want to do full blown RPC use SOAP or XML-RPC. If you just want a light-weight format for moving data around, JSON fits the bill admirably.

What do we lose from not using XML? The ability to use XML tools. If you’re someone who breathes XSLT that might be a problem; if like me your approach when faced with XML is to parse it in to a more agreeable data structure as soon as possible you’ll find JSON far more productive.

XML co-inventor Tim Bray also commented that JSON exists "to put structs on the wire", whereas with XML "it’s assumed that you might want to stream it in by the gigabyte, or load it into one of a many different in-memory data structures, or run a full-text indexer over the contents, or render it for human consumption, or, well, anything." Tim went on to describe when to use one vs. the other:

  • Use JSON... if you want to serialize a data structure that’s not too text-heavy and all you want is for the receiver to get the same data structure with minimal effort, and you trust the other end to get the i18n right, JSON is hunky-dory.
  • Use XML if you want to provide general-purpose data that the receiver might want to do unforeseen weird and crazy things with, or if you want to be really paranoid and picky about i18n, or if what you’re sending is more like a document than a struct, or if the order of the data matters, or if the data is potentially long-lived (as in, more than seconds) XML is the way to go.
Sam Ruby took a closer look at the bytes going over the wire, and Robert Sayre noted advantages of JSON, among them that because JSON is not XML, it is a lot easier to embed XML or HTML content.
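The embedding point can be sketched with Python's standard library. The HTML fragment below is a made-up example (deliberately not well-formed XML, because of the unclosed `<br>`), and the `body` field and element names are assumptions for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical HTML fragment; the unclosed <br> means it is not valid XML.
fragment = '<p class="note">Fish &amp; chips<br>to go</p>'

# JSON: the markup is just a string value. Only quotes, backslashes and
# control characters need escaping, so the tags stay readable.
as_json = json.dumps({"body": fragment})
print(as_json)
assert json.loads(as_json)["body"] == fragment

# XML: to carry the fragment as text, every markup character has to be
# escaped (or the fragment wrapped in a CDATA section).
wrapper = ET.Element("body")
wrapper.text = fragment
as_xml = ET.tostring(wrapper, encoding="unicode")
print(as_xml)
assert ET.fromstring(as_xml).text == fragment
```

Both round trips recover the original fragment; the difference is in how much escaping appears on the wire.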
Uche Ogbuji argues that both XML and JSON have their use cases:
Folks tried from the beginning to make XML right for data as well as documents, and even though I think the effort made XML more useful than its predecessors, I think it's clear folks never entirely succeeded. XML is much better suited to documents and text than records and data.
While claims that JSON is re-inventing XML may be debatable, it is a fact that Douglas Crockford did invent a new keyboard layout. :)