Do you feel overwhelmed by the effort needed to learn a new Web API? Do you feel that a major chunk of your effort is spent dealing with incompatible data formats for the same object from different Web APIs? Duncan Cragg at ThoughtWorks is working on a project called 'The Object Network' that aims to eliminate this learning curve through common data definition and access formats and, most importantly, to amplify network effects by creating a globally linked data fabric. InfoQ spoke to Duncan Cragg to uncover the motivations behind this effort and to learn how API developers can publish to and extract from this data web.
InfoQ: What is the Object Network?
It's one of those things that are almost stupidly obvious and hard to refute when proposed, yet perhaps no-one believes is actually possible:
Instead of the 4000 APIs listed on the ProgrammableWeb site all being completely different and incompatible, why not have just one way of exposing data from these sites and services? If Twitter, Facebook, Flickr and Google all have users and photos, then why have four different ways of accessing them, in four different, incompatible formats? Just define a single way, a single format, so that we only have to write one client library: not four, and not four thousand!
The growth in APIs is impressive-looking: a sharp upward graph. But compare that with the growth of the Web in the 90's! I believe that we could achieve similar stratospheric growth in APIs - in the 'data Web' - if we could just agree on a few basic, simple protocols and formats. We need an 'HTML for APIs' - some conventions about retrieving JSON and some expectations about the shape of that JSON - it's really simple!
That's where the Object Network comes in.
There's a simple specification to follow for publishing into the Object Network, covering basic conventions around HTTP and JSON, plus any number of formats for common data, including contact information, calendar events, news, article and message feeds, media metadata, comments and reviews, products, etc.
By following these patterns, your site's data can join a global, linked-up 'data fabric'. Your data can be made accessible to clients or client code that already existed before you came along!
InfoQ: Why is the Object Network needed?
Well the problem is that every API is different and requires different protocols above HTTP and different parsing of the JSON responses. But we always strive in this industry to reduce inefficiency through shared code and common standards.
As a server developer, publishing to the Object Network can extend the reach of your data, through shared client code and interlinking between sites. You also don't have to spend so much time designing your interface, as all the protocol conventions and formats will be documented by the time you come to implement it. Or you can get involved with others to define new Object Network formats for special cases and domains. No doubt there will also be frameworks to ease the publishing of existing and new data into the Object Network.
As a client developer using Object Network interfaces, you get to re-use common client code that handles the fetching of links and caching of data, the conversion to basic HTML, etc. That code will also understand the common data types and can offer useful functions on top, such as plotting contact objects on a map. Last but not least, more data will be just a link away, allowing cross-site navigation and data consumption.
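As a rough illustration of the kind of shared client code described above, here is a minimal Python sketch of fetching linked objects with a cache. It is not code from the Object Network project: the class name `CachingClient`, the field names (`fullName`, `attending`, `title`) and the example.org URLs are all assumptions made up for this example, and an in-memory dictionary stands in for real HTTP requests.

```python
class CachingClient:
    """Fetches JSON-like objects by URL and remembers what it has seen."""

    def __init__(self, fetch):
        self.fetch = fetch      # any callable mapping a URL to a dict
        self.cache = {}         # URL -> previously fetched object
        self.fetch_count = 0    # how many times we really had to fetch

    def get(self, url):
        # Only call the fetcher when the object is not already cached.
        if url not in self.cache:
            self.cache[url] = self.fetch(url)
            self.fetch_count += 1
        return self.cache[url]

    def follow(self, obj, field):
        # Treat the value of `field` as a link and fetch its target.
        return self.get(obj[field])


# Hypothetical data standing in for two sites' published objects.
FAKE_WEB = {
    "http://example.org/contacts/alice.json": {
        "fullName": "Alice Example",
        "attending": "http://example.org/events/meetup.json",
    },
    "http://example.org/events/meetup.json": {
        "title": "Meetup",
    },
}

client = CachingClient(FAKE_WEB.__getitem__)
alice = client.get("http://example.org/contacts/alice.json")
event = client.follow(alice, "attending")
event_again = client.follow(alice, "attending")  # served from the cache
```

In a real client the `fetch` callable would perform an HTTP GET and parse the JSON response; passing it in as a parameter keeps the caching and link-following logic independent of the transport.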
InfoQ: What is missing in the Semantic Web or Linked Data standards, or REST's Hypermedia constraint, that the Object Network offers?
Well, most developers (including me) don't have the time or the patience to learn or do anything that seems too complicated or requires too much investment! Which is where I think the Semantic Web or Linked Data, and the more exotic REST patterns, interpretations and arguments scare developers off.
Basically, I don't see the average API developer deciding that RDF is a good fit for publishing their data. RSS 1.0 is a data point there. JSON-LD is possibly a start in the right direction, but I think it will still be seen as encumbered with too much Semantic Web baggage. It also doesn't seem to set any conventions around HTTP usage, or RESTful application. Further, there doesn't seem to be anything around data updates.
Likewise, I don't see evidence that the average developer is paying much attention to correct REST. If doing REST properly were seen as easy, then all the so-called 'REST APIs' would actually be RESTful!
Developers need a simpler pattern to work with: one that makes their lives easy, and delivers all the Semantic Web benefits of a common language for data, plus all the REST benefits of interoperability and scalability - 'mashability and cacheability'.
Most people couldn't tell you how to publish linked Semantic Web data in simple JSON, or how to satisfy each REST constraint, from Statelessness to Hypermedia via Self-descriptiveness.
But I show in just a few lines how to publish your data into the Object Network within simple JSON templates, how to have hyperlinks between your objects, and what to put into your HTTP messages. And data updates are part of the Object Network language.
Others can argue if it's truly RESTful, or better or worse than Linked Data - obviously I believe 'yes' and 'better', respectively! I just want to get people talking and sharing and delivering. Delivering into a global network of linked-up, dynamic data.
Obviously, there is a huge amount of value in the many Semantic Web and Linked Data sources, and in the vocabularies they use. All of that can be re-used and re-published into the Object Network as a vast pool of static data and object type semantics. There should be a reasonably mechanical conversion from rdf+xml resources to linked Object Network JSON.
Similarly, the so-called 'REST APIs' - all 4000+ of them - offer a vast, rich mine of useful data, all of which can be simply adapted into the Object Network, perhaps using tools like ql.io.
InfoQ: How do you propose to bridge the impedance mismatch between the object programming model and the data model in the Object Network?
I believe it's mostly accepted that distributed systems work best when your 'unit of distribution' is a decent chunk of data, not a process, procedure or method call. Similarly, things are more manageable when you operate asynchronously, statelessly and idempotently (you can repeat the same message twice), rather than synchronously and in brittle lock-step.
So I don't think anyone will complain too much that I'm not exposing an RMI interface to the objects in the Object Network!
The word 'Object' of Object Network actually derives from the 'O' in JSON: it's a serialised hashtable, in effect. So it perhaps looks more like a JSON DTO, but one with a strict format or type structure, a URL, content that links to other objects, and content that can change.
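To make that description concrete, here is a small Python sketch of what such an object might look like: a hashtable with a type, its own URL, and content that links to another object. The field names (`url`, `type`, `fullName`, `attending`) and the example.org addresses are hypothetical and chosen only for illustration; they are not taken from the Object Network specification.

```python
import json

# A hypothetical contact object. Every key name here is an assumption
# made for this example, not part of any published format.
contact = {
    "url": "http://example.org/contacts/alice.json",        # the object's own address
    "type": "contact",                                      # which format it follows
    "fullName": "Alice Example",                            # ordinary content
    "attending": "http://example.org/events/meetup.json",   # link to another object
}


def links(obj):
    """Return the values, other than the object's own URL, that look like links."""
    return [
        value
        for key, value in obj.items()
        if key != "url" and isinstance(value, str) and value.startswith("http")
    ]


# The object is just a serialised hashtable: it round-trips through JSON.
serialised = json.dumps(contact, sort_keys=True)
restored = json.loads(serialised)
outgoing = links(restored)
```

Here `outgoing` holds the single event URL, which a client could fetch next to navigate from the contact to the event.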
I see Object Network objects as having their public state 'animated' by their server.
They should ideally work through state dependency - like a spreadsheet: 'this object's state depends on that object's state'. I've published some detail around this programming model - I call it 'Functional Observer' or 'FOREST' - in a chapter in the book 'REST: From Research to Practice' (plus see http://forest-roa.org - note that I called it the 'Object Web' in some of that older material).
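The spreadsheet analogy can be sketched in a few lines of Python. This is only an illustration of the general idea of state dependency, not the FOREST implementation: the class `Obj`, its methods and the `total_rule` function are hypothetical names invented for this example.

```python
class Obj:
    """An object whose state may be recomputed from the objects it observes."""

    def __init__(self, state, rule=None, observes=()):
        self.state = dict(state)
        self.rule = rule                # function(own_state, observed_states) -> new state
        self.observes = list(observes)
        self.observers = []
        for other in self.observes:
            other.observers.append(self)
        self.refresh()

    def refresh(self):
        # Recompute our state from the observed objects, like a spreadsheet cell.
        if self.rule is None:
            return
        new_state = self.rule(self.state, [o.state for o in self.observes])
        if new_state != self.state:
            self.state = new_state
            self.notify()

    def update(self, **changes):
        # Change our own state directly and tell dependants about it.
        self.state = {**self.state, **changes}
        self.notify()

    def notify(self):
        for observer in self.observers:
            observer.refresh()


def total_rule(own, observed):
    # "This object's total depends on those objects' prices."
    return {**own, "total": sum(s["price"] for s in observed)}


apple = Obj({"price": 3})
pear = Obj({"price": 4})
basket = Obj({"total": 0}, rule=total_rule, observes=[apple, pear])
apple.update(price=5)   # basket's total is recomputed as a consequence
```

Nothing ever tells `basket` to set its total; it changes only because the state it depends on changed, which is the essence of the spreadsheet analogy.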
But that whole area is optional: you can animate your objects any way you like behind the Object Network interface. I'm focusing on explaining the simplest patterns first - so only go there if you're up for the advanced material!
InfoQ: What is the current status on the Object Network and what are next steps?
Well it's been mostly ThoughtWorks clients and colleagues that have been involved so far, but I'm seeing a lot of outside interest in the current Object Network blog series. I'm also regularly doing talks and writing articles explaining the ideas - any chance I get! (It's changed name a few times, but now that it's gaining traction, I will call it the 'Object Network' from now on.)
I have some JavaScript code on GitHub (that needs to be brought up to date) which implements a browser client - an Object Network viewer. This allows you to navigate through the linked objects, picking up changes by polling or refresh. It also understands the various object types, so it can offer functions like viewing contacts on a map. The plan is to also offer this as a JavaScript library that can be re-used in dedicated client applications consuming Object Network data. It should also run on the server in Node.js, for normal page assembly.
Looking further out, I have a Java codebase also on GitHub called 'NetMash' which has an Android Object Network viewer and a server. That's much more experimental right now: it has more advanced features in it around asynchronous and bidirectional updates of data.
InfoQ: How can one get involved in the Object Network project?
Mandatory actions are: (1) Join the Google Group and (2) Follow the blog series as it's written!
Then get publishing! It's really simple - you can use your existing Web stack with simple JSON templates, or even publish static or pre-generated JSON files. Like I said, the first data we need to publish is that available already in less accessible forms. We'll need to work on some server-side framework or library support, no doubt, to ease publishing.
Then we can view it all in the JavaScript viewer, or you can start writing special apps based on the JavaScript library code - when that's ready. Of course, I could absolutely use the skills of a JavaScript expert to work with me on that code.