Stefan Tilkov Talks REST, Web Services and More


1. I'm Sadek Drobi and I am here at QCon with Stefan Tilkov. So, Stefan, how are you doing, and what have you been up to lately?

As some of your readers, or my former readers, whatever, may know, I've been working in the architecture space for the last few years with my company. We do a little work with customers both large and small. And in the last few years I've been writing and talking about and applying much of the REST stuff that I've been writing about here on InfoQ as well. So that still may be the topic that is dearest to my heart and that I spend the most time with. I've also actually written a book on REST in German -- unfortunately for many of you -- so if you can read German, buy that book. Otherwise, that's what I've been doing for the last few years.


3. That's a good thing.

Many people would say that I am pretty religious; I don't think I'm religious in any way, not in the literal way and not in any metaphorical way. I do actually believe that many of the things are simply useful in practice; I find them to be useful and reasonable concepts. And I have to say that I found that if I tone down the evangelism a little, it's actually much more convincing, which is nice. Because I now, in many cases, talk to people who are very skeptical about this whole REST thing, and the less I go into evangelism mode the more convincing this stuff gets. Because if you look at it from a neutral point of view there are certain things that simply make sense - that are sort of no-brainers that you can hardly argue against.

And there are some others that merit a little more discussion. But in general people nowadays are at least open to something else. It's not as exotic as it was before, and people adopt it far more easily than they did five years ago when others fought that battle.


4. So, would you still suggest web services, not REST style web services, for a client under some circumstances or situations?

Yes, but never because of technical reasons. I'm a consultant, my company is a consultancy, so we have to make money. We actually work in customer projects, we have learned to deal with reality, and sometimes reality forces you to do certain things that you don't consider to be the best technical option. So there may be reasons that make you do something else, something that you don't consider to be the best stuff. Your viewers can see that you are wearing a Haskell t-shirt, right? So you probably can't use Haskell everywhere you want to. Sometimes that's OK. There are bad reasons for that and there are also some very valid reasons. Like, for example, existing investments or existing skills, or a huge amount of stuff that is already there. That is often the case when we go to customers.

They have a very big investment in web services. They may have standard software packages -- ERP packages like those from SAP that actually expose a lot of services -- and they want to benefit from that. And we wouldn't gain that much from trying to evangelize something else, so that's one of the reasons. It might also be the case that we are actually fighting several battles at once, and we want to change several things. If you try to win all of those battles at the same time, that's pretty hard to do. So you focus on the ones that are the most important from your perspective.

And maybe the REST discussion may be something else. Maybe you want them to adopt a different way of developing stuff; maybe you want them to use a different programming language or move to web UIs. At least never talk about changing anything in the backend. Just get them to drop some sort of, I don't know, view infrastructure that you don't like, whatever. So there are many things. And quite often it's the case that we end up doing web services even though we know technically there's a better solution. We always say that but still we end up doing them. But I don't think there's actually a good technical reason. It would be the more diplomatic thing to say, like a typical consultant: it all depends - sometimes this is better, sometimes that is better. I actually believe that there are a lot of things that are great. I like messaging infrastructure, and I even like, from time to time, tightly coupled stuff like RMI or CORBA. If that's what you need, that's perfectly fine with me. And I also like REST. But beyond those choices I don't really see much of a place for WS-* web services these days.


5. Cool. One of the first things customers ask you about when you are suggesting REST is workflows and orchestration - how do we do it in REST? I mean, there doesn't seem to be an obvious mapping to something called orchestration in web services. How would you answer them?

That's actually very true. There's no accepted, standardized way of doing something similar to, let's say, the stuff that a BPEL engine does in the REST space. I'm not aware of any BPEL engine currently that can orchestrate RESTful HTTP stuff. In fact that seems pretty hard to do even from a theoretical standpoint. But there is some research. Cesare Pautasso is doing some research in that respect. He has an open source engine. I'm going to learn about that tomorrow or the day after tomorrow at his talk.

But it's still work in progress, it's still research. So if what you are looking for is BPM, a solution for business process automation - in that sense of the BPM term - then there is nothing that is offered in the REST space. But, having said that, there is actually a very strong relation between flows, between the flow through an application, and the REST model. In fact that's one of the under-appreciated aspects of REST: you can actually have the server provide information about what the next step in the flow is to the client in a very dynamic way. So maybe what you have to get used to at first is that stuff works differently, so there is no one-to-one mapping to the stuff you're used to from the WS-* web services side of things.

So there is no analogy to WSDL and there is no analogy to BPEL, but there are different ways to do this stuff. In fact, on InfoQ there is an article authored by Jim Webber, Savas Parastatidis and Ian Robinson, who describe how to actually drive an application's flow through a business process - the Restbucks example. It also forms the core of their upcoming book. And that shows very nicely how you can drive that. And we've been using that in applications to have the server provide a list of possible choices, a list of possible state transitions, to the client. If you think about it, that is actually what happens when you use the web on a daily basis. You have an application, which is the web browser, that talks to a web server. And that is actually machine-to-machine communication -- people always claim this is something different, but it's not.

It's one machine talking to another machine. Of course the one machine has a UI on top of it, but it is still application-to-application communication. And the forms and the links you use actually tell you what you can do next. So the generic browser application doesn't have to know anything particular or specific about the web server it talks to. It doesn't know whether it's talking to Google or to InfoQ or to anybody else. So you get back a page that provides you with a list of links that you can follow. And it provides you with, maybe, a form that you can fill in and then submit, which puts the application into another state. And the same concepts are absolutely applicable in any sort of scenario: you provide a set of stuff that's appropriate now and you drive the application through that.
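
The idea that the server tells the client which steps are possible next can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the article mentioned above: the order resource, its URIs, the link relation names (`payment`, `receipt`, `cancel`) and the `request` helper are all hypothetical, and the in-memory dictionary stands in for a real HTTP server. A real client would also pick the HTTP verb for each transition according to the media type.

```python
# Hypothetical in-memory "server": every representation carries the links
# (possible state transitions) that are valid in its current state.
RESOURCES = {
    "/orders/1": {
        "status": "awaiting-payment",
        "links": {"payment": "/orders/1/payment", "cancel": "/orders/1/cancel"},
    },
    "/orders/1/payment": {
        "status": "paid",
        "links": {"receipt": "/orders/1/receipt"},
    },
    "/orders/1/receipt": {"status": "completed", "links": {}},
    "/orders/1/cancel": {"status": "cancelled", "links": {}},
}


def request(uri):
    """Stand-in for an HTTP request; returns the representation at `uri`."""
    return RESOURCES[uri]


def run_flow(entry_uri, preferred_rels):
    """Follow server-provided links until none of the preferred ones is offered."""
    uri = entry_uri
    visited = []
    while True:
        representation = request(uri)
        visited.append((uri, representation["status"]))
        # The client only chooses among the transitions the server offered;
        # it never builds a URI on its own.
        next_rel = next(
            (rel for rel in preferred_rels if rel in representation["links"]),
            None,
        )
        if next_rel is None:
            return visited
        uri = representation["links"][next_rel]


steps = run_flow("/orders/1", ["payment", "receipt"])
for uri, status in steps:
    print(uri, status)
```

The client knows one entry URI and a few link relation names; everything else comes from the representations it receives.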


6. So, at the implementation level, if we talk in terms of REST verbs, how would that translate into REST?

Well, usually the safe thing that you can always do is perform a GET; that's the very definition of the GET method. It is also the most obvious benefit of having GET available for everything: you unlock information that sits in your application. Everybody can get to everything you want to expose. You can of course secure it, but if you want to expose it to a subset of users, or to all of the users, they can actually get at the stuff and do something. If they have issued a GET and have received a representation, some sort of document that supports hypermedia, then they will have a way to go from this representation, from this point in the application, to another. So this might be some links -- so if you've done a GET you'll get some links.

Depending on the representation type that you have received from the server you might use different verbs with those links. One example is the Atom Publishing Protocol, which actually defines a relation -- a so-called link relation that is part of an Atom document returned from the server and says link rel="edit". Now, this is the link that you are supposed to use when you want to edit this particular aspect. So the media type definition, which is an RFC, a standard, defines what these links look like, and that's actually the most RESTful you can get. You have a standardized media type description that tells you what the hypermedia links mean in the format. It actually means that you can implement a completely generic Atom client that can work with any kind of AtomPub server and can follow those links in a meaningful way. And you can decide to expose them to some user or do something automatically with them.
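
As a small illustration of the link rel="edit" idea, the following Python sketch pulls the edit link out of an Atom entry using only the standard library. The entry document and its example.org URIs are made up for the example; the Atom namespace and the rule that a link without a `rel` attribute counts as `alternate` come from the Atom format itself.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry, roughly as an AtomPub server might return it.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Product 42</title>
  <id>urn:example:product:42</id>
  <link rel="alternate" href="http://example.org/products/42.html"/>
  <link rel="edit" href="http://example.org/products/42"/>
</entry>"""


def find_link(entry_xml, rel):
    """Return the href of the first Atom link with the given relation, or None."""
    entry = ET.fromstring(entry_xml)
    for link in entry.findall(ATOM_NS + "link"):
        # In the Atom format a link without a rel attribute means "alternate".
        if link.get("rel", "alternate") == rel:
            return link.get("href")
    return None


edit_uri = find_link(ENTRY_XML, "edit")
print(edit_uri)
```

A generic client written this way depends on the media type and the relation name, not on how a particular server lays out its URIs.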


7. So, actually what you are telling me is that every request will give me a link to what I can do afterwards, right? It's in the message?

Yes, to a large degree, yes. There may be some things where you get back a representation that's, I don't know, maybe in a binary form; it's a JPEG that doesn't have any links in it. That is fine as well. But in fact this is something that I think the community has only recognized with this vehemence in the last two years or so: that the hypermedia aspect is actually the most important one. Some people always knew that, obviously, but the vast majority of the REST folks have now recognized that and put a very strong emphasis on the fact that this is the case -- that you actually get links or transitions provided with the stuff you get back.

If you get a document from the server that doesn't have any links in it, you should ask yourself whether this is REST. It may be more RESTful than, you know, the WS-*-style things that people do, but it's still not exploiting the full benefits. If you evaluate a RESTful solution, if you want to check it for RESTfulness, if you want to do the litmus test to see how RESTful it is, you should actually look at the number of URIs you have to know up front. Ideally that should be one; that's the upper limit. If you get one URI, the rest you always get by following links that you retrieve in the representations you get as responses to the messages you sent.


8. So, I guess this is one of the misunderstandings of REST, because what I see and what I hear about it is that clients often save all the links they need and use them in a more procedural manner than in a hypermedia manner.

Yes, I would claim that there is a distinction to be made. One case is where you actually store the URI you got back from the server. That is sort of OK, because those are URIs the server provided to you, and you may have an expectation that they are there for some time, so you can store them and access them again. That's fine, as long as you are able to deal with the consequences, like a 404 or a 410 -- it's gone, it was there before and it's now gone -- or with a redirect, one of the 3xx codes. Then you are actually acting very RESTfully.
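
What "dealing with the consequences" of a stored URI might look like can be sketched as a small decision function. The status code families are standard HTTP; the action names returned here (`use`, `follow`, `rediscover`, `forget`, `retry-later`, `give-up`) are a hypothetical client policy chosen for the example, not something prescribed by HTTP or by the interview.

```python
def status_family(code):
    """Map an HTTP status code to the name of its family (1xx to 5xx)."""
    names = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return names.get(code // 100, "unknown")


def handle_stored_uri(code, location=None):
    """Decide what to do with a previously stored URI, given the response.

    Returns an (action, uri) tuple. The actions are an assumed policy.
    """
    if code == 410:
        return ("forget", None)        # Gone: drop the stored URI for good
    if code == 404:
        return ("rediscover", None)    # Not found: start over from the entry URI
    family = status_family(code)
    if family == "success":
        return ("use", None)
    if family == "redirection" and location is not None:
        return ("follow", location)    # Follow the Location the server gave us
    if family == "server error":
        return ("retry-later", None)
    return ("give-up", None)


print(handle_stored_uri(301, "http://example.org/new-place"))
```

Unknown codes are still handled through their family, which is the point of reacting to families rather than to a fixed list of codes.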

You are reacting to the full set of response codes… You should actually react to the families of the response codes; you should be able to handle the different families as well. Then you get back something and act accordingly. The one thing you should never do on the client side, on the consumer side, is construct URIs according to some out-of-band knowledge. So, for example, if I have some interface on the server side that says the URIs have this form -- and you have a document that describes this form -- and says that if you want to access this stuff you do it by putting this value at this place in the URI string.

That's wrong because you make your client dependent on a particular structure of the URIs, and if you decide to restructure them then everything breaks, which is not what you want. So constructing URIs on the client side is bad unless you have received the recipe through hypermedia. One example is an HTML form with a GET. An HTML form, as you know, can have a method attribute that says method="post" or method="get", and then an action attribute that equals some URI. Now, if the method is GET you can view the HTML form as a link or as a URI generator.

The server provides you with the recipe: it gives you certain fields, selects and input fields; you type something in and then hit submit, and the browser actually transforms that into the URI of the resource. So, that's perfectly fine and that's something that's very often used in RESTful web applications. It's not used very often in machine-to-machine applications, which I think is a pity. I think there is a place for that and it should be used more often.
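
The "form as URI generator" idea can be shown for a machine client as well. In this sketch the form description (its `action` and field names) is assumed to have been extracted from a server response; the example.org URIs and the field names `q` and `category` are hypothetical. The client only fills in values; the recipe for the URI comes from the server.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical form description, as a client might extract it from
# <form method="get" action="/products/search"> with two input fields.
form = {
    "method": "GET",
    "action": "/products/search",
    "fields": ["q", "category"],
}


def uri_from_form(base_uri, form, values):
    """Build the target URI of a GET form from the values the client supplies."""
    if form["method"].upper() != "GET":
        raise ValueError("only GET forms map to a URI")
    # Only fields the server declared in the form are used, in form order.
    pairs = [(name, values[name]) for name in form["fields"] if name in values]
    target = urljoin(base_uri, form["action"])
    query = urlencode(pairs)
    return target + "?" + query if query else target


uri = uri_from_form(
    "http://example.org/catalog",
    form,
    {"q": "red shoes", "category": "footwear"},
)
print(uri)
```

If the server later changes the `action` or renames a field, a client that reads the form again keeps working, which is the difference from URI templates agreed on out of band.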


9. So if I'm receiving an Atom feed, I have links that guide me to more details of what I'm receiving. Say I'm receiving products and it gives me links describing more things about these products, and what I actually want is to show all of this in one grid, for example, on the interface. What would you do in this kind of situation? I want the main description and the rest of the description.

It's an excellent question. Actually there are many options that you can choose from. One is to do the first GET request and get back a list. And you usually don't get everything; you get a partial segment of the list with the first hundred elements, and you have a link to get to the next one hundred elements, and another link… You know, paging is one of those aspects that people associate with web applications, where you, you know, page through a result set from your Google search. But that's actually something that's very applicable to machine-to-machine communication as well.

It's sort of the iterator pattern or cursor pattern that we had in CORBA or RMI or whatever. It's actually easily solved using hypermedia because you just provide a link to the next segment of stuff that you get. Now you could retrieve the first one hundred and then do one hundred requests to get the particular stuff. Maybe that's too wasteful, maybe that's not the kind of interaction that you want, because you have to pay for the latency one hundred times to get the stuff -- unless you do it in parallel, whatever. So, one alternative is to have an overlapping resource that gives you the first hundred elements including the pictures. You could actually encode them within the document that is returned.

It's up to you; it's a question of how you as a server designer want to design your interface. So what I'm trying to get at is that there is not really a difference; there is no relaxation in the amount of thinking you have to put into this stuff. You still have to design good interfaces and you have to find the right granularity of interfaces. That granularity isn't the same for every possible client; they will have different needs. Some of them will just want some summary information, some of them will want the full information, some will be happy with issuing a thousand requests, others will want the one big bag of things. You just have to cater for a good set of possible needs and come up with the set of interfaces. The nice thing about the resource-oriented approach is that things can actually overlap. You can have them. You don't have to decide to pick one of these things, because that's always going to be bad for some of your clients.
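
The iterator or cursor pattern over hypermedia can be sketched as a Python generator that keeps following a `next` link until the server stops providing one. The page contents, the `/products?page=N` URIs and the `get` helper are hypothetical stand-ins for a real server and HTTP client; the only thing the client relies on is the `next` link relation.

```python
# Hypothetical paged collection: each page lists some items and, unless it is
# the last page, a "next" link to the following segment.
PAGES = {
    "/products?page=1": {"items": ["p1", "p2"], "links": {"next": "/products?page=2"}},
    "/products?page=2": {"items": ["p3", "p4"], "links": {"next": "/products?page=3"}},
    "/products?page=3": {"items": ["p5"], "links": {}},
}


def get(uri):
    """Stand-in for an HTTP GET returning a parsed representation."""
    return PAGES[uri]


def iter_items(first_uri):
    """Yield every item of the collection by following "next" links."""
    uri = first_uri
    while uri is not None:
        page = get(uri)
        yield from page["items"]
        # The server decides whether, and where, there is a next segment.
        uri = page["links"].get("next")


all_items = list(iter_items("/products?page=1"))
print(all_items)
```

Because the generator is lazy, a consumer that stops after the first few items never requests the remaining pages.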


10. Yeah, if I'm thinking rather from, let's say, a caching - if we are talking about caching because one of the good features you always said is caching -- and the problem is that the algorithm that is searching my product is very specific, so I'm sure that it will never be cached. Because each time the parameters, they are searching by word and the word will not - there is not a big chance that they will write the same URL for getting. So, if I put everything inside actually, I'm not getting any caching, right, so what would I do in that situation?

Well, again, there are so many options that it's hard to say what the specific best option is, what the answer is for your particular use case. But first of all, regarding caching, what people tend to forget is that caching can be done on multiple levels along the communication pipe between your client and your server. So of course your client can do some caching itself, which your browser for example does. It caches all the time.

Things are fast because your browser caches. Try clearing your caches and you will see how much stuff it actually pulls from there. Then you can have something sitting within your network: you have the client, and you have some forward proxy cache that lets you cache stuff that you can actually share with your co-workers who sit in the same network. So for example in the image example there might be certain popular products that other co-workers have looked at already, so if you retrieve the result and you externalize the product pictures you actually get them from the cache if they have been retrieved before. Now, maybe that's not the case, because you have totally different tastes and you look for totally different things.

Now at the other side of the wire sits the server. In front of the server back-end application -- you know, something that hits some sort of database and calculates stuff and builds the mails and does all that -- you can place a reverse proxy cache that will actually see all the traffic from all of the users, coming from anywhere. And it will have a very high chance of having a large number of cache hits, so you put the cache in front of the server and you serve a huge amount of data from the cache instead of deriving it from the backend application.

And I forgot one layer: between the back-end cache and the front-end cache you can have a content delivery network such as Akamai's or Amazon's CloudFront -- something that actually allows you to distribute the content to your users, who are geographically distributed across the world. So there are multiple layers of caches that you can have, and you can thereby increase the chance of a cache hit. You cannot be one hundred percent sure, in many cases that's obviously true, and then you will not benefit from it.
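
Why a shared cache close to the server sees so many hits can be shown with a toy simulation of the layers described above. All class names, the two users and the URI are hypothetical, and the sketch deliberately ignores expiry, validation and cache-control headers; it only counts where each request is answered.

```python
class Backend:
    """The origin application; counts how often it actually has to do work."""

    def __init__(self):
        self.calls = 0

    def fetch(self, uri):
        self.calls += 1
        return "representation of " + uri


class Cache:
    """One cache layer; on a miss it asks the next layer upstream."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}
        self.hits = 0

    def fetch(self, uri):
        if uri in self.store:
            self.hits += 1
            return self.store[uri]
        body = self.upstream.fetch(uri)
        self.store[uri] = body
        return body


backend = Backend()
reverse_proxy = Cache(backend)        # shared by all users
alice_browser = Cache(reverse_proxy)  # private cache of one client
bob_browser = Cache(reverse_proxy)    # private cache of another client

alice_browser.fetch("/products/42/picture")  # misses everywhere, reaches the backend
alice_browser.fetch("/products/42/picture")  # answered by Alice's browser cache
bob_browser.fetch("/products/42/picture")    # Bob's browser misses, the proxy hits

print(backend.calls, reverse_proxy.hits, alice_browser.hits, bob_browser.hits)
```

Three requests from two users reach the backend only once; a forward proxy or a CDN would simply be further `Cache` instances in the same chain.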


11. But do you see a scenario like this: I get an Atom feed of, let's say, three hundred resources, and for each resource I fetch a picture, which means that as one client I'm doing three hundred requests, but there is caching in all of the stuff, and that's why it works well. Does it work? Could it be one scenario of doing it?

Yes, that's one scenario that can absolutely work. But I'm absolutely sure that there are scenarios where it won't work. So you have to come up with a different solution. That's perfectly fine as well. So it's not a one-size-fits-all solution.


12. So, it's not a crazy idea to do such a thing?

Not at all, not at all.


13. Because it seems for example, let's say for a Java developer, it seems crazy, like you doing three hundred requests, after one request, but there is caching so there are some doubts there that it would work.

It's exactly the same discussion. If the two of us got together and built a Java client-server application with RMI in between -- you know an app server and an application client built with Swing or so -- we would have the exact same discussion. Do we want to have one big fat request retrieving everything or do we want to have several smaller requests. There is no single answer to that. It all depends on the design choice that matches this particular scenario.


14. But doesn't caching help in here?

Of course it helps, yes. But ultimately you may also pay a certain price for the overhead of the protocol, which again is a usual trade-off, right? I found that caching works in a much larger number of situations than people initially think. One example that I always experience is when I talk to people about their business applications, and they always say: "Well, I can understand how caching works for images and for static HTML files, but I have no idea how caching would ever work for my business data or my orders. How would my orders ever be cacheable? How could I possibly do that?" And the answer is very obvious. One aspect is that they might be orders from the past, the orders from last year. I'm pretty sure they won't change anymore - if they were fulfilled last year they will not change. And you can cache them, in theory, indefinitely. At least until 2038, or whatever the Unix date limit is. So you can actually return something and people can cache that, and the infrastructure, the intermediaries on the way to the client, can cache it. That makes a lot of sense.
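
A server could express "old orders never change" through response headers. The sketch below is one possible policy, not a recommendation from the interview: the order dictionaries, the `status` values and the choice of a one-year lifetime are assumptions. `max-age` and `no-cache` are standard `Cache-Control` directives. The last lines work out the 2038 remark, which refers to the largest value of a signed 32-bit Unix timestamp.

```python
from datetime import datetime, timezone

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def cache_control_for(order):
    """Pick a Cache-Control value for an order representation (assumed policy)."""
    if order["status"] == "fulfilled":
        # A fulfilled order is treated as immutable, so it may be reused
        # for a long time without asking the server again.
        return "max-age=%d" % ONE_YEAR_SECONDS
    # Open orders can still change: caches must revalidate before reuse.
    return "no-cache"


print(cache_control_for({"id": 17, "status": "fulfilled"}))
print(cache_control_for({"id": 18, "status": "open"}))

# Largest moment a signed 32-bit Unix timestamp can represent.
unix_limit = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)
print(unix_limit.isoformat())
```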

You can also use the validation model, where you have the client or the intermediary ask the backend "Has this changed?", you know, using ETags. So you provide a value, for example a hash value, as part of the initial response. And then for the next request the client issues a conditional GET that asks "Has this stuff changed or is this ETag still valid?" And if it is still valid the server simply says "Not modified" and you actually save bandwidth. If you have an intelligent implementation on the server side you can also save on processing power. So you get a lot of benefit from caching. And, oh yeah, one final thing: you can also invalidate a cache, if it's a back-end side cache, you know, a reverse proxy cache. All of the decent ones support some sort of mechanism to actively invalidate an entry in there. So if you notice in your back-end that something changes, you can send a message to the cache so that it throws the entry away. So you've got lots of options where that works just great.
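
The validation model with ETags can be sketched without a real HTTP stack. The function below plays the server side of a conditional GET: it derives an ETag from a hash of the body and answers 304 with an empty body when the client's `If-None-Match` value still matches. The function names and the order strings are made up for the example. Because this simple version hashes the full body on every request, it saves bandwidth only; saving processing power would need a cheaper validator, such as a stored version number.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag value from a hash of the representation."""
    return '"' + hashlib.sha256(body.encode("utf-8")).hexdigest() + '"'


def conditional_get(current_body, if_none_match=None):
    """Server side of a conditional GET; returns (status, headers, body)."""
    etag = make_etag(current_body)
    if if_none_match == etag:
        # The client's copy is still valid: 304 Not Modified, no body sent.
        return 304, {"ETag": etag}, ""
    return 200, {"ETag": etag}, current_body


# First request: the client has nothing cached yet.
status1, headers1, body1 = conditional_get("order 17: shipped")

# Second request: the client presents the ETag, the resource is unchanged.
status2, _, body2 = conditional_get("order 17: shipped", headers1["ETag"])

# Third request: the resource has changed, so the old ETag no longer matches.
status3, _, body3 = conditional_get("order 17: delivered", headers1["ETag"])

print(status1, status2, status3)
```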


15. So, back to the hyperlinks, we have frameworks today in Java, Ruby and C# that try to map to the REST model, but often it seems like very much a one to one mapping with methods which ignores in some way like, what about hyperlinks or hypermedia. Do you think that frameworks are constraining REST? Like available frameworks today are constraining REST program architecture?

I would possibly not say they constrain it, because essentially you can build a very RESTful application using Perl and CGI. There's nothing that stops you from doing that; it's just quite a lot of work. So frameworks, in whatever language, tend to try to simplify that. I think that more recent developments, both in the .NET space and the Java space, make many things a lot easier. So it's very simple, for example using JAX-RS, the Java API for RESTful Web Services, to expose something as a resource.

That's very easy. Essentially you write a class, and you can do a 20-line class and it would work very nicely and expose that stuff. It is pretty weak in its support for hypermedia, you're right; even on the server side it is pretty weak, because I don't think anybody has really had such a great idea yet. There are some experiments -- the Restfulie folks, whom you also interviewed on InfoQ, have very nice ideas, and the JAX-RS or the Jersey folks at Sun/Oracle are also playing around with stuff.

So there are some ideas popping up. That's from the server side of things, which I think is the easy part. I think the really hard part is the client side, and this is really at the moment, I think, in the second stage of research. So people have built first frameworks they play with, and really, if I had a good answer I would probably try to build a framework myself.

I don't really have a great answer on what a client API should look like, because it is pretty hard to find a good mapping of these concepts to some sort of programming model that would be equally valid for any kind of RESTful service. That's what I think is the problem with this. So, it's easy to come up with a programming model that works for one particular application or one particular kind of media type. It's pretty hard to come up with something generic. It may be impossible to come up with something that's totally generic, I don't know. So, we will see. I'm very sure that the next few months or years will bring very many interesting developments in the client-side hypermedia RESTful API area.


16. What is the framework that makes you feel most comfortable with writing REST, since you have experienced several?

In the Java space we have used Jersey a lot, and we are actually pretty happy with it. As I said, it's not perfect. It's a little bit lacking in the hypermedia aspects. But you can write something yourself, and as you usually know what kinds of media types you're dealing with, what kind of formats, what kind of stuff you want to send over the wire, you can actually come up with something very useful in that respect. And I'm not personally a .NET person, but I've heard from the .NET folks that they are very happy with the stuff that's actually built into the newer versions of .NET.

I also see that more and more web application frameworks, frameworks that are actually built for building web apps, come up with good ways of supporting both RESTful machine-to-machine communication and web applications, because essentially there's no difference. It shouldn't be that much of a difference. So Spring MVC, for example, has a nice model enabling you to do both from a single source base. So you can actually expose resources with both, let's say, an HTML UI and some XML or JSON representation.

And in the Ruby space I think Rails has become more and more RESTful with each release. And I think they have added great support, for example, for cache and ETag handling; that's very nice. And they try to do the very best that they can to actually follow as many of the REST principles as possible. Again, there's no great client API there either. But I'm happy using Rails when I use Ruby. I'm happy with Jersey and Spring MVC, and as far as I can tell the .NET stuff has become pretty good as well. Also I have heard that the ASP.NET MVC style of doing web applications is actually very nice, and actually a nice match with RESTful principles.

One thing that I think is missing is a well-accepted, widely deployed web framework for Java. Not a web services framework or RESTful web services framework, but rather a web framework. So Spring MVC is an exception. But the others, as far as I can tell, none of them actually embraces REST in the same way that Rails or even ASP.NET MVC does. So JSF is definitely not something that I would consider RESTful; I'm looking into it. I don't consider Wicket RESTful but, no, I may be wrong, we'll see.


17. I think that there is some work on Groovy that ….

I don't know much about Groovy, so I can't comment… I would expect Groovy on Grails, or Grails, to offer exactly the same stuff, but that's just an expectation. Our readers, our viewers, can actually comment on that.


18. And, when we talk REST, one of the first questions you get is about security, what's the quality of security, how do I do security? Most of the answers are quite simple, but I don't know if they are. Since you work for a lot of clients, can you give us your experience about security handling for your clients, for REST?

So, most of our clients actually still use web services, so that's the majority. And essentially what we do for them is exactly the same stuff that we do when we do REST, because we essentially use HTTPS. That's what happens in 99.99% of the cases. I don't think we actually have a single client who ended up using the WS-Security stuff, because it is usually a pain to use. It's very expensive in terms of performance and development time and everything else.

And usually HTTPS is good enough. So that means that usually in REST, simply using HTTPS is good enough as well. You actually get some benefits if you use HTTP the way it's supposed to be used, because your URIs actually mean something. So for example you can use your web server or your web intermediary to limit access to certain URIs to certain groups of users. You can also see in your log file what has been accessed and actually make more sense of that. And HTTP has nice security mechanisms in terms of authentication support; you've got an extensible authentication mechanism.

That said, there is a lack of something like message-based security as it's available in the WS-* world. And while you can of course send encrypted data, like encrypted documents, over open HTTP, this is not something that is standardized. You can use XML Encryption to send an encrypted XML document as, let's say, the body of a POST or a response to a GET. But there is no place to put standardized metadata about what kind of encryption algorithm you have used, as there is in WS-Security. So, if you want message-based security you either have to invent something on your own or you will have to rely on the WS-* side of things. So, while I claimed before that there are no technical reasons, that may actually be a technical reason, if you have a strong requirement for both message-based encryption or signature and standardized metadata support. Frankly, I've not seen that yet, but that doesn't mean that others don't have the problem and take the view that this is exactly the solution.


19. What about authorization, authentication or authorization for REST services?

So, authentication: there is an extensible mechanism built into HTTP. In many cases using basic authentication over HTTPS is perfectly fine and works very well. You can use digest authentication, and you can use extensible mechanisms. So for example if you look at the stuff that Google does, they actually have their own authentication scheme that they have defined to work with the HTTP protocol, and there are other mechanisms that pop up on the web and become more and more widely adopted, like OpenID for authentication or OAuth for authorization. They come from other parts of the community; they don't come from the enterprise community. But again, I think that the options are pretty good. I haven't yet had a problem where this actually turned into something limiting us from deploying this stuff in any way.
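
For reference, HTTP basic authentication is just a base64-encoded `username:password` pair in the `Authorization` header, which is why it should only travel over HTTPS. The sketch below builds and parses such a header with the Python standard library; the helper names are made up for the example, and the credentials are sample values.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an Authorization header for HTTP basic authentication."""
    raw = (username + ":" + password).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_auth(header_value):
    """Server-side counterpart: recover (username, password) from the header."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a basic authentication header")
    decoded = base64.b64decode(token).decode("utf-8")
    # The username cannot contain a colon, so split at the first one.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(parse_basic_auth(header))
```

Base64 is an encoding, not encryption: anyone who can read the header can read the password, so the confidentiality comes entirely from the TLS layer.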


20. Some people argue like with SOAP you got WSDL that documents the services so you can know what to do and REST doesn't have one. And some people came up with WADL [Web Application Description Language] and claim to be, claim it to replace with WSDL. What do you think of that, documenting my services, my REST services?

So, first of all a very large part of your WSDL in general is actually XML schema. So if you look at a typical WSDL file it consists of a huge either imports or consists of a huge set of document definitions in XML schema format. And you can of course use XML with RESTful services as well. You can invent your own media types, and XML document types and provide a schema for it and you can actually use that in just exactly the same way that you would do it for WS-* web services. The rest of the WDSL file, a lot of it is actually legacy crap.

It's stuff that's there because it once mattered, like for example support for RPC/encoded. The encoding stuff in SOAP 1.1 and SOAP 1.2 is actually pretty much considered obsolete by now and nobody is supposed to use it anymore, even though some people mistakenly do. But essentially the mechanisms that you have in your WSDL file to state that you are using this particular model are no longer relevant; everybody uses the single style, document/literal wrapped. Now, if you don't care about web services, you don't have to know about that, you don't have to worry about that. If you know something about web services, you probably recognize document/literal wrapped as the style that everybody uses. So, that part of WSDL is plainly unnecessary.

And then there is actually something that you can consider meaningful, which is the names of the operations, right? You have names of the operations. Of course names are just names; they don't really convey meaning, they don't tell you what this operation actually does, and they don't tell you what preconditions there are or whether you have to call those operations in any particular order or anything like that. So essentially you always accompany your WSDL file with additional documentation in a Word file, a PDF file, or some HTML document. Now, my claim is that you can do exactly the same with RESTful HTTP.

And we actually do this in customer projects, so you actually provide HTML documentation that describes the media types that you use, the link relations that you use, that gives examples of resources. Another beauty of this is that actually you can use the same mechanisms to deliver the documentation that you used to deliver your service, and you can link them together because the common model is hypermedia. So from your documentation you provide a link that presents you with a form, so you can put something in and hit submit and get back results.

And from the results, if you display them in HTML, you can have a link back to the documentation, and from there you can have a link to documentation that describes how to use this stuff with JSON. And if you have a mapping that says that if I have a URI and add JSON to the end I get the JSON representation, you can link to that and actually return that, or the XML or whatever you want, as well. So, you sort of activate your documentation. You use the same concepts: you have documentation, you have standardized machine-readable formats like for example XML Schema, and you can link that stuff together.
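
The idea of a mapping from a URI ending to a representation can be sketched roughly as follows in Python. This is only an illustration under assumptions: the suffixes, the media type table, and the function name are hypothetical and not taken from any project described in the interview.

```python
from typing import Tuple

# Hypothetical table: which URI suffix selects which media type.
SUFFIX_TO_MEDIA_TYPE = {
    ".json": "application/json",
    ".xml": "application/xml",
    ".html": "text/html",
}


def resolve_representation(uri: str, default: str = "text/html") -> Tuple[str, str]:
    """Split a URI into the underlying resource URI and the requested media type."""
    for suffix, media_type in SUFFIX_TO_MEDIA_TYPE.items():
        if uri.endswith(suffix):
            # Strip the suffix so all variants point at the same resource.
            return uri[: -len(suffix)], media_type
    # No known suffix: fall back to the human-readable representation.
    return uri, default
```

With such a mapping, an HTML documentation page can link directly to, say, the JSON variant of an example resource, so the documentation and the service stay connected through plain links.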

I actually think that this is the right answer. There are, of course, description languages such as WADL, and there are some other research projects that come up with new stuff. I think the best indication that it's probably not that good an idea is that it's not been widely adopted. So it's available. I think WADL is actually pretty good. I think it's as good as you probably can get with this idea. It's done by somebody, Marc Hadley from Sun, who understands REST. It's not somebody who wants to recreate WSDL; it's done by somebody who wants to provide a useful thing. But it's not widely adopted; I'm not aware of any public web API being documented in WADL.

And the Jersey implementation of JAX-RS generates WADL files by default, which is nice. You can use them to generate documentation, and I actually think that's a good idea. But there's another important aspect: if you look at Jersey, it actually generates the WADL files fully automatically from the source code. So it could just as well directly generate any sort of documentation. I'm not convinced that this is really that meaningful. I may be wrong, but I have not seen much adoption. I personally did not feel any need for it; I'm happy with the other option, but everybody has to make up their own mind about it.


21. So, I guess, the last question: how often are you torn between sticking to standards and doing your own custom things? And how do I decide? Should I always try to stick to existing standards, or can I sometimes extend?

I think that's a very tough question to answer. I think that obviously you should try to standardize as much as is meaningfully possible, and it very much depends on how much value you get from that. So a good example of this is Atom. Atom is a standardized format; it is an official RFC. Atom messages are self-descriptive, and they have a nice IANA-registered MIME type that comes with the message. So let's say I want to provide a service that notifies people of changes to my customer data, so every time any customer is changed I notify people.

The RESTful way of doing that would be to provide a feed, and people would poll; they would actually pull the information, and caching and conditional GET will ensure that it scales very nicely and performs well enough in most cases. Not if you have real-time performance needs, but in most cases, such as the one that I described just now, you don't have that. So you can use that model, and you then have to make a decision whether you want to use an Atom feed for that, because you can put that information in there, or something else. But using an Atom feed gives you benefits. For example, you can use any feed reader to subscribe to those changes, which is nice.
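
To make the polling model more concrete, here is a minimal Python sketch of how a server might answer a conditional GET for such a feed. It is a simplification under stated assumptions: the function names are hypothetical, the ETag is simply derived from the feed body, and the If-None-Match header is compared as a single exact value (real HTTP also allows lists of tags and weak validators).

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive an entity tag from the body; HTTP requires the surrounding quotes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_feed_get(
    feed_body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer a GET for the feed, honoring the If-None-Match request header."""
    etag = make_etag(feed_body)
    if if_none_match == etag:
        # Nothing changed since the client's last poll: send no body at all.
        return 304, {"ETag": etag}, b""
    headers = {"ETag": etag, "Content-Type": "application/atom+xml"}
    return 200, headers, feed_body
```

A client that remembers the ETag from its previous poll and sends it back gets a cheap 304 response as long as no customer has changed, which is what keeps frequent polling inexpensive.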

I mean, in my feed reader I can have a feed from InfoQ, I have several blogs that I follow, I have feeds from our company-internal applications, and in this particular scenario I have a feed that notifies me of changes to customers, which is nice. On the downside, I inherit stuff from Atom that I don't need. Like for example, maybe I don't need XML; maybe I would have been perfectly fine with a simple JSON file or maybe CSV, because my clients usually use Excel or something like that. So I have to decide whether the trade-off is worth it or not. If the ecosystem around this format, around this set of standards, is big enough, then that's reasonable. So, maybe another example, and then I will stop talking, is using HTML for human-readable stuff.

You can get a lot of benefit by simply providing HTML representations, sort of as the toString method of the web, right? So just as you can make debugging easy if your objects have a good toString method, you can make debugging in a distributed RESTful HTTP scenario easy if your resources have an HTML representation. That's a very meaningful thing, because everybody with any browser can access the stuff they're authorized to see and view the information, and the browser and HTML are a huge ecosystem that you can benefit from. So there is a case to be made for using those standard formats, and I think you should definitely seriously consider that before you build your own.
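
The "toString of the web" idea can be sketched roughly like this in Python: a generic function that renders any resource's fields and links as a small HTML page. The function name, parameters, and page layout are hypothetical and only meant as an illustration.

```python
from html import escape
from typing import Mapping


def to_html(title: str, fields: Mapping[str, object], links: Mapping[str, str]) -> str:
    """Render a resource as a minimal HTML page: its fields plus its links."""
    # Escape every value so that resource data cannot break the markup.
    rows = "".join(
        f"<dt>{escape(str(name))}</dt><dd>{escape(str(value))}</dd>"
        for name, value in fields.items()
    )
    # Each link carries its relation in rel, so a browser user and a
    # program can both follow it.
    anchors = "".join(
        f'<li><a rel="{escape(rel)}" href="{escape(href)}">{escape(rel)}</a></li>'
        for rel, href in links.items()
    )
    return (
        f"<html><head><title>{escape(title)}</title></head>"
        f"<body><h1>{escape(title)}</h1><dl>{rows}</dl><ul>{anchors}</ul></body></html>"
    )
```

Anyone with a browser and the right permissions can then inspect a resource and click through to related ones, without any special tooling.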


22. Would you use HTML as a data type, as a data format?

Maybe yes, in certain cases yes. Using microformats is one of the options for doing that: you would actually embed information into HTML. I haven't looked into it in much detail, but HTML5 is coming up and it has a new model for putting machine-readable information into HTML. And, again, it's a question of the size of the ecosystem. If this becomes widely adopted, then there's a strong benefit to providing information in a single place, not having two representations but just one. Because using the model of separating content from form in HTML using CSS, you actually don't have that much of a difference. But, again, you have all the stuff in HTML that you don't need, that you carry along, so maybe your decision is different in any particular scenario.
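
As a rough illustration of reading machine-readable data out of HTML in the microformats style, here is a small Python sketch that collects the text of elements marked with certain class names. The class names follow the hCard convention (fn, email); the extractor class itself is hypothetical and deliberately simple, and it does not handle nested properties.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class ClassTextExtractor(HTMLParser):
    """Collect the text of elements whose class attribute names a wanted property."""

    def __init__(self, wanted: List[str]) -> None:
        super().__init__()
        self.wanted = set(wanted)
        self.values: Dict[str, str] = {}
        self._current: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated names.
        for name in (dict(attrs).get("class") or "").split():
            if name in self.wanted:
                self._current = name
                self.values[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.values[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_properties(html_text: str, wanted: List[str]) -> Dict[str, str]:
    """Parse an HTML fragment and return the wanted properties found in it."""
    parser = ClassTextExtractor(wanted)
    parser.feed(html_text)
    return parser.values
```

The same markup serves as the human-readable page and as the data source, which is the single-representation benefit described above.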

May 28, 2010