Article: Webber, Parastatidis and Robinson on "How to GET a Cup of Coffee"
In a new article, Jim Webber, Savas Parastatidis and Ian Robinson show how to drive an application's flow through the use of hypermedia in a RESTful application, using the well-known example from Gregor Hohpe's "Starbucks does not use Two-Phase-Commit" to illustrate how the Web's concepts can be used for integration purposes.
While many people have grasped the utility of REST for simple cases, the authors show how to get more value out of the REST core concepts, specifically the "hypermedia as the engine of application state" principle. They show how links, included within resource representations retrieved from the server, can enable the client to find out which possible transitions are available from a particular point in the overall application flow.
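A minimal sketch of the idea, assuming an order representation shaped like the one discussed in the comments below (a `next` element in the `http://example.org/state-machine` namespace carrying `rel`, `uri` and `type` attributes). The exact XML, the host name and the `available_transitions` helper are illustrative assumptions, not the article's own listing.

```python
import xml.etree.ElementTree as ET

# Hypothetical order representation; element and attribute names follow the
# discussion in the comments, the concrete values are made up for illustration.
ORDER_XML = """\
<order xmlns="http://starbucks.example.org/">
  <drink>latte</drink>
  <next xmlns="http://example.org/state-machine"
        rel="http://starbucks.example.org/payment"
        uri="http://starbucks.example.org/payment/order/1234"
        type="application/xml"/>
</order>
"""

# ElementTree spells a namespaced tag as "{namespace}localname".
NEXT_TAG = "{http://example.org/state-machine}next"


def available_transitions(representation):
    """Map each link relation found in the representation to its target URI."""
    root = ET.fromstring(representation)
    return {link.get("rel"): link.get("uri") for link in root.iter(NEXT_TAG)}


transitions = available_transitions(ORDER_XML)
for rel, uri in transitions.items():
    print(rel, "->", uri)
```

The client only needs to understand the link relations; the URIs themselves come from the server with each response.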
Find out more about "How to GET a cup of coffee", the RESTful way.
Very well done
Good explanation of "ST" of REST.
1. After paying for the order, what would the next state for the customer be and how would he/she get there? In a real Starbucks the barista would notify the customer via a push - "Roy's drink is ready" and the customer would pick it up. But that's not very RESTful. Instead would the customer poll a feed of drinks that are ready? And perhaps the feed could have a URL telling them where to pick up their drink(s).
2. I was a little confused by your use of the 'rel' tag. After placing an order the customer is directed to the payment URL via the 'next' tag. The xml namespace describing the semantics of the next tag points to example.org/state-machine. However the 'rel' tag points to starbucks.example.org/payment for the definition of the payment value. I'm used to having all the possible values of the rel tag defined in the 'next' tag's xmlns, and the rel tag having one of those simple values like "payment", "alternate", etc. Either example.org defines the complete semantics of 'next', or if starbucks has extended it both namespaces should be mentioned in the next tag. At least that's how I understood things but maybe I'm wrong. Anyway this is a minor detail.
Re: Two Questions
Thanks for your questions. A partial reply to your first only, for now:
I think, as you suggest, some form of client polling will work here. But note that the barista doesn't so much "push" the finished coffee as announce or publish its being completed: it's up to the customer then to fetch the coffee. In this way, some of the reliability issues surrounding the delivery of the coffee and the completion of the transaction are delegated to the customer. If the customer's still in the store and in earshot, he or she can act on the notification and get the coffee: there's no need for some expensive table service (bus) to guarantee delivery from the barista to the customer.
We could have the barista publish to some sort of feed of completed orders, an Atom feed for example, with each customer then polling the feed at intervals in order to determine the status of his or her order. Customers in this particular scenario don't require subsecond updates, but they might reasonably expect to learn their coffee's ready within say 10 seconds of its being completed. As George Malamidis has pointed out, many clients polling a feed every 10 seconds or so poses some pretty significant challenges.
I don't think an Atom feed is appropriate here. A feed of events is useful when the feed consumers are interested in all or most of the events. But here, because they're only interested in the status of their own particular order, each customer will likely have to parse the feed and discard the majority of entries. (But see the FriendFeed update feed and Simple Update Protocol for an example of using feeds to publish events to clients, each of which is interested in only a fraction of the events.)
Better to create an order status resource per order. Customers can GET (or conditionally GET) a representation of this resource at frequent intervals, and the barista can PUT to it as he or she makes the coffee. We let the customer know about this resource when they first place their order by returning 202 Accepted and setting the Location response header to the URI of the new order status resource.
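To make the polling idea concrete, here is a rough in-memory sketch with no real HTTP involved. The `OrderStatusResource` class, its method names and the status strings are assumptions for illustration, and the ETag is a bare hash rather than a properly quoted header value.

```python
import hashlib


class OrderStatusResource:
    """In-memory stand-in for a single order status resource."""

    def __init__(self, status):
        self.put(status)

    def put(self, status):
        # The barista replaces the whole representation; the ETag changes with it.
        self.status = status
        self.etag = hashlib.sha256(status.encode()).hexdigest()[:8]

    def get(self, if_none_match=None):
        # Conditional GET: answer 304 with no body if the client's copy is current.
        if if_none_match == self.etag:
            return 304, self.etag, None
        return 200, self.etag, self.status


resource = OrderStatusResource("preparing")

code1, etag, body1 = resource.get()                    # first poll: full response
code2, _, body2 = resource.get(if_none_match=etag)     # nothing changed yet
resource.put("ready")                                  # barista finishes the coffee
code3, _, body3 = resource.get(if_none_match=etag)     # customer learns it's ready
```

Most polls end in the cheap 304 branch, which is what makes frequent polling tolerable.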
We can create this order status resource wherever we like. That is, we can shard the order status URI space, with particular shards servicing particular geographies. Better still, we can cache each order status and use something like Mark Nottingham's cache channels to control the freshness of these responses in a reasonably fine-grained manner. We might specify a max-age of only 5 seconds, but using cache channels, a cache can prolong the freshness of an order status as long as:
1) it (the cache) continues polling the cache channel at least as often as the cache channel's precision, and
2) the cache doesn't receive a stale event related to that resource from the cache channel.
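Read literally, the two conditions amount to a small predicate. The sketch below is a simplified reading, not the cache channels specification itself; the function name, parameter names and the numbers in the usage lines are all assumptions.

```python
def is_fresh(age, max_age, since_last_poll, precision, stale_event_seen):
    """Decide whether a cached order status may still be served.

    All times are in seconds. A response within max_age is fresh anyway;
    beyond that, freshness is extended only while both conditions hold.
    """
    if age <= max_age:
        return True
    polled_recently = since_last_poll <= precision   # condition 1
    return polled_recently and not stale_event_seen  # condition 2


# Past max_age, but the channel was polled in time and reported nothing stale.
extended = is_fresh(age=60, max_age=5, since_last_poll=10,
                    precision=30, stale_event_seen=False)

# Same situation, but the channel announced that the resource went stale.
invalidated = is_fresh(age=60, max_age=5, since_last_poll=10,
                       precision=30, stale_event_seen=True)
```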
Note that cache channels use Atom feeds for communication between the origin server and the cache or cache hierarchy - which makes perfect sense since the cache is interested in all or at least a significant portion of events coming off the channel.
The result: many instances of customers hitting the caches, and many fewer instances of caches hitting the origin servers, either to revalidate stale responses or poll cache channels.
Multipart vs. Atom vs. your own format
Even better than multipart, why not just send back a comma delimited list of order URIs?
One of the things (but not the only thing) that attracted me to REST was that you could focus on the data format you were exchanging between your services and not tunnel your interactions with an intermediary protocol. So far, Atom to me is just a sexier replacement for SOAP.
BTW, I liked the article.
JBoss, a division of Red Hat
A few things
* It would be good to explicitly point out that the representations are just that -- representations of state. I've seen many services where the author feels the need to wrap things with "RequestFoo" and "ResponseFoo" elements, or "Message" elements.
* You say 404 means that the service is too lazy or secure to give a real reason for denying the request. Using 400 and 404 is more appropriate for those, respectively; 404 has defined semantics that some client software will act upon.
* Using OPTIONS seems a bit contrived here, and it should be pointed out that it isn't required; the client is allowed to just try the request and see if it succeeds.
* Expect/Continue is really for cases where you have a long request body and want to see if the URI, method and headers are acceptable before sending the whole thing. It also has some interop problems (i.e., it isn't widely well-supported), so again it's probably easier to just send the request. I understand you're trying to illustrate a point here, but again it's contrived and doesn't get to the heart of the real utility of E/C.
* You update an order with a PUT. My notes on the printout say "No, that's f**ked". This is a flagrant abuse of the semantics of PUT; if you want to combine a representation with the existing state of the resource, use POST or the emerging PATCH.
* You ask the service if it's still possible to PUT using OPTIONS. Again, this leads people to believe that they should OPTIONS before every request, which isn't good practice, or necessary. Furthermore, it's a poor indicator of whether the PUT will be allowed, because if you're updating state that quickly, it very well may change between requests. General consensus is that OPTIONS metadata isn't that fine-grained in time.
* You model your payment resources as /payment/order/1234; why not /order/1234/payment, to leverage relative URIs?
* It would be really nice to see the service interacting with a link that the client gives it (e.g., to a bank account for payment). A lot of service authors make the mistake of only defining the interfaces they provide, rather than the ones that they consume; playing both roles is much more powerful.
* Note that Expires isn't the only mechanism that will keep the order list up-to-date here; POST to the order list will invalidate any intervening caches (as per RFC2616 section 13.10). Unfortunately, this isn't widely implemented, but we've now got it in Squid 2.HEAD, and it should be in release tarballs soon. At Y! we've also been working on communicating invalidations generated in this fashion between peers to keep a cluster in sync, and doing stuff with the grouping mechanism to let them invalidate other URIs. But I digress.
* Intermediaries are one of the more powerful actors in this kind of deployment; not only for caching/acceleration, but also load balancing, routing, and other services. It would be nice to illustrate that (perhaps with an extra barista?).
* Webber, you wrote this whole article just so you could say "The Technology you're about to enjoy is extremely hot", didn't you? :)
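On the relative URI point in the list above, a quick sketch using Python's standard `urljoin`; the host name is an assumption. With payment nested under the order, the link can be a short relative reference, whereas the `/payment/order/1234` layout needs the full path spelled out.

```python
from urllib.parse import urljoin

# Hypothetical order URI; the trailing slash matters for relative resolution.
order_uri = "http://starbucks.example.org/order/1234/"

# Nested layout: the payment link is just "payment" relative to the order.
nested_payment = urljoin(order_uri, "payment")

# Layout used in the article: the link has to restate the order id in an
# absolute path.
flat_payment = urljoin(order_uri, "/payment/order/1234")

# Without the trailing slash, the last path segment is replaced, not extended.
no_slash = urljoin("http://starbucks.example.org/order/1234", "payment")

print(nested_payment)
print(flat_payment)
print(no_slash)
```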
A Restful BPM engine in SaaS mode
Interesting paper; I'd like to share a practical experience with this kind of approach. At RunMyProcess (www.runmyprocess.com) we offer a fully featured BPM engine which technically relies on ReST+Atom (this combination is extremely powerful IMHO). We started to build it 2 years ago, and it's been GA since June 2008. This platform is delivered as "BPM as a service", meaning it is a "software as a service" platform.
Beyond ideology, what does ReST+Atom bring to a BPM engine? From the user point of view, mainly an extreme ease of integration. A process can be started by a standard HTML form (it's just a post); any data configured within the engine or computed by the engine can be added in any portal, they're all just "RSS" feeds. This last point includes workflow tasks, process monitoring reports, real time BAM data (business activity monitoring), etc. Just feeds; besides, imagine that a feed is a resource like any other, meaning they can be used during process execution: typically useful to build control process in a couple of hours (minutes when you get skilled ;-).
Besides during execution we allow interactions with ReST "services", of course, but also with WS-SOAP, FTP servers or email smtp/pop. From the user point of view all the technical complexity is hidden, they become all equivalent, embedded into a restful process.
ReST+Atom is the technical stack which will help democratize BPM: together with BPMN and the "as a service" model, it should put an end to overcomplicated, endless BPM projects and heavy spending.
That combination also opens the door to a largely new spectrum of BPM projects: multi-enterprise business processes.
Enough blah-blah... You're all welcome to register and try, trial is fully featured and free ;-) (www.runmyprocess.com)
Good but not enough
As far as I know, it is the first and most complete article on how to use REST and this is cool!
So first, thank you for describing such a sample!
I do, however, have some points to raise.
About the next links system, how to handle more complex (and more real-life-business-like) cases like:
* When you have a split with multiple paths to follow concurrently after one transition? (note this would not make much sense within the lifecycle of one resource, but here we have three resources managed in the same workflow: order, payment and drink)
For instance, we could imagine that payment of the drink would set the order in the paid state (that's the process you've described) but also set the myStarbucks account in the created state (if the first order at Starbucks would automatically create a myStarBucks account for further loyalty or partnership offers)
* When you have a long-running process (which is not the case here as processes you've described all complete in a few minutes), how to ensure new next links will not set resources in odd states? (by long-running process, I mean several days, weeks, months or even years and versioning is a tough issue with such processes)
For instance, imagine a coffee takes some days to be ready (OK, that's not realistic for coffee, but many real-life business processes behave this way), and we change the process to require customers to fill in a customer service survey after paying and before receiving their drink (i.e. the next link appearing on the payment resource no longer points to the drink received state but to a survey completed state).
For the last next link (the free-offer link, should be figure 20...), from which resource is it extracted? Does this link belong to the order resource or the payment resource?
If it belongs to the order resource, is there a way to determine programmatically which next link to follow? (the payment link or the free offer one)
If it belongs to the payment resource, will the barista be able to see this link when checking a customer payment? (and if the barista is not human, how do we prevent it from following this - irrelevant - next link?)
The latter introduces issues about automation.
In the article, all is clear: we know what to do to go from one state to the other, we understand a conflict's root cause, we could even understand an unknown state (a myStarbucks account created or a customer survey completed).
But this is barely because we are humans and we natively understand the semantics of the process (as we know the process very well, at least, from a customer point of view).
What if we were programs? (i.e. a Java/JEE or PHP web application - I know, saying such a thing does not really flatter us ;-) )
By this I mean: how could distributed programs (and a program has a limited "smartness" capability) reach the same conclusions and act the way we do?
To take another example than the previous one (the barista who should not follow the next link embedded in the resource it gets):
When updating an order, if there is a conflict and if I am a program, I can conclude the following:
- The difference between what I want to PUT and what exists is an additions XML element
- Since there is a conflict, I cannot PUT my additions XML element
And that's all! I cannot conclude what to do next: should I drop my update request (which is the case in the article) or should I send another update (a different addition) because I used a value for the additions element that cannot be processed at this time? (The latter means that the addition we want is invalid only because of the current state of the resource, not for all states, even though it is semantically valid.)
Is there a way to handle the next-link system, discovered on the fly, accurately enough to let a computer use it without blowing up our resources?
In the 3rd story, you indicate in this model, state-machines and workflows are discovered and described on the fly and, because agreements have been made on the semantics (i.e. the set of transitions and states to handle - whether now or in several years -), clients will always know what to do with every state.
I completely agree that semantics must be agreed upfront and, in your sample, processes are discovered - by the client - on the fly.
But I strongly disagree about the fact that processes are described on the fly! Even if they are not described in WS-BPEL or WS-CDL (or in a Java bean or in a C program or whatever), they are however described upfront (at least by your UML diagram).
And the best proof of this is that a set of states has been agreed in the first place (prior any coding). And what is a set of states if not an - underlying - process?
It is then, IMO, sad - as it weakens it - that your article tries to state that no process description is needed upfront.
Moreover, I think that integrating the process in the client (as the client knows all states - not only information it can send or transitions/functions/services it can call - and hold in some way the process description) leads to a big risk: coupling the two.
Coupling is not always a bad thing (there's a lot of coupling in all we've done over the past decades and not all of it is a problem). But in the case of processes, especially frequently changing processes (as the business has to change and adapt), coupling processes and clients means that changing the processes will lead to changing the client (not easy to manage, especially in an automated environment).
Your article seems to dismiss that issue and tends to suggest that coupling the two is good for every process. IMO, there's no silver bullet for every problem and it's highly risky to state so (or to not explicitly state the contrary).
One last thing in this long comment: why does the barista set the order status by hand?
In all the previous stories in your article, the order doesn't carry its own state, as the state machine does that for it!
Here, IMO, you left the "Representational State Transfer" (aka REST) principles - as there is no state transfer and no transition used - to go back (in time) to data update (aka the U of CRUD...)
To change the order status to preparing, I expected the barista to PUT a "prepare" resource...
This issue may also show that, in your article, you've mixed cross-resource processes (workflows) and resource lifecycles (which are individual state machines).
Workflows are not always driven by states (understand workflows are not necessarily state-machines) but lifecycles are always state-machines.
Here, you've described two workflows describing an order-to-drink process and involving three resources (order, payment and drink)
Each of the resources has its own lifecycle (order goes from placed to prepared, payment from tentative to confirmed, drink from ordered to received), all of which are distinct from the order-to-drink process (including its states, when you make your process state-driven).
The order-to-drink process uses resources and makes them evolve (through transitions of their lifecycles) and its state is a combination of resources' states and order-specific information.
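A small sketch of that separation, assuming only the start and end states named above for each resource (any intermediate states are left out) and a hypothetical `advance` helper. Each resource has its own transition table, and the process state is simply the combination of the three.

```python
# One lifecycle (state machine) per resource; only the states named in the
# comment are modelled here.
LIFECYCLES = {
    "order": {"placed": {"prepared"}},
    "payment": {"tentative": {"confirmed"}},
    "drink": {"ordered": {"received"}},
}


def advance(resource, current, target):
    """Return the new state, or raise if the lifecycle forbids the transition."""
    allowed = LIFECYCLES[resource].get(current, set())
    if target not in allowed:
        raise ValueError(f"{resource}: cannot go from {current!r} to {target!r}")
    return target


# The order-to-drink process state combines the individual resource states.
states = {"order": "placed", "payment": "tentative", "drink": "ordered"}
states["payment"] = advance("payment", states["payment"], "confirmed")
```

The workflow decides which transition to attempt next; the lifecycle tables only decide whether a given transition is legal for one resource.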
I think clearly and cleanly separating all of these is a critical element in REST... And I was disappointed to see it was not in the first and most complete article on how to use REST...
Great Article - Few Comments
The use of PUT on the Order was slightly confusing: you are using it to update a portion of the existing resource and I thought this was a bit of a no-no. Instead I'd have expected a POST (or maybe in the future a PATCH). I realize you have your reasons but I thought it was worth a bit more discussion, especially as an undisciplined client (not sending ETags or If-Modified-Since) could re-POST and get another shot unexpectedly.
The PUT by the barista to update the Order state also seemed fine but I wondered if you'd considered handling this by POSTing a resource (perhaps OrderInProgress) to the Order? Just thinking that going in and updating part of a RESOURCE in this way is quite different to the other examples of RESTful workflow/process solutions that I've seen as they used more of a messaging paradigm (posting messages essentially) which seemed to result in a clearer approach (and allowed us to keep PUT idempotent).
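To illustrate the distinction being discussed, here is a toy in-memory sketch: PUT treated as full replacement versus a partial update that merges into existing state. The `store`, the helper names and the `size` field are assumptions, and how a server interprets a POSTed partial update is up to the server.

```python
def put(store, uri, representation):
    """PUT: the enclosed representation replaces the stored state entirely."""
    store[uri] = dict(representation)


def post_merge(store, uri, partial):
    """Partial update: merge the given fields into whatever is already stored."""
    store[uri] = {**store.get(uri, {}), **partial}


original = {"drink": "latte", "size": "12oz"}

# Sending only the new field with PUT wipes out the rest of the order.
store = {"/order/1234": dict(original)}
put(store, "/order/1234", {"additions": "shot"})
after_put = dict(store["/order/1234"])

# Repeating the same PUT leaves the state unchanged (idempotent).
put(store, "/order/1234", {"additions": "shot"})
after_second_put = dict(store["/order/1234"])

# Merging keeps the existing fields and adds the new one.
store = {"/order/1234": dict(original)}
post_merge(store, "/order/1234", {"additions": "shot"})
after_merge = dict(store["/order/1234"])
```

A client that wants to use PUT for such an update therefore has to send the complete order, additions included.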
The whole OPTIONS thing also seemed a bit contrived; it's a pity it's so early on in an example like this.
All in all though great stuff.
Re: Great Article - Few Comments
I should also say that I recognise that using PUT to update the Order state is a valid (and maybe preferable) approach, but I'm just not sure how well it works in more complex workflows.
Re: Good but not enough
Nevertheless, I found the article very instructive and helpful to position REST in the "architecture toolbox" - a silver bullet is it not. Seems to me it still requires a lot of implicit agreements between the parties (see the comments about PUT vs. PATCH vs. POST) - a lot of this goes in the WS-* world in the explicit "contract" in case of cross-enterprise services (in form of WSDLs and documents...). Still it makes for a "round" and convincing example. I like MEST better :-)
application state vs. resource state
"Don’t confuse application state (the state of the user’s application of computing to a given task) with resource state (the state of the world as exposed by a given service). They are not the same thing." (see roy.gbiv.com/untangled/2008/rest-apis-must-be-h...)
That means, GET and HEAD, of course, cause state transitions of the application state. The application e.g., a web browser, will process e.g., render, the retrieved HTML page.
Furthermore, the "additions" have to be added to the "drink" resource, which is part of the order resource, and not to the order resource itself.
Another issue is, that the processing model is defined by the specific link type descriptions and not by the referenced resource itself e.g., "anchor elements with an href attribute create a hypertext link that, when selected, invokes a retrieval request (GET) on the URI corresponding to the CDATA-encoded href attribute" (see roy.gbiv.com/untangled/2008/rest-apis-must-be-h...).
The processing models of link types are defined in knowledge representation language specifications. A media type specification is a specific knowledge representation language specification. These specifications themselves can be described with a knowledge representation that is based on a generic knowledge representation structure, i.e. RDF Model - a resource description language. For that matter, every knowledge representation of a resource can be described with knowledge representation languages that are based on such a knowledge representation structure, i.e. Semantic Web ontologies.
To map your flat descriptions on RDF Model based knowledge representations, you can do the following things:
- the resource URI of a description e.g., starbucks.example.com/order/1234, is a subject of an RDF triple
- the resource type of a description e.g., starbucks.example.com/order, is the type of this subject
- the value of a "rel" attribute identifies the predicate of an RDF triple e.g., starbucks.example.com/payment; although the example is here a bit inappropriate, because such a relation would rather be named as e.g., "has payment" and hence starbucks.example.com/has_payment, because starbucks.example.com/payment can better identify a resource type (which are currently addressed by a (uninteresting) media type)
- the value of a "uri" attribute identifies the object of an RDF triple e.g., starbucks.example.com/payment/order/1234; this resource need not exist at the moment of usage (as you also pointed out), we just use it to reference (denote) it
- the value of a "type" attribute identifies the (intended) range of a property (predicate) or explicitly the resource type of the object resource
=> addressing the available media type (and also resource type) is not really necessary, because a client would request its preferred media types, i.e. a knowledge representation based on the RDF knowledge representation structure can be serialized by applying different serialization formats e.g., Turtle, JSON or XML. However, the necessary semantics are more included in the knowledge representation structure and knowledge representation languages that are based on such a knowledge representation structure, rather than in the serialization syntax, which is, however, also important in the processing chain of machines.
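The mapping listed above can be sketched with plain tuples standing in for RDF triples (no RDF library involved). The `link_to_triples` helper, the `http://` scheme prefixes and the dictionary shape of the link are assumptions; the "type" attribute is left out because in the article it carries a media type rather than a resource type URI.

```python
# The standard RDF predicate for "is an instance of".
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def link_to_triples(resource_uri, resource_type, link):
    """Turn a described resource and one of its links into (s, p, o) triples."""
    return [
        # The described resource is the subject; its resource type is its class.
        (resource_uri, RDF_TYPE, resource_type),
        # The rel value is the predicate, the uri value is the object.
        (resource_uri, link["rel"], link["uri"]),
    ]


triples = link_to_triples(
    "http://starbucks.example.com/order/1234",
    "http://starbucks.example.com/order",
    {
        "rel": "http://starbucks.example.com/payment",
        "uri": "http://starbucks.example.com/payment/order/1234",
    },
)
for triple in triples:
    print(triple)
```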
To sum up, I can apply a generic resource description language (knowledge representation structure), like RDF Model, to provide the basis for semantically rich descriptions of media type specifications and further knowledge representation languages. This includes also the description of link types and their processing models.
PS: the order description does not contain all the information our barista needs to make our coffee, i.e. if he/she doesn't know how to make coffee, I should provide a further resolvable reference that leads to a description of how to make coffee, which can be requested if needed (knowledge discovery on-the-fly) ;)