RESTful Services with Erlang and Yaws

Posted by Steve Vinoski on Mar 31, 2008

Ever see the famous "Apache vs. Yaws" graphs and wonder whether you, too, should be using Yaws? The graphs show what at first seems to be an unbelievably huge scalability advantage for Yaws, with its ability to scale to over 80000 parallel connections while Apache keels over at only 4000. Reactions to these graphs tend to be quite polarized: either skepticism along the lines of "there's no way these graphs are accurate" or "they must have misconfigured Apache," or the opposite reaction of "Wow, I need to try using Yaws!"

Regardless of whether you believe the Yaws comparison graphs or not, Yaws is a solid web server for serving dynamic content. Claes Wikström wrote Yaws - "Yet Another Web Server" - in Erlang, a programming language created specifically to support long-running, concurrent, highly reliable distributed systems. (To learn more about Erlang, get a copy of the wonderful book Programming Erlang, written by the language's creator, Joe Armstrong.) The flexibility of Yaws combined with several unique features of Erlang makes them a compelling combination for a RESTful web services platform. If you're serving static pages, grab lighttpd or nginx instead, but if you're writing dynamic RESTful web services, then Yaws is definitely worth exploring. In this article, I'll relate some of my experiences with using Yaws and Erlang for web services development.

Yaws Basics

Yaws provides several ways of serving dynamic web content and supporting RESTful web services:

  • Embedding Erlang code within static pages. With this approach, you write Erlang code as a function named out/1 and place it between <erl>...</erl> tags directly within otherwise static content. Files of this nature have a .yaws extension, which tells Yaws to process the file and replace each <erl>...</erl> block with the result of executing the out/1 function it contains. In Erlang terms, out/1 is a function of arity 1, i.e., a function taking one argument. Its argument is expected to be a Yaws arg record, which is a data structure that Yaws uses to communicate details of incoming requests to the code handling them. For example, an arg record supplies information such as the request URI, incoming headers, POST data, etc.

  • Application Modules (appmods). The Yaws appmod facility lets application code take control of URIs. In the approach described above, Erlang code is embedded within static files whose URIs are determined by their pathnames relative to the web server's document root. With an appmod, however, the application controls the meaning of URIs, and such URIs usually do not correspond to any file system artifacts. Appmods are basically Erlang modules that export an out/1 function. Such modules are configured in the Yaws configuration file to correspond to a URI path element. When a request is made containing a path element associated with a registered appmod, Yaws invokes that module's out/1 function, passing it an arg record. The appmod's out/1 function can then examine the rest of the URI to determine the precise resource that is the target of the incoming request, and respond accordingly.

  • Yaws applications (yapps). Unlike appmods, which are usually just single Erlang modules, Yaws yapps are full-fledged applications. Each yapp has its own document root, and each can have its own set of appmods. Specifically, yapps are Erlang/OTP applications. OTP, which stands for "Open Telecom Platform," is a set of well-proven libraries and frameworks that provide Erlang applications with powerful capabilities. OTP encapsulates idioms and approaches for achieving distribution, event handling, and high reliability, among many other things. Erlang/OTP has been proven in the field in a variety of telecom systems, some of which measure their downtime in just a few milliseconds per year.

All three of these approaches, which are detailed at the Yaws website, can be usefully applied within a RESTful web service, depending on the specific nature of the service itself. In my experience, however, yapps and appmods work best, because they give the web application the most control.

RESTful Design

Since we want to develop RESTful web services, let's look at some details of REST, which stands for "Representational State Transfer." Roy T. Fielding coined the term "REST" in his doctoral thesis to describe an architectural style suitable for large-scale distributed systems like the web. HTTP is essentially an implementation of REST. The term "representational state transfer" refers to the fact that RESTful systems operate via the exchange of representations of resource state in requests and replies. For example, the typical web page retrieved with an HTTP GET is an HTML representation of the web resource identified by the URI targeted by the GET.

When developing a RESTful web service, these are the key areas to pay attention to:

  • Resources and resource identifiers

  • Methods supported by each resource

  • Formats of data interchanged between client and server

  • Status codes

  • Applicable HTTP headers for each request and response

Let's consider each of these areas in the context of Yaws and Erlang.

Resource Identifiers

Designing a RESTful web service requires you to think about what resources comprise your service, how to best identify them, and how they relate to one another. RESTful resources are identified by URIs. Normally, related resources have URIs that are themselves related, sharing common path elements. For example, in a web-based bug tracking system, all the bugs for imaginary project "Phoenix" might be found under the URI http://www.example.com/projects/Phoenix/bugs/, whereas the specific bug numbered 12345 might be under http://www.example.com/projects/Phoenix/bugs/12345/. RESTful resources also tend to provide URIs for other resources within their own state representations. This allows clients retrieving a particular resource's state to use the URIs returned within the state representation to navigate to other portions of the overall web application.

In Yaws, the arg record indicates the request URI, and the yaws_api module provides the request_url function to easily retrieve it:

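-include_lib("yaws/include/yaws_api.hrl").  % defines the arg and url records; adjust the path to your installation

out(Arg) ->
     Uri = yaws_api:request_url(Arg),
     Path = string:tokens(Uri#url.path, "/"),
     out(Arg, Path).  % hand the path elements to out/2, introduced below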

Once you have the request URI, I've found that it's handy to tokenize the request path as shown above, by splitting it on its forward slashes. The result is a list of path elements that begin at the URI point where you've tied your appmod. For example, let's assume we've tied an appmod onto the "projects" path element in the URI http://www.example.com/projects/. If a request is made on any URI containing this URI as its prefix, the appmod's out/1 function will wind up with a list of separated path elements indicating the target resource of the request. For example, a request for URI http://www.example.com/projects/Phoenix/bugs/ will result in the following Erlang list of path elements in the Path variable after executing the code shown above:

["projects", "Phoenix", "bugs"]

The utility of splitting the URI is that it makes further dispatching quite simple, thanks to Erlang's pattern matching. For example, we can write a separate function, let's call it out/2, to handle this specific URI by defining the function head like this:

out(Arg, ["projects", Project, "bugs"]) -> 
     % code to handle this URI goes here.

This out/2 function will handle all requests for bug lists for all projects we know about, with the variable Project, which is available to the function body, being set to the specific project name being requested. Supporting additional URIs is equally simple: just add more variants of the out/2 function. You can also feel free to use a name other than out for these functions if you wish, since they are not invoked directly by the Yaws framework.
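The tokenizing and pattern-matching steps described above can be tried outside of Yaws. What follows is only a minimal sketch under stated assumptions: the module name uri_dispatch, the functions split/1 and route/1, and the tuples they return are all made up for illustration, and the request path is passed in as a plain string rather than extracted from an arg record.

```erlang
-module(uri_dispatch).
-export([split/1, route/1]).

%% Split a request path on its forward slashes. string:tokens/2 drops
%% empty elements, so leading and trailing slashes do not matter.
split(Path) ->
    string:tokens(Path, "/").

%% Tokenize the path, then let pattern matching pick the handler.
route(Path) ->
    dispatch(split(Path)).

%% Each clause plays the role of one out/2 variant from the text.
%% The return values are illustrative, not Yaws reply terms.
dispatch(["projects", Project, "bugs"]) ->
    {bug_list, Project};
dispatch(["projects", Project, "bugs", BugId]) ->
    {bug, Project, BugId};
dispatch(_Other) ->
    {not_found, 404}.
```

In this sketch, supporting another URI means adding one more dispatch clause ahead of the catch-all; the existing clauses stay untouched.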

Note that properly defining your resource URIs yields significant benefits. With appmods and yapps, having a rich URI space is quite simple because of the simplicity of tying different appmods onto different URI path elements, and the ease of dispatching. Erlang pattern matching makes handling requests for different URIs trivial. Contrast this with the poor style traditionally used for defining non-RESTful services, where all services are given the same URI. This URI typically points to a script that uses information provided within the request body or through URI query strings to determine where to actually dispatch the request. The URIs that result from the Erlang/Yaws dispatching technique shown above are far cleaner than the overloaded URIs with seemingly endless parameter lists that result from the traditional approach.

Resource Methods

The methods that web clients can invoke on a web resource are defined by HTTP's verbs, primarily GET, PUT, POST, and DELETE. However, individual resources tend to support only a subset of those verbs. When you design your web service, you need to determine what methods each of your resources supports, bearing in mind the semantics expected for each HTTP verb as defined in RFC 2616.

In Yaws, the request method is found in the http_request record, accessible via the arg record:

Method = (Arg#arg.req)#http_request.method

This returns an Erlang atom representing the request method, which can then be added into our pattern-matching dispatching approach. We can add a new parameter to our out function, turning it into out/3, to include the request method:

out(Arg, 'GET', ["projects", Project, "bugs"]) ->
     % code to handle GET for this URI goes here.

This variant of the out function handles only GET requests for bug lists for each of our projects. Another variant might handle a POST, presumably to add a new bug to the list. To allow only GET and POST but disallow all other verbs, you'd simply write a catch-all function for the same URI:

out(Arg, 'GET', ["projects", Project, "bugs"]) ->
     get_bugs(Arg, Project);    % hypothetical function: code to handle GET for this URI
out(Arg, 'POST', ["projects", Project, "bugs"]) ->
     add_bug(Arg, Project);     % hypothetical function: code to handle POST for this URI
out(Arg, _Method, ["projects", _Project, "bugs"]) ->
     [{status, 405}].

Here, methods other than GET and POST will match the third variant, which returns HTTP status 405, which means "method not allowed." The leading underscores on the Method and Project variables quiet compiler warnings about them being unused.

Just as with URI dispatching, Erlang pattern matching makes dispatching to separate functions to handle separate HTTP verbs trivial.
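To see the combined method and path dispatch run on its own, here is a small sketch that assumes nothing from Yaws. The module name method_dispatch and the function handle/2 are hypothetical, the Arg parameter is left out so the code can be called directly, and the reply lists merely imitate the shape of Yaws replies; the 201 status for POST is my assumption about what creating a bug would return.

```erlang
-module(method_dispatch).
-export([handle/2]).

%% handle(Method, PathElements) mirrors the out/3 variants in the text,
%% minus the Arg parameter, so it can be exercised without Yaws.
handle('GET', ["projects", Project, "bugs"]) ->
    [{status, 200}, {html, "bugs for " ++ Project}];
handle('POST', ["projects", _Project, "bugs"]) ->
    %% Assumed: a successful POST creates a bug, hence 201.
    [{status, 201}];
handle(_Method, ["projects", _Project, "bugs"]) ->
    %% Known URI, unsupported verb: method not allowed.
    [{status, 405}];
handle(_Method, _Path) ->
    %% Unknown URI, whatever the verb.
    [{status, 404}].
```

Clause order matters here: the catch-all for the bug-list URI must come after the GET and POST clauses, and the final clause catches every path that no earlier clause recognizes.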

Representation Formats

When designing a RESTful web service, you need to consider what representation(s) each resource supports. Web service resources often support XML or JSON representations, for example. Erlang supplies the xmerl library for creating and reading XML, and Yaws provides a straightforward JSON module. Both work quite well.

You can access an incoming request's Accept header to determine what representation(s) the client prefers. This header is available in a headers record, also available through the arg record:

Accept_hdr = (Arg#arg.headers)#headers.accept

If your resource supports multiple representations, you can check this header to see if the client indicated which representation it prefers. If the client did not send an Accept header, the Accept_hdr variable shown above will be set to the atom undefined, and your resource can supply whatever representation it deems best. Otherwise, your service can parse the Accept_hdr value to determine which representation to send. If the client requests representations that your resource cannot fulfill, it can return HTTP status 406, which means "not acceptable," along with a body indicating what formats are acceptable:

case Accept_hdr of
     undefined ->
         default_representation(Arg);   % hypothetical helper: return default representation
     "application/xml" ->
         xml_representation(Arg);       % hypothetical helper: return XML representation
     "application/json" ->
         json_representation(Arg);      % hypothetical helper: return JSON representation
     _Other ->
         Msg = "Accept: application/xml, application/json",
         Error = "Error 406",
         [{status, 406},
          {header, {content_type, "text/html"}},
          {ehtml,
           [{head, [], [{title, [], Error}]},
            {body, [],
             [{h1, [], Error},
              {p, [], Msg}]}]}]
end.

The Erlang code above checks the Accept_hdr value to see if it's either application/xml or application/json. If it's either of those, the resource returns a suitable representation, but if not, the code returns an HTTP status 406 along with an HTML document indicating the representations the resource is willing to provide.

Another way of handling the desired representation is - you guessed it - adding it as another parameter to our out handler function. This way, Erlang pattern matching ensures that our request gets dispatched to the right handler for the requested URI/method/representation combination. This avoids cluttering handlers with case statements like the one above.

By the way, this example also shows the Yaws ehtml type, which is a way of representing HTML as a series of Erlang terms. I find ehtml quite intuitive to write because it directly follows the structure of HTML, but is far more compact and eliminates the tedium and errors of matching tags that you face when writing literal HTML.
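Real Accept headers often list several media types with parameters, so an exact string comparison will miss many of them. The sketch below shows one possible way to reduce the header to a single atom that could then serve as the extra dispatch parameter. It is a simplification under stated assumptions: the module accept_pick and its functions are hypothetical, q-values are ignored so the client's listing order decides, and XML is arbitrarily taken as the default representation.

```erlang
-module(accept_pick).
-export([choose/1]).

%% Media types this hypothetical resource can produce.
supported() ->
    [{"application/xml", xml}, {"application/json", json}].

%% choose(AcceptHeader) -> xml | json | not_acceptable
%% No Accept header: fall back to the (assumed) default, XML.
choose(undefined) ->
    xml;
choose(Header) ->
    %% "a/b;q=0.8, c/d" becomes ["a/b", "c/d"]; q-values are ignored.
    Types = [media_type(T) || T <- string:tokens(Header, ",")],
    first_supported(Types).

%% Drop any ";param" suffix, surrounding blanks, and case differences.
media_type(Token) ->
    case string:tokens(Token, ";") of
        [Type | _Params] -> string:lowercase(string:trim(Type));
        [] -> ""
    end.

%% Walk the client's list in order and stop at the first type we support.
first_supported([]) ->
    not_acceptable;
first_supported(["*/*" | _]) ->
    xml;
first_supported([Type | Rest]) ->
    case lists:keyfind(Type, 1, supported()) of
        {Type, Repr} -> Repr;
        false -> first_supported(Rest)
    end.
```

A handler could then match on xml, json, or not_acceptable in its function head, returning status 406 for the last of these.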

Status Codes

RESTful web services must return proper HTTP status codes, as indicated by RFC 2616. Returning the right status is easy with Yaws: simply include a status tuple in the result of your out/1 function. See the case statement above for an example of returning the appropriate status code. If your code does not explicitly set a status, Yaws will set a status 200 for you, indicating success.

HTTP Headers

Retrieving request headers and setting reply headers with Yaws is straightforward, too. We've already seen an example of retrieving the Accept header from the headers record; other request headers can be retrieved in the same fashion. Setting reply headers simply requires putting a header tuple in the outgoing reply, like this:

{header, {content_type, "text/html"}}

This sets the Content-Type header to "text/html", for example. Similarly, in our previous example where we returned status 405 to indicate a "method not allowed" error, we should have also included the following header:

{header, {"Allow", "GET, POST"}} 
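To keep the Allow header in step with the verbs a resource really supports, the reply can be built from a list of allowed methods. This is only a sketch: the module allow_reply and the function method_not_allowed/1 are hypothetical names, and the returned list simply combines the status and header tuples shown above.

```erlang
-module(allow_reply).
-export([method_not_allowed/1]).

%% Build a 405 reply whose Allow header lists the permitted methods,
%% given as atoms such as 'GET' and 'POST'.
method_not_allowed(Allowed) ->
    Names = [atom_to_list(Method) || Method <- Allowed],
    [{status, 405},
     {header, {"Allow", string:join(Names, ", ")}}].
```

Calling allow_reply:method_not_allowed(['GET', 'POST']) yields the 405 status together with the Allow header from the text.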

Appmods or Yapps?

So far we've seen how Yaws and Erlang make it almost trivial to handle many of the most important concerns for RESTful web services. One remaining question is whether to choose appmods or yapps, and the answer depends on what your services do. If you're writing web services that have to interact with other back-end services, then yapps are probably your best bet. Since they're full-blown Erlang/OTP applications, they typically have initialization and termination functions where connections to the back end can be created and shut down. If your yapp is an Erlang/OTP gen_server, for example, your init/1 function can establish state that the gen_server framework then passes to you, and lets you modify, every time it calls you back to handle an incoming call to your server. Using yapps also means you can still use appmods, so it's not really a matter of choosing one over the other. Finally, yapps can participate in Erlang/OTP supervision trees, where supervisor processes monitor your yapps and restart them if they fail. Supervision trees play a significant role in the reliability of long-running Erlang systems.
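The gen_server pattern mentioned above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the module name backend_server, the lookup/1 call, and the state map are hypothetical, and no real back-end connection is opened. The sketch uses only standard OTP gen_server callbacks and says nothing about how a yapp is registered with Yaws.

```erlang
-module(backend_server).
-behaviour(gen_server).

-export([start_link/1, lookup/1, stop/0]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2]).

%% Start the server under a local registered name.
start_link(BackendName) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, BackendName, []).

%% Client-side wrappers around the gen_server calls.
lookup(Key) ->
    gen_server:call(?MODULE, {lookup, Key}).

stop() ->
    gen_server:stop(?MODULE).

%% init/1 establishes the state handed back on every later callback.
%% A real service would open its back-end connection here.
init(BackendName) ->
    {ok, #{backend => BackendName, requests => 0}}.

%% Each call receives the current state and returns the updated state.
handle_call({lookup, Key}, _From, State = #{requests := N}) ->
    Reply = {ok, Key, maps:get(backend, State)},
    {reply, Reply, State#{requests := N + 1}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% A real service would close its back-end connection here.
terminate(_Reason, _State) ->
    ok.
```

An appmod's handler could call backend_server:lookup/1 while serving a request, and a supervisor could restart the server if it crashes.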

This article is geared toward RESTful web services based on back ends other than relational databases. If you're writing a traditional web server on top of a relational database, you should check out Erlyweb, a framework for such web services, which is also based on Yaws and Erlang.

Conclusion

A significant aspect of writing RESTful web services is choosing the right programming language. We've seen numerous service frameworks in a variety of programming languages come and go over the years, and most were failures simply because they were a poor match to the problem. Yaws and Erlang do not specifically provide a RESTful web services framework, yet the facilities they provide are a better match for RESTful development than many other language frameworks that were built specifically for that purpose.

While an article of this nature necessarily can't dive deeply into the details of Yaws, Erlang, and RESTful web services, it has hopefully touched on the important topics and provided, through its minimal code examples, an idea of how to address them. In my experience, building RESTful web applications with Yaws and Erlang is very straightforward, and the resulting code is easy to read, easy to maintain, and easy to extend.

About the Author

Steve Vinoski is a member of the technical staff at Verivue in Westford, MA, USA. He is a senior member of the IEEE and a member of the ACM. Over the past 20 years he has written or co-written over 80 articles, columns, and a book on distributed computing and integration, and for the past six years he's written the "Toward Integration" column for IEEE Internet Computing magazine. You can reach Steve at vinoski@ieee.org, and visit his blog at http://steve.vinoski.net/blog/.

Repeatability by Erik Onnen

Hey Steve,

Thanks for the article. In my own testing, I've seen Yaws fall over and die under a load of ~20 concurrent requests. This really begs the question, what can we do to repeat Joe's tests? The documentation and code of "the test" are sparse, and they aren't public so it's impossible for anyone to reproduce the same numbers. Given that, I can only assume the Yaws numbers don't take into account dynamic content, something that is important to most developers. Any thoughts there?

-erik

Re: Repeatability by Erik Onnen

Also, I'm curious about your mention of Erlyweb. How many people are maintaining Erlyweb? For those of us who are interested in the community there, how many people actively contribute to Erlyweb?

Apples And Oranges by Paul Tiseo

While I am learning Erlang because it seems to be an emergent language I should have in my toolset as a developer, I fear the "famous graph" at the start of the article, used as a teaser, is perhaps misleading.

First, are Apache and Yaws comparable? How much of Apache's functionality set is found in Yaws? Second, what was the test configuration for that graph? Also, no mention is made of the follow-up benchmarking.

I mean, if a writer starts an article with: "Whether or not you should use X, here's how you'd use X...", shouldn't the "whether or not" part be well-established rather than open to debate?

Missing hypermedia by Guilherme Silveira

Hi Steve,

Although the example almost reaches REST in a really simple way, it misses the HATEOAS point by leaving the hypermedia aspect behind. It is much easier to develop than SOAP-based solutions, but without hypermedia support, both clients and servers are coupled in a way that REST systems should not be.

Regards

InfoQ.com and all content copyright © 2006-2014 C4Media Inc.