Designing and Implementing Hypermedia APIs

Feb 05, 2013 18 min read

This article (the first in a four-part series) talks briefly about the concept of using hypermedia as an application programming interface (API) and how to design a hypermedia type to use as a basis for your API. It also covers the work of mapping your particular problem domain to hypermedia messages and documenting your design.

In upcoming articles, the details of implementing hypermedia servers and clients will be reviewed along with techniques for evolving the API safely over time.

Why Hypermedia?

Over the last several years the idea of creating and publishing a Web API has become quite common. The explosion of mobile devices of various sizes, the desire to provide third-party developers access to existing services, etc. all contribute to this increased focus on APIs for the Web.

Most programming interfaces for the Web follow the same general design patterns as local APIs: a set of functions that accept parameters and return objects and/or collections. However, a growing number of interfaces do not use this RPC-style design pattern and instead more closely mirror the way Web interactions were originally designed: via links and forms. This approach to programming interface design is commonly called a Hypermedia API.

Marshalling Objects vs. Representing State

One of the noticeable differences between Hypermedia APIs and RPC APIs is the use of a shared message model for every interaction. Hypermedia APIs usually use a registered media type (e.g. HTML, Atom, HAL, Collection+JSON, etc.) as the message model for all request and response payloads regardless of the object or collection being passed.

A key reason for using this approach is that Hypermedia APIs are focused on representing the state of the application rather than the objects or functions that affect that state. While it may make sense to express the state of an application as an object (or object graph) when working in local source code, this approach has drawbacks on the World-Wide Web. There are times when third-party developers are not using the same programming language as the API server. It is also likely that these third-party apps are built remotely (in time, space, or both) from the initial implementation. In these cases, often the only “shared understanding” between the client and server developers is the response messages themselves, so a rich, consistent message design can be most effective.

Hypermedia Controls

Another key differentiator between Hypermedia-style and RPC-style approaches is the use of a message design that contains more than just data, but also includes control information that represents the possible “next steps” at the time of the response. RPC-style APIs usually publish a static list of all possible requests within the scope of the application. Client developers are expected to code their apps so that the apps know which APIs to call and to call them in the proper order, depending on the state of the client application.

Hypermedia-style APIs usually publish only a small handful of the possible “starting” URIs (sometimes just one) and then include, within the responses themselves, additional instructions on which API calls are valid for the current state of the client application. This reduces the need for client app developers to know the order in which APIs must be called and allows servers to tailor responses based on context information such as user identity, performance considerations on the network, etc.

The Class-Schedule Problem Domain

The first step in implementing any API is to sufficiently document the problem domain. For this article series a relatively simple problem domain will be used: scheduling classes. This will offer the chance to manage student, teacher, and course records along with the ability to create a class schedule useful for both teachers and students. This problem domain will provide opportunities to read, write, and query stored data and create relationships between various records.

Instead of starting with a list of “objects” or “record types”, we’ll first document the domain in terms of “states” and “transitions.” This gives us a chance to think about how clients and servers can share information about the domain without adding any programming requirements (functional, object-oriented, etc.) up front.

We’ll also document the problem from the client (or user) point of view instead of the view of the server. This approach also lends itself well to a task-oriented model that is relatively easy to map into a media type design when needed.

NOTE:
A document showing the States, Transitions, and Data elements for this problem domain (complete with parameter lists and details on implementing the transitions using HTTP methods) can be found in the project’s repository on GitHub (http://github.com/APIAcademy/Class-Scheduling/).

Domain States

In this simple domain, there are only a handful of “states” of which the client needs to be aware:

Home
The initial entry point of the service.
Student
Either a list of students in the system or a single student record.
Teacher
Either a list of teachers in the system or a single teacher record.
Course
The list of available courses or a single course record.
Schedule
The list of scheduled classes or a single schedule record.

NOTE:
Technically, you can expand this list of states to contain one state for each list and one state for each record. For now, it is helpful to view each single record as a “list of one” in order to simplify the modeling.

The list above shows all the ways in which the client can “view the state” of the application at any point in time. However, the list above does not indicate how clients can request a view or move between states. That is the work of “transitions.”

State Transitions

State transitions are ways in which clients can make a request of the server and/or change the state of the application. In RPC-style APIs these transitions are documented as static URLs with associated request parameters or content bodies. Client code is then written to “know” all the URLs and related request details.

In Hypermedia-style APIs the transition instructions are provided within the response in a standardized format. In HTML these transitions are expressed as links (<a> tags) and forms (<form> and <input> tags). Other media types (Atom, HAL, Siren, Collection+JSON) have similar controls defined. While each of these differs in the details of how transitions are expressed, all the transitions have the same basic features. They indicate a URL for the transition target, usually have some type of identifier (name, id, rel), and may have one or more variables to hold values to send to the server when activating the transition.

Below is the list of transitions for the Class-Schedule problem domain:

Add
Add a new record to the system. Applies to the Course, Student, Schedule, and Teacher states.
Assign
Assign a student to a schedule. Applies to the Student and Schedule states.
Filter
Retrieve a filtered list of resources. Applies to the Course, Student, and Teacher states.
List
Retrieve an unfiltered list of resources. Applies to the Course, Student, and Teacher states.
Read
Retrieve an existing resource. Applies to the Course, Home, Schedule, Student, and Teacher states.
Remove
Remove an existing resource from the system. Applies to the Course, Schedule, Student, and Teacher states.
Unassign
Remove a student from an existing scheduled course. Applies to the Schedule and Student states.
Update
Modify the state of an existing resource. Applies to the Course, Student, and Teacher states.

NOTE:
Some of the above transitions will also have parameters associated. For brevity, they are left out of this list but will be reviewed in subsequent sections below.

The list of state transitions indicates cases where a client can query the server to view an existing state (Filter, List, Read) and where a client can send a request to the server in order to alter the state of the application (Add, Assign, Remove, Unassign, Update). You can also see that these transitions do not apply to all possible states (from the first list).
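The transition list above can also be captured as a small lookup table. The following is only a sketch in Python: the table contents are copied from the list, while the names TRANSITIONS, QUERIES, transitions_for, and alters_state are invented here for illustration and are not part of the project.

```python
# Transition -> states it applies to, copied from the list above.
TRANSITIONS = {
    "Add": {"Course", "Student", "Schedule", "Teacher"},
    "Assign": {"Student", "Schedule"},
    "Filter": {"Course", "Student", "Teacher"},
    "List": {"Course", "Student", "Teacher"},
    "Read": {"Course", "Home", "Schedule", "Student", "Teacher"},
    "Remove": {"Course", "Schedule", "Student", "Teacher"},
    "Unassign": {"Schedule", "Student"},
    "Update": {"Course", "Student", "Teacher"},
}

# Transitions that only view state; all others alter it.
QUERIES = {"Filter", "List", "Read"}


def transitions_for(state):
    """Return the sorted transition names that apply to a state."""
    return sorted(name for name, states in TRANSITIONS.items() if state in states)


def alters_state(transition):
    """True when the transition changes the state of the application."""
    return transition not in QUERIES


print(transitions_for("Home"))
print(transitions_for("Schedule"))
```

A table like this is a documentation aid, not something a hypermedia client should hard-code; at runtime the client learns which transitions are available from the responses themselves.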

Data Elements

All APIs need to identify the list of data elements used in the problem domain. These data elements can be returned in responses and/or used as parameters in making requests to the server. For this simple problem domain there are only a handful of data elements needed.

courseCapacity
Maximum number of students allowed to sign up for the course.
courseDescription
Description of the course.
courseId
System-generated unique ID for the course.
courseName
Name of the course.
scheduleId
System-generated unique ID for the schedule.
scheduleSlot
Time and days of the week for the scheduled course.
studentId
System-generated unique ID for the student.
studentName
Name of the registered student.
studentStanding
University standing of the student (freshman, sophomore, junior, senior).
teacherId
System-generated unique ID for the teacher.
teacherName
Name of the teacher.

In a fully-functional production system there would be quite a few additional data elements. However, for this article series this will present enough variety to illustrate the basic points of designing and implementing Hypermedia-style APIs.

Defining a Domain, Not an Implementation

The above sections only documented the basics of a problem domain: the states, transitions, and data elements. This outlines what’s possible, but not the implementation details. In most cases, RPC-style APIs document a single instance - a solution - for a problem domain. Usually, hypermedia-style APIs document the general problem - a domain - for creating solutions.

Hypermedia APIs define the domain clients and servers can work within, not a static implementation instance. This allows clients and servers to “share understanding” about the actual problem space itself, not just a single implementation within that space. This is one of the characteristics of Hypermedia-style design that makes it possible for clients and servers to establish new states and transitions without the need to break existing implementations. In some cases these additional features can be implemented successfully without requiring changes to running code.

NOTE: The ability to evolve a Hypermedia API will be covered in a later article in this series.

With the problem domain sufficiently defined, it’s time to move on to designing a message format - a media type - with which to communicate the states and transitions of this domain.

Designing a Hypermedia Type

The process of designing a hypermedia type involves just a few basic steps:

Select a data format (XML, JSON, CSV, etc.)
Define the Message Structure (required and optional elements, properties, etc.)
Identify the Protocol Semantics (HTTP, WebSockets, XMPP, etc.)
Apply the Domain Semantics (Course, Schedules, Students, etc.)

Along the way, like any design process, you need to make judgements on how specific or general your design will be. Usually, the more specific your design, the easier it is to implement for clients. However, this specificity comes at a cost; it is difficult to evolve highly specific designs without breaking existing implementations. The more general a design, the more likely it is to support evolvability. But, as you might guess, more general designs can be more challenging to implement.

NOTE:
It is usually best to select an existing IANA-registered hypermedia type (Atom, HAL, Siren, Collection+JSON, etc.) instead of creating your own. Using a registered design means you’re likely to find existing support libraries & tools and can rely on the experience of others when doing your instance implementations. However, for this article series, we’ll go through the process of creating a hypermedia type in order to learn more about what makes up a hypermedia design.

Select a Data Format

The first step in designing a Hypermedia Type for your API is to select a data format. While there are a handful from which to choose, the most common formats are XML and JSON. For the most part this is a choice driven by available tooling, community preferences, and (in some cases) the skills of the designer. The key difference between XML and JSON is that XML has a richer representation model (XML attributes, the ability to express collections easily) than JSON. However, JSON has “built-in” tooling for Web browsers and other JavaScript-based implementations like Node.js.

For this series we’ll use XML to start with. In another article we’ll also look at supporting multiple formats for the same API using HTTP Content Negotiation.

Define the Message Structure

After selecting the format, you need to define the basic message structure. This establishes the layout of a valid message, identifies any required elements, properties, attributes, etc. and notes any hierarchies or other rules for valid messages.

For our design, one simple approach would be as follows:

<root>
  <course />
  <schedule />
  <student />
  <teacher />
</root>

This identifies each of the major documented “states” and is easy for both humans and machines to follow. However, there is a possible drawback here. What happens if we want to add future states (e.g. Application, Graduation, etc.)? Adding new elements might “break” the message validation and/or result in clients ignoring them completely.

One way to avoid this possible problem is to use a more general design that allows for safely adding new states without breaking any validators and (depending on the client implementation) would even allow new “unknown” states to be parsed, processed, and rendered in the future.

Here’s a more general message structure:

<root>
  <list name="course" />
  <list name="schedule" />
  <list name="student" />
  <list name="teacher" />
</root>

The above design will allow client apps to code for the existence and processing of <list> elements and could easily allow for the safe rendering of new elements (such as <list name="application">, etc.) in the future.
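To illustrate the point, here is a minimal Python sketch of a client that processes every <list> element by its name attribute instead of looking for fixed element names. The sample message, including the extra "application" list, and the function name list_names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Sample message in the general structure; "application" stands in for
# a state added after the client was written.
MESSAGE = """
<root>
  <list name="course" />
  <list name="schedule" />
  <list name="student" />
  <list name="teacher" />
  <list name="application" />
</root>
"""


def list_names(xml_text):
    """Return the name of every <list> element, known or not."""
    root = ET.fromstring(xml_text)
    return [el.get("name") for el in root.findall("list")]


print(list_names(MESSAGE))
```

Because the client only depends on the <list> element, the new "application" list is picked up without any change to the parsing code.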

We can follow the same principle of generality and define additional structural elements of the message in order to indicate which parts of the message are lists, which are individual items, which parts are for display and which are for transitions.

Below is a more complete example of a general design that will work well for our problem domain (including notes on required attributes for some elements):

<root>
  <actions name="links">
    <!-- REQUIRED: href, action -->
    <template href="..." name="..." action="..." prompt="...">
      <data name="..." value="..." prompt="..." />
    </template>
    <!-- REQUIRED: href, action -->
    <link href="..." name="..." action="..." prompt="..." />
  </actions>
  <list name="...">
    <actions>
      <!-- REQUIRED: href, action -->
      <template href="..." name="..." action="..." prompt="...">
        <data name="..." value="..." prompt="..." />
      </template>
      <!-- REQUIRED: href, action -->
      <link href="..." name="..." action="..." prompt="..." />
    </actions>
    <item name="...">
      <display>
        <!-- REQUIRED: value -->
        <data name="..." value="..." prompt="..." />
      </display>
      <actions>
        <!-- REQUIRED: href, action -->
        <template href="..." name="..." action="..." prompt="...">
          <!-- REQUIRED: value -->
          <data name="..." value="..." prompt="..." />
        </template>
        <!-- REQUIRED: href, action -->
        <link href="..." name="..." action="..." prompt="..." />
      </actions>
    </item>
  </list>
</root>

NOTE:
A complete specification of this simple design can be found at the GitHub repository for this article series (http://github.com/APIAcademy/Class-Scheduling).
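The REQUIRED notes in the structure above lend themselves to a simple automated check. The Python sketch below is one possible reading of those notes, not the project's validator: it assumes value is required on every <data> element (the comments mark it only on some), and the names REQUIRED and missing_attributes, as well as both sample messages, are made up for this example.

```python
import xml.etree.ElementTree as ET

# Required attributes per element, as read from the REQUIRED comments.
# Treating "value" as required on every <data> element is an assumption.
REQUIRED = {
    "link": ("href", "action"),
    "template": ("href", "action"),
    "data": ("value",),
}


def missing_attributes(xml_text):
    """Return (tag, attribute) pairs for each required attribute that is absent."""
    problems = []
    for el in ET.fromstring(xml_text).iter():
        for attr in REQUIRED.get(el.tag, ()):
            if attr not in el.attrib:
                problems.append((el.tag, attr))
    return problems


GOOD = '<root><actions><link href="/" name="home" action="read" /></actions></root>'
BAD = (
    '<root><actions><link name="home" />'
    '<template href="/t"><data name="x" /></template>'
    "</actions></root>"
)

print(missing_attributes(GOOD))
print(missing_attributes(BAD))
```

Unknown elements are skipped rather than rejected, which keeps the check compatible with the goal of adding new elements later without breaking validators.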

Identify the Protocol Semantics

Once you have selected a format and defined the message structure, you need to define the way transitions will be expressed within the responses. In HTML this is done using links (<a> tags) and forms (<form> and <input> tags). Since XML will be the format for this design, the transitions will look very similar to the ones that appear in HTML.

Static Read-Only Transitions

Transitions that are designed to allow immutable, read-only operations can be expressed as simple links:

<link name="..." action="..." prompt="..." href="..." />

The example above includes a “name” identifier (to help client code recognize the link), a prompt (to help humans recognize the link), an “action” value to indicate what type of transition this is, and an href to hold the actual URL to activate the transition.

Variable Transitions

Transitions that allow for clients (either the code or the human) to insert varying values can be expressed as “forms” - links with additional parameters:

<template name="..." action="..." href="..." prompt="...">
    <data name="..." prompt="..." value="..." />
</template>

The example above includes a template element with a machine-friendly name, an action property (to identify what type of transition this is), and the href to hold the transition URL. One or more data elements may appear, too.

NOTE:
It should be noted that nothing here has been said about which protocol is in use (HTTP, WebSockets, etc.). These transitions will work well for any OSI Layer 7 protocol. In the case of HTTP, the “add” transition could be mapped to the POST method. When using WebSockets, the message Send operation can include the action and/or name values to help recipients route the message accordingly.
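As a rough illustration of binding these transitions to HTTP, the Python sketch below turns an action value plus a set of data values into a request object without sending it. Only the add-to-POST mapping comes from the note above; every other row of the METHODS table, the form-encoded body, the example.org URL, and the name build_request are assumptions made for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Only add -> POST is suggested by the article; the other rows are assumptions.
METHODS = {
    "read": "GET",
    "list": "GET",
    "filter": "GET",
    "add": "POST",
    "update": "PUT",
    "remove": "DELETE",
}


def build_request(href, action, data=None):
    """Build (but do not send) an HTTP request for a link or template."""
    method = METHODS[action]
    fields = urlencode(data or {})
    if method == "GET":
        # Read-only transitions carry their variables in the query string.
        url = f"{href}?{fields}" if fields else href
        return Request(url, method=method)
    # Sending a form-encoded body is an assumption, not part of the design.
    body = fields.encode("utf-8") if fields else None
    headers = {"Content-Type": "application/x-www-form-urlencoded"} if body else {}
    return Request(href, data=body, headers=headers, method=method)


req = build_request("http://example.org/teachers", "add", {"teacherName": "Mary"})
print(req.get_method(), req.full_url, req.data)
```

Keeping the mapping in one table means the same message design could be bound to a different protocol by swapping out this single function.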

Apply the Domain Semantics

Once the format, structure, and protocol details are complete, the final step is to make sure the design supports applying domain specifics to the messages. In our case, we want to be able to apply the state, transition, and data element identifiers and details to this message design.

Since the design includes attributes for name and action, it’s relatively easy to apply problem domain details to this design. Below is an example teacher resource represented in our new message design:

<root>
   <actions name="links">
     <link href="..." name="home" action="read" prompt="Home" />
     <link href="..." name="teacher" action="list" prompt="List" />
     <link href="..." name="teacher" action="filter" prompt="Filter" />
   </actions>
   <list name="teachers">
     <actions>
       <template href="..." name="teacher"
          action="add" prompt="Add Teacher">
          <data name="teacherName" prompt="Name" value="..." />
       </template>
     </actions>
     <item name="teacher">
       <actions>
         <link href="..." name="teacher"
           action="read" prompt="Refresh" />
         <link href="..." name="teacher"
           action="remove" prompt="Remove" />
         <template href="..." name="teacher" action="update">
           <data name="teacherName" prompt="Name" value="..." />
         </template>
       </actions>
       <display>
         <data name="teacherId" prompt="ID" value="..." />
         <data name="teacherName" prompt="Name" value="..." />
       </display>
     </item>
   </list>
</root>

You’ll notice in the above example that the state identifiers (home, teacher) as well as the data elements (teacherId, teacherName) appear as values for the name attributes. The transition identifiers (read, list, filter, add, remove, update) appear as values for the action attributes.
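Because the domain identifiers live in the name and action attributes, a client can locate a transition by those two values rather than by a hard-coded URL. The Python sketch below shows one way to do that against a trimmed copy of the teacher message. The href and value contents are made up, the human-facing labels on links are omitted, and find_transition is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the teacher message; href and value contents are made up
# and the human-facing labels on links are left out.
MESSAGE = """
<root>
  <actions name="links">
    <link href="/" name="home" action="read" />
    <link href="/teachers" name="teacher" action="list" />
  </actions>
  <list name="teachers">
    <actions>
      <template href="/teachers" name="teacher" action="add" prompt="Add Teacher">
        <data name="teacherName" prompt="Name" value="" />
      </template>
    </actions>
    <item name="teacher">
      <actions>
        <link href="/teachers/1" name="teacher" action="remove" />
      </actions>
      <display>
        <data name="teacherId" prompt="ID" value="1" />
        <data name="teacherName" prompt="Name" value="Mary" />
      </display>
    </item>
  </list>
</root>
"""


def find_transition(scope, name, action):
    """Return the first link or template under scope with this name and action."""
    for el in scope.iter():
        if el.tag in ("link", "template") and el.get("name") == name and el.get("action") == action:
            return el
    return None


root = ET.fromstring(MESSAGE)
add = find_transition(root, "teacher", "add")
fields = [d.get("name") for d in add.findall("data")]
print(add.get("href"), fields)
```

If the server stops offering a transition in a given state, find_transition simply returns None and the client can hide or skip that option.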

NOTE:
A more complete set of representations for each of the states in this problem domain is available in the GitHub repository associated with this article (http://github.com/APIAcademy/Class-Scheduling).

Registering Your Design

It is a good idea to register your media type design with the IANA. There is no cost, there is a single online form to fill out, and approval of “personal” or “vendor” designs is usually completed within a few weeks. This need not hold up your implementation work and it has the potential to expose your designs to a wider audience who may find it handy and begin using it, too.

NOTE:
Part of the registration process involves establishing an identifier. For this project we’ll use the following: application/vnd.apiacademy-scheduling+xml
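Once the identifier is fixed, clients can ask for it by name and confirm that a response actually uses it. The Python sketch below assumes HTTP as the protocol; the function names and the example.org URL are invented for illustration, and no request is sent.

```python
from urllib.request import Request

MEDIA_TYPE = "application/vnd.apiacademy-scheduling+xml"


def scheduling_request(url):
    """Build a GET request that asks for the scheduling media type."""
    return Request(url, headers={"Accept": MEDIA_TYPE})


def is_scheduling_response(content_type):
    """True when a Content-Type header value names the scheduling media type."""
    # Parameters such as "; charset=utf-8" are ignored, and the
    # comparison is case-insensitive.
    return content_type.split(";")[0].strip().lower() == MEDIA_TYPE


req = scheduling_request("http://example.org/")
print(req.get_header("Accept"))
print(is_scheduling_response(MEDIA_TYPE + "; charset=utf-8"))
```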

With both the media type design and the definition of the problem domain completed, the basic work is accomplished. However, there is one more step before this design can be considered complete - documenting the design.

Documenting Your Design

The process of documenting your design involves writing up both your media type design and your problem domain details. It is the combination of “how” information is passed between client and server (the media type) and “what” information is shared (the Class Scheduling problem domain).

Documenting the Media Type

Usually, the media type design is already documented when it is registered with the IANA. However, since we created our own design, we’ll need to write up the document ourselves and make it available to developers. If this is a private design, it can be posted at some link within the organization. Public designs can be posted online and the address shared as needed.

The details of documenting media types are beyond the scope of this series but, in general, it is wise to include the following sections in your media type documentation:

General Description
IANA Registration Status (unregistered, pending, approved)
Known Implementations
Format (Elements, Attributes, etc.)
Examples (small, complete request and response messages)
Tutorials (larger samples of how the media type can be used in an implementation)

The last two items are optional, but can be very helpful for anyone working to build a server or client implementation that relies on your design.

NOTE:
A sample Media Type document for this design is available at the GitHub repository for this article (http://github.com/APIAcademy/Class-Scheduling).

Documenting the Problem Domain

Documenting the problem domain is not very difficult. The challenge is usually to limit the amount of implementation detail within the problem domain documentation. The domain documentation consists of the same three items discussed at the start of this article:

State Identifiers
Transitions
Data Elements

These items outline the “states” in which the application will reside, the various actions clients can take to query and/or modify that state (the transitions), and the data elements shared between client and server when expressing or altering the application state. While it is often helpful to include protocol-specific details in the transitions (which HTTP method to use, whether to send the arguments as a body or in the URL, etc.), be aware that some clients or servers might want to use WebSockets or some other protocol when implementing a solution for the same domain. Be sure to document the domain in a way that will allow for this at some future point.

NOTE:
A sample domain document for the Class Scheduling domain is available at the GitHub repository for this article (http://github.com/APIAcademy/Class-Scheduling).

Summary

In this article, the concept of using Hypermedia-style APIs was discussed, a problem domain was defined (Class Scheduling), a simple hypermedia type design was created (application/vnd.apiacademy-scheduling+xml), and a brief review of documentation patterns was covered.

In the next installment, we’ll use the media type design and domain documentation to build a fully-compliant hypermedia server for the Class Scheduling domain.

NOTE:
While this article did not go into great depth in designing and documenting media types, you can find supporting material in the GitHub repository and can even fork and send pull requests into the repo if you’d like to get more involved in this project.

About the Author

Mike Amundsen is Principal API Architect for Layer 7 Technologies, helping people build great APIs for the Web. An internationally known author and lecturer, Mike travels throughout the US and Europe consulting and speaking on distributed network architecture, Web application development, Cloud computing, and other subjects. He has more than a dozen books to his credit.
