Designing an Event Log API with RAML

Posted by Saul Caganoff on Apr 11, 2014

Introduction

There is now a strong consensus that APIs should be designed from front to back, with an emphasis on how usable the API is for developers. In a crowded economy with many competing products and API implementations, easy-to-use, well-designed APIs will have an advantage in attracting and retaining developers. Hence much recent effort has gone into tools that help design APIs iteratively and help developers learn and use those APIs.

Humans are the key ingredient in API design, which means that API design tools and documentation formats must be human-readable and human-writable. Recent API documentation standards strive to be "human-centric." API Blueprint was an early leader in this area, adopting Markdown as a format that is familiar to both the developers and the business analysts involved in the API design lifecycle.

MuleSoft released RAML, its RESTful API Modeling Language, late last year. The language could end up being just a proprietary vendor language, but there are a number of reasons why it is interesting to the broader API community:

  • RAML has been open-sourced along with tools and parsers for common languages. The development of RAML will be overseen by a steering committee of API and UX practitioners and there is an emerging ecosystem of third-party tools being developed around RAML.
  • MuleSoft originally started with Swagger but realised that that standard was best suited to documenting an existing API, not to designing an API from scratch. RAML evolved out of the need to support up-front API design in a succinct, human-centric language.
  • API descriptions are often verbose and repetitive, which can obscure their structure and hinder understanding and consumption. RAML has introduced language features that support structured files and inheritance, and that address cross-cutting concerns. These features are the main focus of this article.

RAML does not enforce good API design. There are no "best practices" inherent in the language or the tooling. The intention of RAML was described by Uri Sarid, CTO of MuleSoft as "...that an open, simple and succinct spec for describing APIs, that captures their structure and fosters pattern-based design and reuse, will help unlock the potential of the API economy."

RAML does not address the implementation of an API—it is purely a specification language. There are a small number of open-source tools available for implementing an API described in RAML. APIKit converts a RAML specification into Mule ESB flows and is useful if you are targeting Mule ESB as your runtime platform. JAX-RS Codegen has broader applicability as it generates JAX-RS Java code from a RAML specification. Support for other languages and platforms will clearly be a strong determinant of the ultimate success of RAML.

This article is intended as an example of using RAML to design a simple but realistic API and to illustrate some of the features of RAML. It is not intended as a comparison of RAML with any of the other API design or description formats such as API Blueprint, IO Docs or Swagger. The exciting thing is we now have a viable and competing ecosystem of design choices which will hopefully shape and inform a consensus for RESTful API design and specification.

API Example Requirements

The example API we'll work with logs events into a stream to support online analytics and reporting. Using log files as a repository of the state and progress of running software is a pattern that is perhaps as old as computing itself. But as systems become more distributed and workloads are shared between many - perhaps ephemeral - systems, managing and analysing logs as files becomes an impossible task. An emergent design pattern is to treat logs as streams of events. So the Eventlog API has the following basic requirements:

  • Enable a third-party application to log an event to a named event stream.
  • Events carry a small number of mandatory properties with extensible "context" properties.
  • Support retrieval of an event stream.
  • Retrieve a list of events within a sliding time window.
  • Retrieve a list of events as a paginated list.
  • Support a limit to the number of events retrieved along with a sensible default limit of 100.
  • Support retrieval of a single event via its Id.
  • Events are immutable so cannot be deleted or modified.
  • The API must be publicly readable, but write operations must be secured by an OAuth 2.0 security scheme.

So let's explore RAML by using it to describe an API that supports these requirements.

For this article, we'll use the Anypoint API Design environment for RAML authoring. The designer also allows us to view interactive documentation and mock the API as we proceed with the design. We'll start with a simple specification and then use more advanced features of RAML to make our specification more DRY. REST API specifications can be very repetitive with a lot of boilerplate and repeating specifications for resources that might be only slightly different from each other. RAML provides features which help avoid or manage this repetition including resourceTypes, traits and file include directives. We'll introduce these features as we proceed before finishing the example with security specifications.

A Simple RAML API

Documentation and tutorials for RAML including the full RAML specification are available at the raml.org website. For the development of this example, I've used the Anypoint API Designer which includes an online RAML editor.

Defining Resources

As a starting point there are two main resources for the Eventlog API: a collection of events which we'll call an event stream and the individual events themselves. For the stream we require a POST method to create a new event in the stream and a GET method to return the events in the stream. Each individual event can be requested using the stream name and the event Id. The RAML code for this specification is shown in the following code snippet.

#%RAML 0.8
# Basic starter raml with two resources and corresponding methods.

title: Eventlog API
version: 1.0
baseUri: http://eventlog.example.org/{version}

/streams/{streamName}:
  displayName: A Named Stream
  description: A stream is a collection of related events. A Named Stream has been defined by the user to contain a list of related events. Named Streams are created by POSTing an event to a Named Stream.
  get:
    description: Get a list of events in this stream.
  post:
    description: Create a new event in this stream.
  /{eventId}:
    get:
      description: Get a particular event by its Id.

The baseUri provides the domain name and a version parameter. URI parameters are designated by curly braces. The collection resource has a URI relative to the base of "/streams/{streamName}" along with GET and POST HTTP methods respectively which perform the expected functions for a collection. The entity resource has a relative URI of "/streams/{streamName}/{eventId}" along with a single GET method.

Update: My original RAML had the entity as a top-level resource. Uri Sarid, CTO of MuleSoft and one of the developers of RAML, advises that sub-resources should be nested in the RAML description underneath their parent resource. The example code above has been updated to show /{eventId} as a sub-resource of the stream /streams/{streamName}.
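As a minimal sketch, the URI parameters of these two resources can also be described explicitly with the uriParameters key from RAML 0.8. The descriptions, types and example values below are assumptions made for illustration; they are not part of the original specification.

```yaml
/streams/{streamName}:
  uriParameters:
    streamName:
      # Assumed description and example value, for illustration only.
      description: The name of the event stream.
      type: string
      example: temperature
  /{eventId}:
    uriParameters:
      eventId:
        # Assumed to be an opaque string identifier.
        description: The Id of a single event within the stream.
        type: string
        example: "123456"
    get:
      description: Get a particular event by its Id.
```

Quoting the eventId example keeps YAML from reading it as an integer when the parameter is declared as a string.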

RAML allows us to provide a human-readable description for these resources and methods. The API Designer renders this specification into interactive documentation as shown in figure 1.

The API Designer allows us to test the API using a mocking service. As an example we can POST an event by specifying a stream name and clicking the 'POST' button. A default HTTP 200 response is returned at this stage.

Requests, Responses, Schemas and Examples

The default POST doesn't fully represent the behaviour and structure required for our API. We need to describe to the developer the data structures recognised for an event and the behaviour expected upon its creation. RAML provides two complementary options to describe this information: schemas and examples.

Although many new public APIs prefer JSON representations, a large proportion of APIs also or alternatively use XML. So RAML pragmatically allows specification of payloads with either XML Schema or JSON Schema. Schemas can be specified inline with the resource description, inline in the RAML header or included from an external file or URL. Since schemas can be quite large it is usually good practice to include external schemas because inline schemas can interrupt the flow and human understanding of the RAML specification.

#%RAML 0.8

title: Eventlog API
version: 1.0
baseUri: http://eventlog.example.org/{version}

schemas:
  - eventJson: !include eventSchema.json
    eventListJson: !include eventlistSchema.json

This example includes JSON Schema descriptions from external files. RAML can include both local files and HTTP resources. In this case we've included files that are local within API Designer, and the schema definitions are declared under the schemas key near the top of the file with names that can be referenced in later resource descriptions.

Update: At the time of writing, including HTTP resources from another site results in a cross-origin request error, since the browser's same-origin policy stops the JavaScript editor from loading resources from another domain. This won't happen if the remote server enables Cross-Origin Resource Sharing (CORS). MuleSoft is considering a proxy feature in their API Designer which will remove the requirement for CORS.
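Once declared, a named schema can be referenced from a body declaration through the schema key. The following sketch assumes the eventJson and eventListJson names from the schemas declaration and leaves out everything else about the resource:

```yaml
/streams/{streamName}:
  post:
    body:
      application/json:
        # Refers to the schema named eventJson under the schemas key.
        schema: eventJson
  get:
    responses:
      200:
        body:
          application/json:
            # Refers to the schema named eventListJson.
            schema: eventListJson
```

Referencing schemas by name keeps the resource descriptions short while the full JSON Schema documents stay in their own files.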

Many APIs describe resource representations using examples rather than schemas, so RAML also allows you to provide examples in the specification. This is illustrated in the next code snippet using the example tag. An example is simply a block of JSON or XML code representing the resource body. Examples are used by the mocking service to mimic expected requests and responses.

When developers POST a new event into a stream, they should receive confirmation that the event resource has indeed been created, along with the URL (including the Id) of the new event. This is indicated by a 201 (Created) HTTP response and a Location response header specifying the URL of the new resource. HTTP header names are case-insensitive, so the header appears as location in the RAML. The following example shows how we specify the request and response for POSTing an event to a stream:

/streams/{streamName}:
  displayName: A Named Stream
  description: A stream is a collection of related events. A Named Stream has been defined by the user to contain a list of related events. Named Streams are created by POSTing an event to a Named Stream.
  post:
    description: Create a new event in this stream.
    body:
      application/json:
        example: |
                { "name": "Temperature Measurement",
                  "source": "Thermometer",
                  "sourceTime": "2014-01-01T16:56:54+11:00",
                  "entityRef": "http://example.org/thermometers/99981",
                  "context": {
                    "value": "37.7",
                    "units": "Celsius"
                  }
                }
    responses:
      201:
        description: A new event was created.
        headers:
          location:
            description: "Relative URL of the created event."
            type: string
            required: true
            example: /streams/temperature/123456
        body: null

The code example shows the RAML description of POSTing a temperature measurement event to a stream called 'temperature'. The body specifies a JSON payload and provides an example. The event contains properties you'd expect for an event, such as its name (or type), source and timestamp. We've also included an optional reference to an information entity related to this event, as well as context data which is specific to the event type (in this case value and units).

In the response specification we've provided details on what happens with a 201 response. Of course there may be other responses in the 400 or 500 range which indicate other problems but we'll take defaults on this. The important aspect for our API is that developers should expect a 201 (created) response along with a header attribute containing the relative URL of the newly created event. We also specify that the body is empty. If the developer needs to obtain the event again, they can use the URL provided.
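Large examples can be moved out of the RAML file in the same way as schemas, because the !include directive is not limited to schema definitions. A sketch, assuming a hypothetical local file examples/event.json that holds the JSON payload shown in the POST example:

```yaml
/streams/{streamName}:
  post:
    description: Create a new event in this stream.
    body:
      application/json:
        # Hypothetical file name; it would contain the example event JSON.
        example: !include examples/event.json
```

This keeps the resource description readable when example payloads grow beyond a few lines.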

The GET method on the event collection returns either a 404 (if the stream hasn't been created yet) or a 200 response and a body containing an array of events. This is illustrated in the following code:

  get:
    description: Get a list of events in this stream.
    responses:
      404:
        description: The specified stream could not be found.
      200:
        description: Returns a list of events.
        body:
          application/json:
            example: |
              [
                { "id":"123456",
                  "name": "Temperature Measurement",
                  "source": "Thermometer",
                  "sourceTime": "2014-01-01T16:53:54+11:00",
                  "entityRef": "http://example.org/thermometers/99981",
                  "context": {
                    "value": "37.1",
                    "units": "Celsius"
                  }
                },
                { "id":"123457",
                  "name": "Temperature Measurement",
                  "source": "Thermometer",
                  "sourceTime": "2014-01-01T16:54:54+11:00",
                  "entityRef": "http://example.org/thermometers/99981",
                  "context": {
                    "value": "37.3",
                    "units": "Celsius"
                  }
                },
                { "id":"123458",
                  "name": "Temperature Measurement",
                  "source": "Thermometer",
                  "sourceTime": "2014-01-01T16:55:54+11:00",
                  "entityRef": "http://example.org/thermometers/99981",
                  "context": {
                    "value": "37.5",
                    "units": "Celsius"
                  }
                }
              ]
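The GET method on an individual event can be described in the same style. This is only a sketch: the response descriptions and the choice of a 404 for an unknown event are assumptions, and the example body reuses the first event from the list above.

```yaml
/streams/{streamName}:
  /{eventId}:
    get:
      description: Get a particular event by its Id.
      responses:
        200:
          # Assumed description.
          description: Returns the requested event.
          body:
            application/json:
              example: |
                { "id": "123456",
                  "name": "Temperature Measurement",
                  "source": "Thermometer",
                  "sourceTime": "2014-01-01T16:53:54+11:00",
                  "entityRef": "http://example.org/thermometers/99981",
                  "context": {
                    "value": "37.1",
                    "units": "Celsius"
                  }
                }
        404:
          # Assumed behaviour for an unknown stream or event Id.
          description: The specified event could not be found.
```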

DRYing Up the Specification

We're only a little way into our API specification and already it has become quite bulky. A large number of resources, along with each of their methods, parameters and examples, will introduce repetition and opportunities for bugs in the specification. But RAML provides a number of features to help us not repeat ourselves.

Resource Types

There is a common pattern for collections where a POST produces a 201 response along with the resource location and a GET returns a list of resources. So we could abstract this pattern into a resourceType which other resources can inherit.

resourceTypes:
  - collection:
      post:
        responses:
          201:
            headers:
              location:
                description: The relative URL of the created resource.
                type: string
                required: true
                example: /streams/temperature/123456
      get:

This greatly simplifies the streams resource description and allows us to specify only the items which are specific to the event stream, such as the example request and response bodies. Note the type keyword under the resource definition.

/streams/{streamName}:
  type: collection
  displayName: A Named Stream
  description: A stream is a collection of related events. A Named Stream has been defined by the user to contain a list of related events. Named Streams are created by POSTing an Event to a Named Stream.
  post:
    description: Create a new event in the stream
    body:
      application/json:
        example: |
                { "name": "Temperature Measurement",
                  "source": "Thermometer",
                  "sourceTime": "2014-01-01T16:56:54+11:00",
                  "entityRef": "http://example.org/thermometers/99981",
                  "context": {
                    "value": "37.7",
                    "units": "Celsius"
                  }
                }
  get:
    description: Get a list of events in the stream.
    responses:
      200:
        body:
          application/json:
            example: |
              [
                { "id":"123456",
                  "name": "Temperature Measurement",

		... etc ...
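Resource types in RAML 0.8 can also take parameters, written between double angle brackets, which lets one collection type serve several resources. The sketch below is an assumption-laden variation on the collection type: the itemName parameter and the description texts are hypothetical and do not come from the original specification.

```yaml
resourceTypes:
  - collection:
      post:
        # itemName is a hypothetical parameter supplied by each resource.
        description: Create a new <<itemName>> in the collection.
        responses:
          201:
            headers:
              location:
                description: The relative URL of the created <<itemName>>.
                type: string
                required: true
      get:
        description: Get a list of <<itemName>> resources.

/streams/{streamName}:
  # Apply the type and supply a value for its parameter.
  type: { collection: { itemName: event } }
  displayName: A Named Stream
```

A resource can still declare its own descriptions, bodies and examples alongside the type, as the stream resource does above.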

Traits

An important feature missing from our current API specification is the ability to filter the events returned by a request based on various criteria. Our requirements listed the ability to retrieve events in a sliding time window or as a paginated list, and to limit the number of events returned so we don't overwhelm either the service or the client. We can accomplish this for all resources by specifying traits: one trait for time windows, another for pagination and a third for limits. While resourceTypes provide for inheritance of resource specifications, traits address cross-cutting concerns commonly encountered with (but not limited to) query parameters.

The following code describes three traits. Slidingwindow represents two query parameters which specify a window in time relative to the current time. Windowstart is the time interval between now and the start of the sliding window. This is expressed as an integer concatenated with a unit symbol. For example "1h" is one hour, "30m" is thirty minutes and "86164s" is one sidereal day expressed in seconds. Windowsize is another time interval (with the same notation) specifying the size of the sliding window.

The paginated trait represents the common query parameters where a chunk of resources are retrieved a page at a time. The current page and page size are represented by parameters pagenumber and pagesize respectively.

Finally, the limited trait specifies a general row number limit to protect against the case where a naive client requests all events in a stream and overwhelms some part of the system. Limit has a default value of 100 which can be overridden in any request.

traits:
  - slidingwindow:
      description: Query parameters related to retrieving a sliding window of timestamped entities relative to now.
      queryParameters:
        windowstart:
          description: The beginning of the sliding window expressed as a time interval from now represented as an integer concatenated with units h (hours), m (minutes), s (seconds) or ms (milliseconds).
          type: string
          example: 1h, 30m, 3600s
        windowsize:
          description: The size of the sliding window, expressed as a time interval from its start, as an integer concatenated with a unit suffix.
          type: string
          example: 10s, 1h, 25m
  - paginated:
      queryParameters:
        pagenumber:
          description: The page number of the result-set to return.
          type: integer
          minimum: 0
        pagesize:
          description: The number of rows in a page request.
          type: integer
          maximum: 100
  - limited:
      queryParameters:
        limit:
          description: A general limit on the number of rows to return in any request.
          type: integer
          default: 100

The stream resource specification applies these traits to the GET method using the is key and listing all relevant traits in a YAML array:

/streams/{streamName}:
  type: collection

	... etc ...

  get:
    is: [ slidingwindow, paginated, limited ]

After applying these changes, the resourceType and traits, we can review the API in the API Designer, viewing the rendered documentation and trying out the API using the mocking service. The results are illustrated in figure 2 which shows the full query parameters documentation. The rendered documentation also indicates the resourceType of the stream resource as well as the traits that are associated with the GET method on the stream resource.
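As the definitions grow, the resource type and traits can be moved into their own files and pulled in with the !include directive. The file names below are hypothetical; each file would hold just the body of the corresponding definition (for example, the queryParameters block of the limited trait).

```yaml
#%RAML 0.8

title: Eventlog API
version: 1.0
baseUri: http://eventlog.example.org/{version}

resourceTypes:
  # Hypothetical file containing the body of the collection type.
  - collection: !include resourceTypes/collection.raml

traits:
  # Hypothetical files, one per trait.
  - slidingwindow: !include traits/slidingwindow.raml
  - paginated: !include traits/paginated.raml
  - limited: !include traits/limited.raml
```

The main file then reads as an outline of the API, and the shared definitions can be reused by other specifications.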

Security Schemes

API security is a top priority for API designers, and the protocols and standards involved are a constantly moving target. So it's good news that RAML supports a wide range of security schemes including OAuth 2.0, OAuth 1.0, Basic Authentication and Digest Authentication. Custom schemes can also be specified. Each of these schemes is qualified by the headers, query parameters and responses involved. For this example, we'll use OAuth 2.0 but apply it only to the event POST method. That is, anyone can read the event stream, but only authorized consumers can POST to it.

The following RAML code describes an OAuth 2.0 security scheme specified by an HTTP header Authorization and utilising the client credentials grant only. The RAML specification describes many other variations on available security schemes.

securitySchemes:
  - oauth_2:
      description: Eventlog uses OAuth2 security scheme only.
      type: OAuth 2.0
      describedBy:
        headers:
          Authorization:
            type: string
            description: A valid OAuth 2 access token.
        responses:
          401:
            description: Bad or expired token.
          403:
            description: Bad OAuth request.
      settings:
        authorizationUri: http://eventlog.example.org/oauth2/authorize
        accessTokenUri: http://eventlog.example.org/oauth2/token
        authorizationGrants: [ credentials ]

The oauth_2 security scheme is applied to the stream POST method using the securedBy key on that operation.

/streams/{streamName}:
  post:
    securedBy: [ oauth_2 ]

Note that securedBy accepts an array, which allows multiple security schemes to be applied to an API, to a resource or to an individual method.
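An alternative arrangement, sketched here under my reading of the RAML 0.8 specification, is to secure the whole API at the root and then open up individual methods with the null security scheme. This should express the same "public read, secured write" requirement, but it is a variation and not the approach taken in the example repository.

```yaml
# Root-level default: every method is secured by oauth_2 unless overridden.
securedBy: [ oauth_2 ]

/streams/{streamName}:
  get:
    # null marks the method as callable without any security scheme.
    securedBy: [ null ]
  post:
    # No securedBy here, so the root-level oauth_2 default applies.
    description: Create a new event in this stream.
```

This form is less likely to leave a newly added write method unsecured by accident, at the cost of having to mark every public method explicitly.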

Conclusion

This has been a brief tour through a RAML specification of a very simple API. The emphasis has been to examine features in RAML which support code re-use and DRY principles for API descriptions. The key elements include:

  • The !include directive that allows inline inclusion of other RAML, YAML, JSON, XML or schema files into the main RAML file. This supports modularization and re-use.
  • resourceTypes which support inheritance of common behaviours across resources.
  • traits which address cross-cutting concerns commonly found in aspects such as URL query parameters.

Finally, RAML provides a mechanism to describe a wide range of security schemes including all of those in common use today.

The full example RAML for this article is available on GitHub at: github.com/scaganoff/example-eventlog-api.

About the Author

Saul Caganoff is the CTO of Sixtree, an Australian system integration consultancy. He has extensive experience as an architect and engineer in major integration and software development projects in Australia, the United States and Asia. Saul's professional interests include architecture at all levels - enterprise, solution and applications - distributed systems, composite applications, cloud computing and cloud APIs.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.