Description, Discovery, and Profiles: A Primer

May 22, 2015

This series focuses on three key areas of "meta-language" for Web APIs: API Description, API Discovery, and API Profiles. You’ll see articles covering all three of these important trends as well as interviews with some of the key personalities in this fast-moving space.

This InfoQ article is part of the series “Description, Discovery, and Profiles: The Next Level in Web APIs”. You can subscribe to receive notifications via RSS.

The Next Level in Web APIs

While the process of implementing Web APIs has become common, the tooling for describing, discovering, and understanding the meaning of the tens of thousands of API-based services has yet to settle into a widely-accepted set of standards. There is still quite a bit of opportunity when it comes to defining and implementing tools around the "meta-level" of APIs.

For now, three areas continue to see quite a bit of interest and activity. They are:

Description

The ability to easily describe APIs including implementation details such as resources and URLs, representation formats (HTML, XML, JSON, etc.), status codes, and input arguments in both a human- and machine-readable form. There are a few key players setting the pace here.

Discovery

Searching for and selecting Web APIs that provide the desired service (e.g. shopping, user management, etc.) within specified criteria (e.g. uptime, licensing, pricing, and performance limits). Right now this is primarily a human-driven process but there are new players attempting to automate selected parts of the process.

Profiles

Long a focus of librarians and information scientists, ‘Profiles’ that define the meaning and use of vocabulary terms carried within API requests and responses are getting renewed interest for Web APIs. Still an experimental idea, there is some evidence vendors and designers are starting to implement support for Web API Profiles.

This article takes a brief look at these three categories of Web API meta data and identifies key players and trends in each area.

Describing API Implementations

Currently, most of the focus on API design and implementation is on description formats. The formats most commonly mentioned today are Swagger, RAML, and API Blueprint but the list of available formats is quite long. They each take a slightly different approach but essentially offer the same basic features: a way to describe a Web API at varying levels of detail.

API-First

Most of the approaches today support the API-First concept. You describe your API using a meta-language based on XML, JSON, or YAML and the resulting document (or set of documents) is used to auto-generate implementation assets such as server-side code, human-readable documentation, test harnesses, SDKs, or even fully-functional API clients.

An example of the API-First approach is Apiary's API Blueprint format. It’s based on Markdown and has the goal of supporting human-readable descriptions of APIs that are also machine-readable. In the example below you can see there is a single resource (/message) that supports both GET and PUT. You can also see there is support for human-readable text to describe the way the API operates.

Example API Blueprint Description

FORMAT: 1A

# Resource and Actions API
This API example demonstrates how to define a resource with multiple actions.

# /message
This is our resource

## GET
Here we define an action using the `GET` HTTP request method.

As with every good action it should return a response

+ Response 200 (text/plain)

        Hello World!

## PUT
OK, let's add an update action and send a response back confirming the posting was a success

+ Request (text/plain)

        All your base are belong to us.

+ Response 204

RAML, Swagger, and other similar formats work in essentially the same way.

With the API-First approach, you need tooling to convert the meta-language created at design-time into something useful at runtime. For example, the Swagger codegen tool parses description documents and generates compliant client-side code. And the RAML-for-JAX-RS project provides two-way transformation between RAML descriptions and JAX-RS-annotated Java code.
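To make the design-time-to-runtime step concrete, here is a minimal Python sketch. It is not how Swagger codegen or RAML-for-JAX-RS work internally; the dictionary layout, the `render_docs` and `render_client` helpers, and the `send` callback are all assumptions invented for illustration. It only shows the general idea of one description feeding two generated assets.

```python
# Hypothetical, simplified description of the /message resource shown earlier.
# The dictionary layout is an assumption for illustration, not a real format.
DESCRIPTION = {
    "title": "Resource and Actions API",
    "resources": {
        "/message": {
            "GET": {"summary": "Read the message", "status": 200},
            "PUT": {"summary": "Update the message", "status": 204},
        }
    },
}


def render_docs(description):
    """Turn the description into plain-text, human-readable documentation."""
    lines = [description["title"], ""]
    for path, actions in description["resources"].items():
        for method, detail in actions.items():
            lines.append(f"{method} {path} -> {detail['status']}: {detail['summary']}")
    return "\n".join(lines)


def render_client(description):
    """Emit Python source for a client stub with one function per action."""
    chunks = []
    for path, actions in description["resources"].items():
        name = path.strip("/").replace("/", "_")
        for method in actions:
            chunks.append(
                f"def {method.lower()}_{name}(send, body=None):\n"
                f"    return send({method!r}, {path!r}, body)\n"
            )
    return "\n".join(chunks)


if __name__ == "__main__":
    print(render_docs(DESCRIPTION))
    print(render_client(DESCRIPTION))
```

The generated stub functions take a `send` callable so the transport stays outside the generated code; a real generator would make many more decisions (naming, error handling, serialization) than this sketch does.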

Code-First

Very few description models support the Code-First approach - where you generate the service description from the source code. However, the most famous of these - the Web Services Description Language (WSDL) - is still popular with the enterprise community and there is a great deal of tooling and support for WSDL built into the common editing platforms such as Microsoft Visual Studio and Eclipse.

Below is an example of a simple Web API described using WSDL.

HelloService WSDL Example

<definitions name="HelloService"
   targetNamespace="http://www.examples.com/wsdl/HelloService.wsdl"
   xmlns="http://schemas.xmlsoap.org/wsdl/"
   xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
   xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema">

   <message name="SayHelloRequest">
      <part name="firstName" type="xsd:string"/>
   </message>

   <message name="SayHelloResponse">
      <part name="greeting" type="xsd:string"/>
   </message>

   <portType name="Hello_PortType">
      <operation name="sayHello">
         <input message="tns:SayHelloRequest"/>
         <output message="tns:SayHelloResponse"/>
      </operation>
   </portType>

   <binding name="Hello_Binding" type="tns:Hello_PortType">
      <soap:binding style="rpc"
         transport="http://schemas.xmlsoap.org/soap/http"/>
      <operation name="sayHello">
         <soap:operation soapAction="sayHello"/>
         <input>
            <soap:body
               encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
               namespace="urn:examples:helloservice"
               use="encoded"/>
         </input>

         <output>
            <soap:body
               encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
               namespace="urn:examples:helloservice"
               use="encoded"/>
         </output>
      </operation>
   </binding>

   <service name="Hello_Service">
      <documentation>WSDL File for HelloService</documentation>
      <port binding="tns:Hello_Binding" name="Hello_Port">
         <soap:address
            location="http://www.examples.com/SayHello/" />
      </port>
   </service>
</definitions>

When using the Code-First approach, you need tools to turn your source code into usable API description meta-data. Eclipse and Visual Studio have a one-click experience for creating WSDL files from code. There are also several tools that can consume the WSDL file and generate implementation assets. For example, SmartBear's SoapUI utility can generate code, create human-readable documentation, and even build and run test suites based on WSDL files.
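The Code-First direction can be sketched in a few lines of Python. This is only an illustration of the idea, not what Eclipse or Visual Studio do when they emit WSDL: the `HelloService` class, the `describe` helper, and the shape of the resulting dictionary are all hypothetical. It uses the standard `inspect` module to read method signatures and docstrings from source code and turn them into description meta-data.

```python
import inspect


class HelloService:
    """Hypothetical service class, modeled loosely on the WSDL example."""

    def say_hello(self, first_name: str) -> str:
        """Return a greeting for the given name."""
        return f"Hello, {first_name}!"


def describe(service_cls):
    """Build a description dictionary from a class's public methods."""
    operations = {}
    for name, func in inspect.getmembers(service_cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers and dunder methods
        signature = inspect.signature(func)
        inputs = {
            param.name: getattr(param.annotation, "__name__", str(param.annotation))
            for param in signature.parameters.values()
            if param.name != "self"
        }
        output = signature.return_annotation
        operations[name] = {
            "doc": inspect.getdoc(func) or "",
            "input": inputs,
            "output": getattr(output, "__name__", str(output)),
        }
    return {"service": service_cls.__name__, "operations": operations}


if __name__ == "__main__":
    print(describe(HelloService))
```

The resulting dictionary plays the role the WSDL file plays above: a machine-readable artifact derived from the code, which other tools could then consume.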

Figure 1. SoapUI consuming WSDL

Documenting APIs

Most API description formats also support generating human-readable documentation. This is true for RAML, Apiary, and Swagger. In fact, the open source Swagger-UI tool is so well known as a documentation generator (see the figure below) that some mistakenly think that Swagger is used only for generating human-readable API documentation.

Figure 2. Human-Readable Documentation generated by Swagger-UI Utility

There are also formats that are designed with a focus on generating human-readable documentation. A well-known example of this is Mashery’s I/O Docs (see example below) which also provides testing support.

Mashery’s I/O Docs Example

{
    "name": "Lower Case API",
    "description": "An example api.",
    "protocol": "rest",
    "basePath": "http://api.lowercase.sample.com",
    "publicPath": "/v1",
    "auth": {
        "key": {
            "param": "key"
        }
    },
    "headers": {
        "Accept": "application/json",
        "Foo": "bar"
    },
    "resources": {
        "Resource Group A": {
            "methods": {
                "MethodA1": {
                    "name": "Method A1",
                    "path": "/a1/grab",
                    "httpMethod": "GET",
                    "description": "Grabs information from the A1 data set.",
                    "parameters": {
                        "param1": {
                            "type": "string",
                            "required": true,
                            "default": "",
                            "description": "Description of the first parameter."
                        }
                    }
                },
                "MethodA1User": {
                    "name": "Method A1 User",
                    "path": "/a1/grab/{userId}",
                    "httpMethod": "GET",
                    "description": "Grabs information from the A1 data set for a specific user",
                    "parameters": {
                        "param1": {
                            "type": "string",
                            "required": true,
                            "default": "",
                            "description": "Description of the first parameter."
                        },
                        "userId": {
                            "type": "string",
                            "required": true,
                            "default": "",
                            "description": "The userId parameter that is in the URI."
                        }
                    }
                }
            }
        }
    }
}
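Because a configuration like the one above is plain JSON, a short script can walk it to list the documented endpoints. The following Python sketch uses an abbreviated copy of the example; the `list_endpoints` helper is hypothetical and is not part of Mashery's tooling.

```python
import json

# Abbreviated copy of the configuration shown above.
CONFIG = json.loads("""
{
  "basePath": "http://api.lowercase.sample.com",
  "publicPath": "/v1",
  "resources": {
    "Resource Group A": {
      "methods": {
        "MethodA1": {
          "httpMethod": "GET",
          "path": "/a1/grab",
          "parameters": {"param1": {"type": "string", "required": true}}
        },
        "MethodA1User": {
          "httpMethod": "GET",
          "path": "/a1/grab/{userId}",
          "parameters": {
            "param1": {"type": "string", "required": true},
            "userId": {"type": "string", "required": true}
          }
        }
      }
    }
  }
}
""")


def list_endpoints(config):
    """Yield (http_method, full_url, required_parameter_names) per method."""
    root = config["basePath"] + config["publicPath"]
    for group in config["resources"].values():
        for method in group["methods"].values():
            required = sorted(
                name
                for name, spec in method.get("parameters", {}).items()
                if spec.get("required")
            )
            yield method["httpMethod"], root + method["path"], required


if __name__ == "__main__":
    for http_method, url, required in list_endpoints(CONFIG):
        print(http_method, url, "required:", ", ".join(required))
```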

Description is not Discovery

However, whether you are focused on generating code from meta-languages or documentation from code (or any of the other possibilities) API Description gets you only part of the way through the process of creating and deploying Web APIs. Another important part of the process is figuring out what APIs and services are ‘out there’ and what it takes to consume them. For that, you need to discover the APIs.

Discovering APIs in the Real World

API Discovery is the ability to locate the Web API needed for a particular job. For example, you may be looking for a Web API that allows you to support online shopping, or manage user accounts, or process helpdesk requests, etc. In an ideal world, you should be able to initiate a search, find the API that fits your needs, gain access to the API, implement connecting code and start using it with a minimum of effort.

However, reality is different.

Discussion about API discovery often conflates the application programming interface (API) with a live ‘up-and-running’ service. In the first case, we’re just talking about the interface - one that you might use to design, implement, and deploy your own service. In the second case, we’re referring to an existing instance of the service itself - something you can connect to remotely and start using immediately. Since the hurdles and benefits of each are different, it is worth taking some time to review some examples.

API Commons

For cases when what you are looking for is a published set of API specifications that you can implement yourself, the API Commons can be a good source. The goal of API Commons is to "Provide a simple and transparent mechanism for the copyright free sharing and collaborative design of API specifications, interfaces and data models." For example, if you have already designed an API and would like to share the design with others so that they can implement their own service using your model, you can publish your model at API Commons and encourage others to use it.

If, on the other hand, you are about to implement an API and are wondering if someone else has already dealt with the same problem, you can search API Commons to see if there is an existing design that will work for you. This publish-and-subscribe pattern makes it possible for similar services to use the same interface without the need for detailed co-ordination between parties. In the best case, an API consumer built to work with an API design registered in the API Commons will be able to work with any other service that uses the same design.

API Commons' operating model was inspired by Creative Commons and is currently maintained by Kin Lane and Steve Willmott.

Searchable Service Listings

Most of the time, when people talk about API discovery, they mean discovering an available running instance - a usable service. There are a handful of services that keep track of actual usable services and almost all of them are designed as human-readable search engines. One of the best-known examples is Programmable Web’s API Directory (see figure below).

Figure 3. Search Interface for Programmable Web

This kind of solution works when you want to:

search for services that fit your criteria,
evaluate the ones that seem to meet your needs,
engage in the on-boarding process for that service and finally
write your API consumer code to match the API of your selected service.

It is possible that the service you select will support one or more of the API Description languages we covered above and that can ease the work of creating an API consumer for your selected service.

A potential downside for using this kind of discovery service is that not all directories are carefully vetted (some rely upon ‘self-registration’). You may need to wade through several APIs that don’t support the protocol or format you need, are no longer active, or do not meet your performance or licensing requirements, etc. Also, the process of going through the search, evaluation, on-board, and interface-generation loop can get tedious if you plan on using several third-party APIs.
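The wading-through step is the part of this loop that is easiest to imagine automating. As a rough Python sketch, assume each directory entry were available as structured data; no real directory is claimed to expose exactly these fields, and the `Listing` class and `shortlist` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One hypothetical directory entry; the fields are illustrative only."""
    name: str
    protocol: str
    active: bool
    uptime_percent: float
    license: str


def shortlist(listings, protocol, min_uptime, allowed_licenses):
    """Keep active listings that satisfy every criterion, best uptime first."""
    matches = [
        item for item in listings
        if item.active
        and item.protocol == protocol
        and item.uptime_percent >= min_uptime
        and item.license in allowed_licenses
    ]
    return sorted(matches, key=lambda item: item.uptime_percent, reverse=True)


# Made-up entries used only to exercise the filter.
LISTINGS = [
    Listing("shop-a", "REST", True, 99.9, "commercial"),
    Listing("shop-b", "SOAP", True, 99.5, "commercial"),
    Listing("shop-c", "REST", False, 99.99, "open"),
    Listing("shop-d", "REST", True, 99.95, "open"),
]

if __name__ == "__main__":
    for item in shortlist(LISTINGS, "REST", 99.0, {"open", "commercial"}):
        print(item.name)
```

Filtering like this only removes the obvious mismatches; the evaluation and on-boarding steps would still need a person.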

Aggregation Services

There is another type of API discovery approach - the API aggregator. The aggregator acts as a proxy for one or more existing web services and offers a single, unified API for you to code against. An example of this is Intel’s Mashery API Network.

Figure 4. Mashery's API Network

If you plan on consuming several third-party Web APIs, an aggregator offers the chance to greatly reduce your on-boarding and API integration efforts. Aggregator services take on the work of consuming and normalizing the back-end API, managing your access keys, and some even offer a custom API for you - one that makes it easier to share related data between each third-party API.

Configurable Runtime Discovery

There is another way people think about API Discovery services - as a configurable local service discovery engine. These discovery engines exist within the boundaries of a single company and handle finding and connecting to one or more running instances of a service such as a data store or business component. This approach has been growing over the last few years and examples include Apache Zookeeper, HashiCorp’s Consul, and CoreOS’s etcd.

Figure 5. Apache Zookeeper and HashiCorp's Consul offer runtime discovery within an enterprise.

The advantage of this discovery approach is that it provides an additional level of indirection when connecting to running services within your own organization. You can remove the actual address and connection parameters from your running code and place this information in configuration files. Some services will even allow you to set limits on latency and responsiveness in ways that will automatically ignore or route around slow-running or unavailable service instances and connect to the next available healthy instance.
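The indirection described here can be illustrated with a toy in-memory registry in Python. This is a sketch of the concept only: it does not use the Zookeeper, Consul, or etcd client APIs, and the `Instance` and `Registry` classes, the service name, and the addresses are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """A registered service instance; the fields are hypothetical."""
    address: str
    healthy: bool
    latency_ms: float


class Registry:
    """Toy in-memory registry; real engines expose this over the network."""

    def __init__(self):
        self._services = {}

    def register(self, service, instance):
        self._services.setdefault(service, []).append(instance)

    def resolve(self, service, max_latency_ms):
        """Return the address of the fastest healthy instance within the limit."""
        candidates = [
            inst for inst in self._services.get(service, [])
            if inst.healthy and inst.latency_ms <= max_latency_ms
        ]
        if not candidates:
            raise LookupError(f"no usable instance of {service!r}")
        return min(candidates, key=lambda inst: inst.latency_ms).address


registry = Registry()
registry.register("orders-db", Instance("10.0.0.5:5432", healthy=False, latency_ms=3.0))
registry.register("orders-db", Instance("10.0.0.6:5432", healthy=True, latency_ms=250.0))
registry.register("orders-db", Instance("10.0.0.7:5432", healthy=True, latency_ms=12.0))

if __name__ == "__main__":
    # The caller names the service, never the address.
    print(registry.resolve("orders-db", max_latency_ms=100.0))
```

The calling code asks for "orders-db" by name and a latency limit; the unhealthy and slow instances are skipped without the caller knowing any addresses.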

Of course, this kind of abstraction has downsides. First, the added complexity only pays off in large installations. Second, configuration models for these services are currently not standardized and that means you end up creating a strong dependency on a single vendor. Finally, this kind of run-time discovery service is not yet available for use with publicly available third-party API services outside your organization.

Web-Scale Discovery with APIs.io and APIs.json

An interesting alternative to a single-source collection of APIs is the distributed search approach of APIs.io. A joint effort between 3Scale and Kin Lane of API Evangelist, this is a classic search engine that doesn’t actually host API documents but tracks where the discovery files are found on the WWW and offers a search interface against the data in those files.

All the discovery files are in a format known as APIs.json. Unlike an API description document, APIs.json files don’t contain details on all the available URLs, representations, and response codes. Instead, these files contain pointers to those description documents along with pointers to terms of service, licensing, contact info, and other related data.

This format is quite new, but has the potential to offer some ‘glue’ to help connect API descriptions and other data all in a single place. Since the format is machine-readable, it offers the potential to automate some of the search and possibly even the on-boarding details for Web APIs.
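As a rough illustration of what machine-readable pointers make possible, the Python sketch below reads a small discovery file and pulls out the links to description documents. The sample file is hypothetical, and the field names (`apis`, `properties`, `type`, `url`) are assumed from the APIs.json drafts; they should be checked against the current specification before being relied on.

```python
import json

# Hypothetical discovery file. The field names are assumptions based on the
# APIs.json drafts, and every URL here is a placeholder.
DISCOVERY_FILE = json.loads("""
{
  "name": "Example Provider",
  "url": "http://example.org/apis.json",
  "apis": [
    {
      "name": "Message API",
      "baseURL": "http://api.example.org/message",
      "properties": [
        {"type": "Swagger", "url": "http://example.org/message/swagger.json"},
        {"type": "X-terms", "url": "http://example.org/terms"}
      ]
    }
  ]
}
""")


def find_pointers(discovery, wanted_type):
    """Return (api_name, url) pairs for every property of the wanted type."""
    return [
        (api["name"], prop["url"])
        for api in discovery.get("apis", [])
        for prop in api.get("properties", [])
        if prop.get("type", "").lower() == wanted_type.lower()
    ]


if __name__ == "__main__":
    print(find_pointers(DISCOVERY_FILE, "swagger"))
```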

Still at the Detail Level

Both API Description and API Discovery aim at making it easier to build and locate APIs for the Web. However, many efforts focus on low-level implementation details such as describing protocol methods, return codes, and the shape of payloads. These are essential when it comes to writing actual code, but sometimes can get in the way of designing good APIs. Because most approaches focus on implementation details, they usually describe the specifics of a single instance (or mirrored cluster) of a running service.

If you want to be able to focus on higher-level design aspects (use-cases, inputs and outputs) free of the details of protocols, representations, and resources, you need something else.

Leveraging Profiles for Shared Understanding

A more recent development in API meta services is the idea of creating and sharing API Profile information. Unlike description documents, profile documents offer a high-level view of what the API supports and, in some cases, how clients and servers can expose features in a machine-readable way.

Profiles on the Web, A Short History

Web Profiles have been around for quite some time. The HTML 4.01 specification introduced the profile attribute in 1999. The Meta data profile was defined as either a) a globally unique name (URI) or, b) a link (URL) to an actual document. It was designed to allow document authors to provide additional descriptive information about the contents of the response (e.g. useful indexing properties of the document, terms of use, etc.).

In 2003, Tantek Çelik defined the XHTML Meta Data Profile. XMDP supports defining document profiles that are both human- and machine-readable (sound familiar?). The document actually looks rather similar to the way API Description formats look today (see example).

An XMDP Example

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>sample HTML profile</title></head>
<body>
 <dl class="profile">
  <dt id='author'>author</dt>
   <dd>A person who wrote (at least part of) the document.</dd>
  <dt id='keywords'>keywords</dt>
   <dd>A comma and/or space separated list of the
    keywords or keyphrases of the document.</dd>
  <dt id='copyright'>copyright</dt>
   <dd>The name (or names) of the copyright holder(s)
    for this document, and/or a complete statement of copyright.</dd>
  <dt id='date'>date</dt>
   <dd>The last updated date of the document, in ISO8601 date format.</dd>
  <dt id='identifier'>identifier</dt>
   <dd>The normative URI for the document.</dd>
  <dt id='rel'>rel</dt>
   <dd>
    <dl>
     <dt id='script'>script</dt>
     <dd>A reference to a client-side script. When used with the
      LINK element, the script is evaluated as the document loads and
      may modify the contents of the document dynamically.</dd>
    </dl>
   </dd>
  </dl>
</body>
</html>

However, the profile property did not gain widespread use and was dropped from the spec with the release of HTML5. There is an effort to define the profile property separately.

Removing the profile property from HTML prompted Erik Wilde to pen RFC 6906 defining the profile Link Relation Value and registering it with the Internet Assigned Numbers Authority (IANA). Wilde’s idea was to standardize the URI-style profile: an opaque identifier that "allows resource representations to indicate they are following one or more profiles."
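One place a client can encounter the profile link relation is the HTTP Link header of a response. The Python sketch below pulls profile URIs out of such a header. The `profile_uris` helper and the example URIs are hypothetical, and the parsing is deliberately simplified: it assumes parameter values contain no commas or semicolons, which a full Link header parser would have to handle.

```python
import re

# Matches one link-value such as: <http://example.org/p>; rel="profile"
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# Finds the rel parameter inside the parameters of one link-value.
REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)


def profile_uris(link_header):
    """Return the target URIs of every link whose rel includes "profile"."""
    uris = []
    for target, params in LINK_VALUE.findall(link_header):
        match = REL_PARAM.search(params)
        # A rel value may hold several space-separated relation types.
        if match and "profile" in match.group(1).lower().split():
            uris.append(target)
    return uris


if __name__ == "__main__":
    header = (
        '<http://example.org/profiles/search>; rel="profile", '
        '<http://example.org/next>; rel="next"'
    )
    print(profile_uris(header))
```

Since the identifier is opaque, the client only compares the URI against the profiles it already understands; it does not need to dereference it.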

The ability to describe the details of a payload via profiles has prompted an increased interest in applying profiles to not just human-readable documents, but also to API responses. In the last few years, several URL-style profile implementations have emerged, too. For this article, we'll look at two of them: DCAP and ALPS.

Dublin Core Application Profiles (DCAP)

In 2009, the Dublin Core Metadata Initiative (DCMI) released their DCAP format for describing profile metadata. Focused on supporting Resource Description Framework (RDF) documents, DCAP "defines metadata records which meet specific application needs while providing semantic interoperability with other applications on the basis of globally defined vocabularies and models."

Below is an example of a DCAP document:

An example DCAP document

Description template: Person id=person
   minimum = 0; maximum = unlimited
   Statement template: givenName
     Property: http://xmlns.com/foaf/0.1/givenname
     minimum = 0; maximum = 1
     Type of Value = "literal"
   Statement template: familyName
     Property: http://xmlns.com/foaf/0.1/family_name
     minimum = 0; maximum = 1
     Type of Value = "literal"
   Statement template: email
     Property: http://xmlns.com/foaf/0.1/mbox
     minimum = 0; maximum = unlimited
     Type of Value = "non-literal"
     value URI = mandatory

DCAP was created in order to improve the ability to share data semantics in a world where the representation format (RDF) is highly constrained and very generic (e.g. triples). A key advantage of DCAP is that it does not dictate which terms are used in a response (e.g. givenName, familyName, customer, user, etc.) but it does dictate how those terms are communicated. This paves the way for creating share-able vocabularies for online use.
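The occurrence constraints in the Person template (minimum and maximum counts per statement) are the kind of rule a machine can check. The Python sketch below is not a DCAP validator; the `PERSON_TEMPLATE` dictionary is a hand-transcribed simplification of the example above, and `check_record` is a hypothetical helper that checks only the counts.

```python
# Constraints transcribed from the Person template above; None means unlimited.
PERSON_TEMPLATE = {
    "givenName": {"min": 0, "max": 1},
    "familyName": {"min": 0, "max": 1},
    "email": {"min": 0, "max": None},
}


def check_record(record, template):
    """Return a list of occurrence problems found in a record."""
    problems = []
    for term, limits in template.items():
        count = len(record.get(term, []))
        if count < limits["min"]:
            problems.append(f"{term}: expected at least {limits['min']}, found {count}")
        if limits["max"] is not None and count > limits["max"]:
            problems.append(f"{term}: expected at most {limits['max']}, found {count}")
    return problems


if __name__ == "__main__":
    # Made-up record: two family names break the "maximum = 1" rule.
    record = {"givenName": ["Ada"], "familyName": ["Lovelace", "King"]}
    print(check_record(record, PERSON_TEMPLATE))
```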

There is some tooling for DCAP including online editors, validators, and HTML-generators but the use of DCAP has been limited to the library, information science, and academic community. It is not common to find DCAP used for business-related APIs on the Web.

Application-Level Profile Semantics (ALPS)

In 2013, Leonard Richardson, Mark Foster, and I released the first ALPS Internet Draft. Similar to DCAP, ALPS also borrows ideas from XMDP. ALPS includes meta-data for both data elements (e.g. userName, userStatus, etc.) and use-case elements (e.g. find-user, etc.).

Below is an example of an ALPS document describing a simple search API:

A Search API described by an ALPS document

{
 "alps" : {
   "version" : "1.0",
   "doc" : {
     "href" : "http://example.org/samples/full/doc.html"
   },
   "descriptor" : [
     {
       "id" : "find-user",
       "type" : "safe",
       "doc" : {"value" :
         "User search form"
       },
       "descriptor" : [
         {
           "id" : "userName",
           "type" : "descriptor",
           "doc" : { "value" : "input for search" }
         },
         { "href" : "#userStatus" }
       ]
     },
     {
       "id" : "userStatus",
       "type" : "descriptor",
       "doc" : {"value" : "results filter"},
       "ext" : [
         {
           "href" : "http://alps.io/ext/range",
           "value" : "active,inactive"
         }
       ]
     }
   ]
 }
}

ALPS documents focus on the interface-level interactions - what Eric Evans calls the Bounded Context. They do this without addressing implementation details like protocol (HTTP, XMPP, etc.), format (HTML, JSON, etc.) or even resource URLs. This freedom from implementation details means ALPS documents can be used as source material for API design tools, for generating human-readable docs, and even as part of a discovery process to help in the selection of APIs for a desired use.
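As a small illustration of using an ALPS document as source material, the Python sketch below splits the descriptors of a trimmed copy of the search example into use-case elements and data elements. The `summarize` helper is hypothetical, and it assumes that the types `safe`, `unsafe`, and `idempotent` mark actions while anything else is treated as data.

```python
import json

# Trimmed copy of the ALPS document shown above.
ALPS_DOC = json.loads("""
{
  "alps": {
    "version": "1.0",
    "descriptor": [
      {
        "id": "find-user",
        "type": "safe",
        "descriptor": [
          {"id": "userName", "type": "descriptor"},
          {"href": "#userStatus"}
        ]
      },
      {"id": "userStatus", "type": "descriptor"}
    ]
  }
}
""")

# Descriptor types assumed here to mark actions rather than data.
ACTION_TYPES = {"safe", "unsafe", "idempotent"}


def summarize(alps_doc):
    """Split top-level descriptors into actions (with inputs) and data elements."""
    actions, data = {}, []
    for item in alps_doc["alps"]["descriptor"]:
        if item.get("type") in ACTION_TYPES:
            # A child is either defined inline (id) or referenced (href="#id").
            inputs = [
                child.get("id") or child["href"].lstrip("#")
                for child in item.get("descriptor", [])
            ]
            actions[item["id"]] = inputs
        else:
            data.append(item["id"])
    return actions, data


if __name__ == "__main__":
    print(summarize(ALPS_DOC))
```

Nothing in the summary mentions a protocol, a media type, or a URL, which is the point of keeping the profile separate from the implementation details.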

Currently, ALPS is best described as an ‘unstable’ specification and there are very few examples of its use on the Web. Ronnie Mitra's experimental Rapido API designer (see figure) can use ALPS documents as inputs and Pivotal's Spring-Data tools produce ALPS documents as part of their API build process.

Figure 6. Managing Profile Vocabulary with the Rapido API Designer

Experiment or Leading Indicator?

While there is renewed interest in using profiles for Web APIs, it is too early to know whether this is just an experiment that will fade away or whether it is the start of a trend focusing on independently defining the data and action semantics of a Web API.

The Challenge Ahead

In this article you got a view into three key areas of Web API meta data including Description, Discovery, and Profiles. Initiatives from Swagger, RAML, and Apiary are currently dominating the description space and there are a handful of other players in this very healthy eco-system. There is also support for using description formats to automate the generation of both code and documentation and a strong set of utilities has grown around a few key formats.

The API discovery space continues to be dominated by human-driven search and selection while a few key API aggregators like Intel’s Mashery continue to provide an aggregator approach for subscribing to remote APIs. There is a growing world of automated configuration-based location services aimed at supporting connection to enterprise-level service instances and some of that approach may start to show up in efforts to provide automated service discovery to WWW-based APIs.

Finally, API Profiles - commonly used in library and information sciences - have been getting renewed interest for business-related Web APIs. Current initiatives are either experimental or have limited reach but there are some vendors showing support for API Profiles.

The Web is a dynamic and fast-moving space and it should be interesting to keep an eye on this "meta-level" of the API eco-system for some time to come.

About the Author

Mike Amundsen is an internationally known author and lecturer who travels throughout the world consulting and speaking on a wide range of topics including distributed network architecture, Web application development, and other subjects. In his role as Director of Architecture for the API Academy, Amundsen is responsible for working with companies all over the world to provide insight on how best to capitalize on the myriad opportunities APIs present to both consumers and the enterprise. Amundsen has authored numerous books and papers on programming over the last 15 years. His most recent published book is a collaboration with Leonard Richardson titled "RESTful Web APIs", published in 2013. Amundsen’s 2011 book, “Building Hypermedia APIs with HTML5 and Node”, is an oft-cited reference on building adaptable Web applications. He is currently working on a new book for O’Reilly, "Learning Client Hypermedia", which covers common ways to build client applications that can take advantage of Hypermedia API services and is due in the fall of 2015.