
Programming with Semantic Profiles: In the Land of Magic Strings, the Profile-Aware is King

Jul 11, 2015 17 min read

This series focuses on three key areas of "meta-language" for Web APIs: API Description, API Discovery, and API Profiles. You’ll see articles covering all three of these important trends as well as interviews with some of the key personalities in this fast-moving space.

This InfoQ article is part of the series “Description, Discovery, and Profiles: The Next Level in Web APIs”. You can subscribe to receive notifications via RSS.

The concept of profiles was a bit of an unexpected end — and beginning — for me.

Back in 2012, I spent months writing a discovery service and related API client based on Google’s JSON API description language and client library for our “RESTful” SOA platform. During that project, there was one little criterion sitting on a story card that I kept ignoring, planning to get to it once our “RESTful” client was done: "Implement HATEOAS" (Hypermedia as the engine of application state). In hindsight, it would have saved all our company’s engineers and me a lot of hassle if I had worked on that card first.

Fundamentals

When I finally did pull that card, I began a rather transformative journey to understand APIs on a much deeper level, ultimately architecting and developing a Hypermedia messaging framework and discovery service for our platform.

In so doing, I arrived at a number of fundamental conclusions that, as it turns out, apply to Hypermedia APIs and HTTP/RESTish APIs as well:

A (Hypermedia) API should be designed (and understood) as a state-machine with clearly defined semantics for the data and transitions.
Burying the description of an API throughout routes, controllers, models and code comments makes it nearly impossible to visualize and understand resources and the aforementioned state-machine in your codebase.
As such, every resource should have an explicit declaration in some machine-readable API description document similar to API Blueprint, Swagger, or others, that in one place describes every aspect of the resource including its semantics, state-machine, related resources and even protocol specific implementation details.
Write tooling to use that API description document to generate responses. This tooling should abstract your local data model implementation completely behind the described resources for the API. Take a look at the Representor Pattern for more info about not marshaling your local data models and calling them resources in your APIs.
Externalizing API description documents is an anti-pattern that leads to tight coupling and brittleness (by depending on implementation details that should be internal). These descriptor documents should be used for design, documentation, and server-side generation of responses. They should not be used for client-side consumption of API messages.
A machine should be able to consume a well-designed (Hypermedia) API and perform work by following its nose, much as a human would.
Everything a human or machine needs to consume a (Hypermedia) API should be shared at runtime if you want the loose coupling and evolvability benefits of REST in your system.

With all this rolling around in the back of my head, I ended up sitting in the QCon 2013 London conference listening to one of Mike Amundsen’s first presentations about Application-Level Profile Semantics (ALPS) and profiles. A particular slide of his quoting Roy Fielding’s dissertation really struck me:

“REST provides [for a hybrid client-server use of data] by focusing on a shared understanding of data types with metadata…” — Roy T. Fielding

When I heard that quote in the context of semantic profiles, all these ideas coalesced. Profiles, and in particular ALPS, pull together API resource and state-machine description, human and machine shared understanding and runtime definition of the semantics of messages.

I came back from that conference set on writing a Rails Hypermedia framework [1] using an internal API description language based on ALPS and its simple description of data and transitions. In so doing, I learned a lot about the benefits and limitations of profiles and how many bad preconceptions I had about what resources actually are and how APIs should share data with clients.

A Short Hike in the ALPS

What is cool about ALPS as a profile media type is that it completely decouples semantic description from messages or their format. Other profiles and approaches require you to embed the semantics in a message itself, e.g. Microdata in HTML or contexts in JSON-LD. ALPS transcends media types by relying on simple rules for mapping elements in a message to their semantic description for a particular media type.

In its simplest form ALPS describes semantics by specifying a semantically rich id to name a descriptor together with a type. The type attribute can be "semantic", corresponding to a data element description or “safe”, “idempotent”, or “unsafe”, corresponding to transition descriptions. Transition type descriptors also allow for the definition of custom relation types.

Take the following example ALPS profile located at http://example.com/alps/works:

http://example.com/alps/works

<alps version="1.0">
  <descriptor id="content" type="semantic">
    <doc>The content of a work.</doc>
  </descriptor>
  <descriptor id="publish" type="idempotent">
    <doc>Release a work for distribution.</doc>
  </descriptor>
</alps>

For an application/json message, this would correspond to:

GET /works/1 HTTP/1.1
Host: example.io
Link: <http://example.com/alps/works>; rel="profile"
Content-Type: application/json

{
  "content" : "The ships hung in the sky in much the same way that bricks don't."
}

For an application/vnd.siren+json message, this would correspond to:

GET /works/1 HTTP/1.1
Host: example.io
Link: <http://example.com/alps/works>; rel="profile"
Content-Type: application/vnd.siren+json

{
  "properties" : {
    "content" : "The ships hung in the sky in much the same way that bricks don't."
  },
  "actions" : [
    {
      "name" : "publish",
      "method" : "PUT",
      "href" : "..."
    }
  ]
}

Because Siren is a Hypermedia media type, the transition-related descriptor (<descriptor id="publish" . . ./>) can map directly to the message (e.g. Siren’s actions element). Plain JSON, on the other hand, has no concept of hypermedia links or forms, and thus the publish descriptor is missing from the JSON representation. To me, this illuminates the simplicity and power for re-use of ALPS profiles. You define the semantics for your API independent of representation media types and then clients can map the semantic details according to the representations they receive.
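
The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the mapping rules (top-level JSON keys, Siren `properties` keys and `actions` names) are a simplified reading of the two examples, and Siren's `actions` is taken to be a list of objects.

```python
import xml.etree.ElementTree as ET

# The example profile from http://example.com/alps/works, inlined so the
# sketch runs without network access.
ALPS_WORKS = """<alps version="1.0">
  <descriptor id="content" type="semantic">
    <doc>The content of a work.</doc>
  </descriptor>
  <descriptor id="publish" type="idempotent">
    <doc>Release a work for distribution.</doc>
  </descriptor>
</alps>"""


def load_descriptors(alps_xml):
    """Return {descriptor id: type} for the top-level descriptors."""
    root = ET.fromstring(alps_xml)
    return {d.get("id"): d.get("type") for d in root.findall("descriptor")}


def describe_json(message, descriptors):
    """Plain JSON: each top-level key is looked up as a data descriptor."""
    return {key: descriptors.get(key) for key in message}


def describe_siren(message, descriptors):
    """Siren: property keys map to data descriptors, action names to transitions."""
    found = {key: descriptors.get(key) for key in message.get("properties", {})}
    for action in message.get("actions", []):
        found[action["name"]] = descriptors.get(action["name"])
    return found


descriptors = load_descriptors(ALPS_WORKS)
plain = {"content": "The ships hung in the sky..."}
siren = {
    "properties": {"content": "The ships hung in the sky..."},
    "actions": [{"name": "publish", "method": "PUT", "href": "..."}],
}
print(describe_json(plain, descriptors))   # {'content': 'semantic'}
print(describe_siren(siren, descriptors))  # {'content': 'semantic', 'publish': 'idempotent'}
```

The same profile serves both messages; only the per-media-type lookup rule differs, which is the re-use the paragraph above points at.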

Profiles and APIs

Profiles provide a way to create a ubiquitous language for talking about APIs (resources) for both humans and machines. They transcend the use of "magic strings" tied to a particular domain that may be buried in API documentation and, instead, establish a re-usable vocabulary in a structured and reliable fashion.

In contrast to Resource Description Framework (RDF), Web Ontology Language (OWL) and many other vocabulary and profile definitions (as a colleague of mine once noted), ALPS is super simple. You define the semantics of the data and transitions in an intuitive structure with human-readable descriptions. If one stopped there, however, the semantics would still only be "magic strings" local to that profile.

ALPS also specifies the requirement of a registry that persists these definitions for consistent re-use. Frankly, the more fundamental requirement is that these profiles are hosted permanently so they can be referenced. As long as an author does not change the semantic meaning of descriptors in a registered or hosted profile (which would break the profile contract), it can be relied upon over time as a source of semantically reliable information.

We humans call these registries dictionaries, which brings up an interesting point: that profiles ultimately define vocabularies. To the extent that profiles re-use other available vocabularies, they will be ubiquitous. An example of this idea of re-use is a set of ALPS profiles built from Schema.org definitions at http://alps.io/schema.org.

Once an ecosystem of profiles exists — as the ALPS project envisions — there are numerous possible uses of ALPS when applied to the design, implementation and consumption of APIs.

Profiles and API Design

As an over-simplified example of leveraging profiles in API design, let’s build an API for interacting with a door. We start by brainstorming the semantics and state-machine of the data and affordances (transitions) of the resource representing a door, something like:

[Figure: state-machine diagram of the door resource, with material and handle data and self, open and close transitions]
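
As a rough illustration, the door's state machine might be written down as a transition table. This is a sketch under assumptions: the state names "open" and "closed" and the helper names are hypothetical (the article only names the self, open and close transitions), and open and close are treated as idempotent, meaning that repeating one leaves the door in the same state.

```python
# Hypothetical states: the article names only the transitions.
TRANSITIONS = {
    ("closed", "open"): "open",
    ("open", "open"): "open",        # idempotent: opening an open door changes nothing
    ("open", "close"): "closed",
    ("closed", "close"): "closed",   # idempotent: closing a closed door changes nothing
}


def apply_transition(state, transition):
    """Return the state reached by following a transition from a state."""
    if transition == "self":
        return state  # a safe read never changes state
    return TRANSITIONS[(state, transition)]


def next_useful_transitions(state):
    """Transitions that would actually change the given state."""
    return sorted(
        name for (start, name), end in TRANSITIONS.items()
        if start == state and end != state
    )


state = apply_transition("closed", "open")
print(state)                           # open
print(apply_transition(state, "open")) # open
print(next_useful_transitions(state))  # ['close']
```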

Having nailed down a rough idea of the semantics of the new resource, the next question is whether any related semantics exist for re-use. A brief survey might find the following two existing ALPS profiles registered in a registry hosted at http://example.com/alps:

http://example.com/alps/portals

<alps version="1.0">
  <descriptor id="material" type="semantic">
    <doc>What a portal is made of</doc>
  </descriptor>
  <descriptor id="open" type="idempotent"/>
  <descriptor id="close" type="idempotent"/>
</alps>

http://example.com/alps/fasteners

<alps version="1.0">
  <descriptor id="knob" type="semantic"/>
  <descriptor id="latch" type="semantic"/>
</alps>

Upon reviewing these two ALPS documents, we note:

The material semantic descriptor is already defined.
The open and close descriptors define the desired affordances.
There is no descriptor for our handle item. However, a similar definition for handle does exist: knob.

By adopting the term knob, we define our own resource based on existing semantics versus having to create a brand new semantic element just for our use case.

One might argue that client developers or product managers will want to use their own local semantic identifier: handle, particularly in the UI. That, however, is a presentation concern and, as I will discuss later, APIs should use the most broadly-used semantics possible and let variations of presentation and localization be completely separated from the semantics of the API itself.

Now, with the above ALPS documents as a reference, we can define the profile of a door as:

http://example.com/alps/doors

<alps version="1.0">
  <descriptor id="door" type="semantic">
    <descriptor href="http://example.com/alps/portals"/>
    <descriptor href="http://example.com/alps/fasteners#knob"/>
  </descriptor>
</alps>

Note that the self transition in the state-machine diagram is not defined explicitly as a descriptor in the ALPS documents as it is an IANA registered link relation type and relies on that definition. However, the open and close descriptors define custom link relation types associated with the related state-machine transitions. Though this is simplified and we could take other unstructured approaches in the use of the composed profiles, it highlights some key aspects of ALPS as a tool for API design.
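
To make the composition concrete, here is a minimal Python sketch of how a tool might flatten the doors profile into the descriptors it pulls in. Several things are assumptions of the sketch rather than anything the article defines: the in-memory `REGISTRY` stands in for the hosted registry, the function names are hypothetical, an href without a fragment is read as "include every descriptor in that document", and cyclic references are not handled.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for the registry at http://example.com/alps.
REGISTRY = {
    "http://example.com/alps/portals": """<alps version="1.0">
      <descriptor id="material" type="semantic"/>
      <descriptor id="open" type="idempotent"/>
      <descriptor id="close" type="idempotent"/>
    </alps>""",
    "http://example.com/alps/fasteners": """<alps version="1.0">
      <descriptor id="knob" type="semantic"/>
      <descriptor id="latch" type="semantic"/>
    </alps>""",
    "http://example.com/alps/doors": """<alps version="1.0">
      <descriptor id="door" type="semantic">
        <descriptor href="http://example.com/alps/portals"/>
        <descriptor href="http://example.com/alps/fasteners#knob"/>
      </descriptor>
    </alps>""",
}


def collect(descriptor, found):
    """Record a descriptor and its children, following href references."""
    href = descriptor.get("href")
    if href:
        found.update(resolve(href))
        return
    found[descriptor.get("id")] = descriptor.get("type")
    for child in descriptor.findall("descriptor"):
        collect(child, found)


def resolve(url):
    """Return {id: type} for every descriptor reachable from a profile URL."""
    base, _, fragment = url.partition("#")
    root = ET.fromstring(REGISTRY[base])
    found = {}
    for descriptor in root.findall("descriptor"):
        if fragment and descriptor.get("id") != fragment:
            continue  # a fragment selects a single descriptor
        collect(descriptor, found)
    return found


print(resolve("http://example.com/alps/doors"))
```

Under these assumptions the door resolves to door, material, open, close and knob; latch is left out because only the #knob fragment of the fasteners profile was referenced.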

A number of these ideas underlie the concepts in a next generation API description language that is profile-aware called Resource Blueprint. That language is based on the idea of defining semantics, re-using them throughout a set of resources, and being able to use and/or define ALPS profiles as an integrated part of API design tooling.

The power of backing APIs with ALPS can be found in the simplicity that is possible in communicating complete runtime semantic information to a profile-aware client. For example:

GET /doors/1 HTTP/1.1
Host: example.com
Link: <http://example.com/alps/doors>; rel="profile", <http://example.com/alps/doors#door>; rel="type"
Content-Type: application/vnd.siren+json

With the above minimal set of information, a server provides an ALPS and Siren media type-aware client complete runtime information suited to both human and machine alike. And, more importantly, it does not communicate internal implementation details in the process, as is the case with a number of approaches that expose an API description format to the client.
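
A profile-aware client first has to pull those two links out of the response. The sketch below assumes they arrive in a standard `Link` header of the form `<uri>; rel="..."`; the parser is deliberately simplified (it only understands a quoted `rel` directly after the target) and the function name is hypothetical.

```python
import re

# Matches one link-value of the form <target>; rel="relation types".
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Return {relation type: target URI} for each link in the header value."""
    links = {}
    for target, rels in LINK_PATTERN.findall(value):
        for rel in rels.split():  # rel may hold several space-separated types
            links[rel] = target
    return links


header = (
    '<http://example.com/alps/doors>; rel="profile", '
    '<http://example.com/alps/doors#door>; rel="type"'
)
links = parse_link_header(header)
profile_url, _, descriptor_id = links["type"].partition("#")

print(links["profile"])  # http://example.com/alps/doors
print(descriptor_id)     # door
```

With the profile URL and the descriptor id in hand, the client can fetch the profile once, cache it, and interpret the Siren body against it.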

Profiles as Constraints

In my experience, architecture is related to a set of self-imposed constraints that produce a set of favorable properties of a system. As such, applying profiles to API design imposes constraints resulting in simplicity, clear understanding of a resource and the capacity for re-use.

One of our challenges as humans is that we are so semantically proficient we are sometimes unaware of things we are taking for granted when we design APIs. We begin to introduce elements that work well for humans but not so well for machines. When we do this, we introduce complexity unawares.

For example, we write documentation and glibly drop in a reference to a list of options and do it in a fashion that if you read the docs and write to the docs you are fine (but brittle). But, if you want a system that reacts to changes at runtime, this human ‘cheat’ does not translate.

Profiles keep you honest. This is why I think the constraint of a machine being able to consume your API (assuming it ‘understands’ the semantics of what it receives) drives a greater thoroughness and simplicity into our designs. In my experience, if you articulate your API using ALPS, you will successfully constrain yourself to making your API simple and semantically rich.

Semantic governance across distributed teams

Depending on the size of your organization and the level of consistency you want to maintain across your APIs, profiles can be a very important tool. By using an API design toolset that allows you to validate your organization-wide API designs against a set of rules, it is relatively straightforward to prompt designers with semantic feedback. Alerting them when they are using semantics that mean the same as something else or helping them conform to the company standards produces a lot of consistency and time saving.

Even more useful is when you are describing some new semantic data element in which case you shouldn’t blindly use it without further thought or discussion. As an architect, being able to find out when APIs are being developed that are introducing new data semantics into your ecosystem is very useful. This is a great way to discover when front-end designers are shaping APIs based on what they may want to present versus the best semantics for the actual domain that the existing API supports.
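
A governance check of this kind could be as small as the following Python sketch. The registered vocabulary reuses the door example's descriptors; the synonym table, the report layout and the function name are hypothetical, standing in for whatever rules an organization's design tooling would actually carry.

```python
# Descriptors already registered (taken from the door example's profiles).
REGISTERED = {"material", "knob", "latch", "open", "close"}

# Hypothetical company rule: local terms that duplicate a registered one.
SYNONYMS = {"handle": "knob"}


def review(design_terms):
    """Sort a design's terms into accepted, to-be-renamed and brand new."""
    report = {"ok": [], "rename": {}, "new": []}
    for term in design_terms:
        if term in REGISTERED:
            report["ok"].append(term)
        elif term in SYNONYMS:
            report["rename"][term] = SYNONYMS[term]
        else:
            report["new"].append(term)  # worth a conversation with an architect
    return report


print(review(["material", "handle", "color"]))
# {'ok': ['material'], 'rename': {'handle': 'knob'}, 'new': ['color']}
```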

Real-time design feedback

Similar to facilitating semantic governance, profiles can empower semantic ‘intelli-sense’ in next generation API Design IDEs. With a little context sensitivity, IDEs could recommend alternate data names and link relations including statistics of use of the similar semantics. The crowd is not always right, but it would be useful on many fronts to know, for example, that 90% of APIs in a profile registry use givenName instead of firstName.
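
The usage statistic behind such a recommendation is easy to compute once profiles are available in a registry. In this sketch the registry contents are made-up toy data, and the equivalence group and function name are hypothetical; the point is only the shape of the calculation.

```python
from collections import Counter

# Toy registry: descriptor ids used by four hypothetical APIs.
PROFILES = {
    "crm": ["givenName", "familyName"],
    "billing": ["givenName", "email"],
    "hr": ["firstName", "lastName"],
    "support": ["givenName"],
}

# Hypothetical groups of terms considered semantically equivalent.
EQUIVALENTS = [{"givenName", "firstName"}]


def usage_share(term):
    """Return each equivalent term's share of use across the registry."""
    group = next((g for g in EQUIVALENTS if term in g), {term})
    counts = Counter(
        name for ids in PROFILES.values() for name in ids if name in group
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: counts[name] / total for name in sorted(group)}


print(usage_share("firstName"))  # {'firstName': 0.25, 'givenName': 0.75}
```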

Breaking type marshaling

One of the best lessons I learned in designing APIs around profiles was to stop type marshaling my server’s data-model to the client. So many voices attempt to convince us that the “RESTful” way to define resources is to embed our server models in URIs and see resources on an OOP model level translated to the client. This POV really obscures the fact that HTTP resources are just semantic data and controls.

Designing with semantics and profiles in mind breaks notions of type marshaling. It allows inventive possibilities for runtime creation of resources that are specified on the fly and provide semantic understanding for humans and machines alike. We just define semantics without relying on some concrete structure or bucket of properties. That idea may seem strange, but is really a powerful way to design data presented in APIs.

Profiles and Servers

On the server side, profiles have a number of strengths, but also a few potential weaknesses.

Resource generation/abstraction

Originally, as I was writing the aforementioned “ALPS-ian” Hypermedia framework, I had hoped that ALPS could be leveraged for rendering both data structure and state-transition details into messages. I say ALPS-ian because it rather quickly became apparent that using just a profile language to guide message rendering was rather limited since profiles, by nature, do not contain important implementation details like URIs or protocol methods, etc.

To solve my implementation problem, I ended up extending ALPS into a novel API description format that could track these other details and map data model properties to messages using ALPS profiles as a reference. One of the main ‘gotchas’ of relying on profiles for rendering structure became painfully obvious: if an author added new descriptors to a profile over time, then your implementation could immediately try to render new attributes, whether or not your domain model included that information. My conclusion is that, instead of delegating render details to profiles themselves, API description languages should drive the rendering (via custom tooling) while embedding profile information in responses.
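
The conclusion above can be sketched as follows. This is not the format the framework used; the description layout, field names and function name are hypothetical. What it illustrates is that the description, not the profile, decides what is rendered, so a descriptor added to the profile later cannot leak into responses, while the profile is still advertised in the response.

```python
# Hypothetical API description: rendered field -> backing model attribute.
WORK_DESCRIPTION = {
    "profile": "http://example.com/alps/works",
    "fields": {"content": "body_text"},
}


def render(model, description):
    """Render only the described fields and advertise the profile."""
    body = {
        name: model[attribute]
        for name, attribute in description["fields"].items()
        if attribute in model
    }
    profile = description["profile"]
    headers = {
        "Content-Type": "application/json",
        "Link": f'<{profile}>; rel="profile"',
    }
    return headers, body


# The local data model stays hidden behind the described resource.
model = {"id": 1, "body_text": "The ships hung in the sky...", "internal_flag": True}
headers, body = render(model, WORK_DESCRIPTION)
print(body)             # {'content': 'The ships hung in the sky...'}
print(headers["Link"])  # <http://example.com/alps/works>; rel="profile"
```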

Discovery Complement

In a similar fashion to some of the ideas around discovery of resources at the root of an API (e.g. JSON Home), by simply supporting an api/profiles path that lists the available profiles, an API can easily provide a semantic dictionary shortcut describing its resources. The availability of these profiles opens up a range of dynamic client configuration opportunities.

Dynamic creation of resources

Armed with "profile awareness", services can empower users to pre-define subsets of available data. One of the challenges in some industries still coming out of using SOAP-based APIs is the humongous schemas they typically built up. These industries want to start using mobile clients and are grappling with how to refine the amount of available information in a JSON-based API. Using profiles to define subsets of this information is one possible approach to this problem.

Related, but slightly different, is the possibility for composite services to generate completely new resources on the fly by leveraging profiles. By composing data and information in a message and dynamically adding profile metadata spanning the semantics of the new resource, profile-aware clients and workers can adapt and use the information in workflows.

Profiles and Clients

Once services are supporting profiles and rendering profile information in messages, both front-end and back-end clients (servers composing other APIs) can profit from a range of new possibilities.

Semantic Presentation and Modeling

One of the ideas around ALPS is sharing understanding about what one MAY receive in a message, not what they WILL receive. As such, client developers that are profile-aware can abstract a presentation layer between messages and their implementation. This could be in the form of a client-side model that transforms a response message into a more convenient model or presenter that handles what to do if possible information is not present in a message. This one technique can prevent a lot of the brittleness associated with memorizing API implementation details in client code and could lead to increased loose coupling and evolvability in hypermedia clients.
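
A presenter along these lines might look like the Python sketch below. The class name, the fallback wording and the choice of the door's material and knob descriptors are assumptions for illustration; the idea is that the client codes against what the profile says a message MAY contain and degrades gracefully when something is absent.

```python
class DoorPresenter:
    """Wraps a door message; the profile says what MAY appear, not what WILL."""

    MAY_CONTAIN = ("material", "knob")  # descriptors from the door example

    def __init__(self, message):
        # Keep only the semantics this client understands.
        self._data = {k: message[k] for k in self.MAY_CONTAIN if k in message}

    def get(self, name, default=None):
        """Return a described value, or the default when it is absent."""
        if name not in self.MAY_CONTAIN:
            raise KeyError(name)
        return self._data.get(name, default)

    def summary(self):
        text = "Door made of " + self.get("material", "unknown material")
        knob = self.get("knob")
        if knob:
            text += " with a " + knob + " knob"
        return text


print(DoorPresenter({"material": "oak", "knob": "brass"}).summary())
# Door made of oak with a brass knob
print(DoorPresenter({"material": "oak"}).summary())
# Door made of oak
```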

Dynamic mobile clients and localization

One particular application I think is really interesting (related to the former point) is the ability to build light mobile client frameworks that can be customized on the backend to interact with APIs. By using profiles, one can map how to render information in a mobile client (what controls, buttons, styles, etc.) to messages. Mobile clients can then load this information at runtime and, based on the profiles provided with the message, render forms, lists, etc., dynamically.

Similarly, client localization can leverage profiles to decouple any presentation and localization of information from APIs. Some media types try to accommodate localization with human-readable titles in their formats, for example. To use these, APIs need to implement localization internally versus simply returning one semantic set of information that can be reliably handled externally using profiles. That is, clients with localization libraries can recognize profile semantics and map them to their localized equivalents straightforwardly.
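
On the client, that mapping can be a plain lookup keyed by descriptor URI. The sketch below is an assumption-laden illustration: the tables, the German labels and the function name are hypothetical, and a real client would derive the key-to-descriptor mapping from the profile links in the response rather than hard-coding it.

```python
# Which descriptor each message key refers to (from the door example).
SEMANTICS = {
    "material": "http://example.com/alps/portals#material",
    "knob": "http://example.com/alps/fasteners#knob",
}

# Hypothetical client-side label tables, keyed by descriptor URI.
LABELS = {
    "de": {
        "http://example.com/alps/portals#material": "Material",
        "http://example.com/alps/fasteners#knob": "Türknauf",
    },
}


def localized_labels(message, locale):
    """Return a display label per message key, falling back to the key itself."""
    labels = LABELS.get(locale, {})
    return {key: labels.get(SEMANTICS.get(key), key) for key in message}


door = {"material": "oak", "knob": "brass"}
print(localized_labels(door, "de"))  # {'material': 'Material', 'knob': 'Türknauf'}
print(localized_labels(door, "fr"))  # {'material': 'material', 'knob': 'knob'}
```

The API returns one semantic set of information; adding a locale means adding a table on the client, with no change to the service.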

Intelligent machine clients/workers

Finally, absent profiles, it is virtually impossible for future automated machine clients and workers to perform self-directed work. For clients with semantic understanding, profiles are a requisite vehicle for understanding and utilizing messages in future generations of automated consumers of APIs. The sooner profiles are ubiquitous in the API space, the sooner these possibilities will expand.

Conclusion

Becoming profile-aware really helped me begin to think better about APIs. Absent profiles, the API space will be relegated to blindly passing around "magic strings", fooling ourselves into thinking we are passing reliable semantic information.

"I believe that the best way to get better programs is to teach programmers how to think better." – Leslie Lamport

Hopefully, in the not-so-distant future, profile-aware design tools will help programmers to "think better" and build clients and servers that are able to work better on many levels, all supported by an ecosystem of profiles.

About the Author

Mark W. Foster started out as a research scientist in optoelectronic bio-sensing, meandered through the world of Systems and Database Admin, IT infrastructure, Enterprise computing and came out an API developer and architect. He currently works at Apiary and participates in a variety of projects pioneering Hypermedia API tooling and is a co-author of the ALPS spec. You can follow Mark on Twitter at @fosrias.

References

1. The project Crichton was originally open-sourced, but is now being actively developed internally. Updates may show up here in the future.