2012-07-10 Using HTML as the Media Type for your API

There is an ongoing (and interesting) discussion on the API-craft mailing list revolving around designing new media types for enabling hypermedia APIs primarily for programmatic consumption.

As some folks may know, I like to use HTML as the media type for my hypermedia APIs. Steven Willmott opined

This raised an interesting implicit question, and I get asked about it often enough that I thought it warranted a longer response. There are actually several reasons I prefer using HTML:

  • rich semantics

  • hypermedia support

  • already standardized

  • tooling support

Rich Semantics

I’ve heard many folks say that HTML is primarily for presentation and not for conveying information, and hence that it isn’t suitable for API use. Hogwash, I say! There are many web experts (like Kimberly Blessing) who would insist that markup is exactly for conveying semantics and that presentation should be a CSS concern.

People seem to forget that web sites actually worked before CSS and JavaScript were invented! I rely on this heavily for my HTML APIs.

Hypermedia Support

HTML offers <a>, <link>, and <form> as obvious examples of hypermedia controls.

In fact, the use of <form> to support parameterized navigation (where the client supplies some of the information needed to formulate a request) fairly well sets HTML apart from most existing standard media types (standard in the sense of being registered in the IANA standards tree for media types).

While this construct is currently not as powerful or expressive as it could be (for example, it only supports GET and POST as methods), it’s actually enough to get by, and is certainly sufficient for a RESTful system (if you care about qualifying for the label). Furthermore, there are ongoing efforts within the HTML5 standards process to address this.

(As an aside, it’s worth noting that <audio>, <video>, <iframe>, and <img> are also hypermedia controls).
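To make the link and form controls concrete, here is a minimal sketch of a programmatic client discovering them in an HTML response, using only Python's standard-library parser. The sample document, its rel values, the "search" class name, and the URLs are all invented for illustration; a real client would also want to handle <button>, <select>, <textarea>, and friends.

```python
from html.parser import HTMLParser

# Hypothetical API response: every rel, class, name, and URL here is made up.
SAMPLE = """
<html>
  <head><link rel="self" href="/orders/42"></head>
  <body>
    <a rel="next" href="/orders/43">next order</a>
    <form class="search" action="/orders" method="get">
      <input type="text" name="customer">
      <input type="submit" value="Search">
    </form>
  </body>
</html>
"""


class ControlCollector(HTMLParser):
    """Collects <a>/<link> targets keyed by rel, plus a summary of each <form>."""

    def __init__(self):
        super().__init__()
        self.links = {}     # rel value -> list of href values
        self.forms = []     # one dict per <form> element
        self._form = None   # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("a", "link") and a.get("href"):
            # rel is a space-separated list of link relations
            for rel in (a.get("rel") or "").split():
                self.links.setdefault(rel, []).append(a["href"])
        elif tag == "form":
            self._form = {
                "class": a.get("class"),
                "action": a.get("action") or "",
                "method": (a.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._form)
        elif tag == "input" and self._form is not None and a.get("name"):
            # Only named inputs contribute to the submitted data
            self._form["fields"].append(a["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._form = None


collector = ControlCollector()
collector.feed(SAMPLE)
print(collector.links)
print(collector.forms)
```

The client never hard-codes "/orders/43" or "/orders"; it only needs to know the link relation ("next") and the form class ("search") it is looking for.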

Already Standardized

HTML is shepherded by an existing open standards process and a large community of experts, which means it has all the social machinery for ongoing support and evolution. More than that, however, HTML (including the documentation that makes up its specification) has had the opportunity to be battle-hardened by years of real-world use.

This is huge, because in documentation I can talk about “following links” and “submitting forms” without getting into details about how to construct those HTTP requests, because someone has already taken the trouble of writing that all down, including all the nasty corner cases.
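As a rough illustration of how much "submitting a form" quietly covers, here is a deliberately simplified sketch of turning a form plus some client-supplied values into an HTTP request. The function name `build_form_request` and the example URLs are hypothetical, and the sketch assumes the default `application/x-www-form-urlencoded` encoding only; it ignores enctype, file uploads, character-set handling, and the other corner cases the HTML specification spells out.

```python
from urllib.parse import urlencode, urljoin, urlsplit, urlunsplit


def build_form_request(base_url, action, method, values):
    """Return (http_method, url, body) for a urlencoded form submission.

    base_url: URL of the document that contained the form
    action:   the form's action attribute (may be relative or empty)
    method:   the form's method attribute ("get" or "post")
    values:   dict mapping input names to the values the client supplies
    """
    target = urljoin(base_url, action)   # resolve a relative action URL
    encoded = urlencode(values)          # name=value pairs, spaces become "+"
    if method.lower() == "post":
        # POST: the encoded data travels in the request body
        return "POST", target, encoded
    # GET: the encoded data replaces the query string of the action URL
    parts = urlsplit(target)
    return "GET", urlunsplit(parts._replace(query=encoded)), None


print(build_form_request(
    "http://api.example.com/orders/42", "/orders", "get",
    {"customer": "Ann Lee"},
))
# ('GET', 'http://api.example.com/orders?customer=Ann+Lee', None)
```

Even this toy version has to decide how relative URLs resolve and what happens to an existing query string, which is exactly the kind of detail I would rather point at than re-specify.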

I’m lazy: I don’t want to define and write down a bunch of rules that solve the same problems that scores of experienced people who came before me have already solved.

Furthermore, due to its ubiquity, EVERYONE AND THEIR BROTHER understands HTML and lots of those people can write valid markup without consulting the HTML5 spec (of course, there are also lots who only think they can write valid markup without looking at the spec!).

While developers may not be used to using HTML to power APIs, they can nonetheless look at an API response and understand what’s going on. This is a huge advantage.

More importantly, HTML is already all over the Web, and there are both human and machine participants consuming it. If I’m starting from an API, then it’s entirely possible that someone from the “human-oriented” Web might link to my API, and presto, they can use it, because:

human + browser = client for my HTML API

Summary

So what this all boils down to is that HTML offers me quite a lot of convenience as a hypermedia-aware, domain-agnostic media type.

I have lots of off-the-shelf tooling, including getting my first client for free (the browser), and from a documentation point of view, between the HTML and HTTP specifications, there’s a whole lot of mechanics I don’t have to discuss.

In fact, if I’m using microdata, I don’t even necessarily need to write much down about the particular application domain, at least from a vocabulary point of view.
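For a sense of what that looks like from the client side, here is a small sketch that pulls microdata properties out of a flat, single-item fragment. The sample markup and the choice of the schema.org Person vocabulary are assumptions for illustration, and this is not a conforming microdata parser: it ignores nested items, itemref, and most of the per-element value rules.

```python
from html.parser import HTMLParser

# Hypothetical fragment; the vocabulary and values are illustrative only.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Ann Lee</span>
  <a itemprop="url" href="/people/ann">profile</a>
</div>
"""


class MicrodataSketch(HTMLParser):
    """Collects itemprop values from a single, non-nested microdata item."""

    def __init__(self):
        super().__init__()
        self.item = {}
        self._prop = None   # property whose text content is being collected

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if a.get("itemtype"):
            self.item["type"] = a["itemtype"]
        prop = a.get("itemprop")
        if prop:
            if tag in ("a", "link") and a.get("href"):
                # For link-like elements the property value is the URL
                self.item[prop] = a["href"]
            else:
                # Otherwise take the element's text content
                self._prop = prop
                self.item[prop] = ""

    def handle_data(self, data):
        if self._prop:
            self.item[self._prop] += data

    def handle_endtag(self, tag):
        # Flat markup only: any closing tag ends the current property
        self._prop = None


parser = MicrodataSketch()
parser.feed(SAMPLE)
print(parser.item)
```

The client keys off the property names defined by the vocabulary, so the API documentation can point at that vocabulary instead of redefining it.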

It might even be sufficient to document an HTML API just by listing out:

  • URL of the entry point(s)

  • link relations used (with pointers to their definitions elsewhere!), along with the important <form> @class values and <input> @name values (I think forms need parameterized link relations to do this a little more formally, but we don’t quite have those yet)

  • pointers to the microdata definitions of importance (again, elsewhere).

That’s not a lot to have to write down.
