Recently, I stumbled over a blog post by Jonathan Channon explaining how he came to realize that hypermedia APIs are not some magical thing but rather a pragmatic approach to reducing coupling between services. I’ve also been touring conferences recently with a talk on Domain-Driven Design and REST that covers hypermedia as well and features what I think is a rather nice example of how effective hypermedia can be when building distributed systems. I’ll come back to the example in a bit, but let’s start with some fundamentals.
REST web services are an especially great fit if you develop a distributed system (client-server being the simplest instance of that) where the individual systems are managed by different parties and have to be deployed independently. That could be a simple mobile app for your business, or it could be a microservice architecture (especially en vogue these days). Independent deployability can even be an important requirement if the same team manages the individual services, as it allows you to roll out new features of one system without having to wait for the others.
The crucial thing here is that you want to avoid, or can’t even afford, having to deploy the interacting systems together. Thus, breaking APIs is basically a no-go for the server, or at least has to be avoided as much as possible. But when do we get into the state of a “broken API”? Whenever the assumptions the client made before are not met by the server anymore. So we need to make sure the client makes as few assumptions about the server as possible. Hypermedia allows you to achieve exactly that, as a lot of the assumptions you might naïvely code into your client can be reduced to the decision “Is a link with the given link relation present or not?”.
Let me give you a concrete example: the REST in Practice book ships with an example called RESTBucks. It basically simulates the ordering experience at a coffee shop and takes an order through some kind of life cycle (payment expected, in preparation etc.). See this diagram (also below) for an overview. Notice how the order can only be canceled in the payment expected state in this version of the diagram (state transition 3).
If you now coded a mobile client naïvely, it would probably look for the status field of the JSON payload, check whether it’s “payment expected” and if so, display a button to trigger the request for canceling.
GET /orders/4711
{
…,
"status" : "payment expected"
}
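As a rough illustration, the naïve client logic described above might look like the following Python sketch. The function names, the `/orders/{id}` URI pattern and the use of `DELETE` for cancellation are assumptions made for illustration, not details taken from the RESTBucks API.

```python
def can_cancel_naive(order: dict) -> bool:
    """Decide whether to show the cancel button by inspecting the payload."""
    # Business rule duplicated from the server: cancelling is assumed to be
    # possible only while the order is waiting for payment.
    return order.get("status") == "payment expected"


def cancel_request_naive(order_id: str) -> dict:
    """Describe the cancellation request the client would issue."""
    # Both the HTTP method and the URI pattern are hard-coded assumptions
    # that the client carries around in its own code.
    return {"method": "DELETE", "uri": f"/orders/{order_id}"}


order = {"status": "payment expected"}
if can_cancel_naive(order):
    request = cancel_request_naive("4711")
```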
The client would have to know about the resource to issue the request to, the HTTP method it’s supposed to use and the payload it has to send to the server. You could even dip your toes into hypermedia and let the client look for a link to find the request target, but hypermedia is not going to free the client from knowing about the request method and payload details. So, what is the problem with that approach of inspecting the representation in such detail?
Assume there are a couple of changed and new requirements coming up:
If the client is implemented as described above, both of these requirements require the client code to be changed. This stems from the fact that the client took detailed server-side knowledge (cancellation is allowed in a certain state, expressed through a certain value in the payload) and duplicated it in its implementation. It implicitly created the very coupling we thought we had avoided by using HTTP and JSON. Bummer! What could we have done instead?
As I also describe in the talk, using hypermedia is basically about reducing critical decisions in the client to the presence of a link with a certain name in a certain context.
We could’ve just implemented our mobile client that way: whenever the representation of an order contains a cancel link, we display the corresponding button and trigger the documented HTTP call to issue the cancellation if it’s pressed.
GET /orders/4711
{
"_links" : {
"cancel" : "…"
},
"status" : "…",
…
}
This fundamentally reduces the knowledge baked into the client to nothing but the presence of the link. That slight change in the way the client is implemented would’ve kept it from breaking due to a change in the API (1, as it would just forward the value of the field to display to the user) and even let it pick up a new feature transparently (2, as it would display the button in a completely different state of the order as well).
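A minimal Python sketch of such a link-driven client could look like this. It assumes the simplified `_links` shape shown above, where each link relation maps directly to a URI; the function names and the returned view model are hypothetical.

```python
from typing import Optional

# The only server-specific knowledge the client keeps: the link relation name.
CANCEL_REL = "cancel"


def cancel_target(order: dict) -> Optional[str]:
    """Return the URI of the cancel link, or None if the server offered none."""
    return order.get("_links", {}).get(CANCEL_REL)


def render_order(order: dict) -> dict:
    """Build a simple view model for the order screen."""
    target = cancel_target(order)
    return {
        # The status is forwarded for display only and never interpreted.
        "status_label": order.get("status"),
        # The button appears if and only if the link is present.
        "show_cancel_button": target is not None,
        "cancel_target": target,
    }
```

With this in place, a changed status value only changes the label the user sees, and an order in any other state shows the cancel button as soon as the server starts including the `cancel` link in its representation.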
We basically traded two different kinds of complexity here. The naïve approach described at the beginning of the post lets the system end up in the left part of the spectrum, as it adds semantics to certain parts of the payload and thereby replicates business logic already implemented on the server. That duplication across system boundaries is what effectively creates the coupling we have to avoid to make sure we can roll out changes to the server easily. That said, this reduction in coupling doesn’t come for free: we paid for it by slightly increasing the protocol knowledge (the client knowing about the media type, how to find links, and the semantics of the link relations).
So we actually make the client dumber and smarter at the same time: dumber about the business rules and smarter about the protocol. That’s a great trade-off as the protocol semantics are much less likely to change in comparison to the business rules, which effectively makes the client less brittle against changes on the server and thus allows the latter to change more significantly without breaking the client.
That said, please check out Jonathan’s article, in which he outlines how important it is that clients really embrace hypermedia to benefit from it, and that it’s not enough for servers to just serve hypermedia-enabled representations.
2016-08-07: Added a paragraph to elaborate on the conceptual differences of the approaches and the architectural consequences on system coupling. Minor wording updates for the previously existing content.