Versioning JSON for APIs

I see this often. Someone comes out with a new protocol. Almost invariably, the first examples look like this:

{
  "version": "1.0",
  "some": ["protocol", "stuff", "..."]
}

I’m sorry to say that my first reaction is 🤦.

It’s that version field. Version fields like that are close to useless for versioning.

This post explains why.

What can you do with a version field?

What you concretely do with a version field is rarely documented well in specifications for new protocols.

The semantic versioning model has the recipient of the document check that it “understands” the version according to semver rules. In those rules, you might have something like this:

Each major version will have dramatically different handling, so any major version you aren’t prepared to handle is an error.
Minor versions can add features, so you might have some minimum value for minor version, but only if there are features in that version you depend on.
Patch versions are rarely signaled in protocols, because they shouldn’t affect compatibility. If patch version information is available, its only real use is to work around bugs in specific implementations.

In general then, the recipient of the version field checks it. If it checks out, they proceed; if it is not supported, they abort.
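To make that concrete, here is a minimal Python sketch of the kind of check a recipient might run. Every name in it (`SUPPORTED_MAJOR`, `MINIMUM_MINOR`, `UnsupportedVersion`, `check_version`) is hypothetical, and it assumes the version is a string like "1.0" as in the example above.

```python
SUPPORTED_MAJOR = 1  # hypothetical: the only major version this recipient handles
MINIMUM_MINOR = 0    # hypothetical: raise this only if you depend on a newer feature


class UnsupportedVersion(Exception):
    """Raised when sender and recipient disagree about the protocol."""


def check_version(document: dict) -> None:
    """Return quietly if the version checks out; raise (abort) otherwise."""
    version = document.get("version")
    if not isinstance(version, str):
        raise UnsupportedVersion("missing or malformed version field")
    try:
        # Only major and minor matter here; a patch component is ignored.
        major, minor = (int(part) for part in version.split(".")[:2])
    except ValueError:
        raise UnsupportedVersion(f"cannot parse version {version!r}") from None
    if major != SUPPORTED_MAJOR:
        raise UnsupportedVersion(f"major version {major} is not supported")
    if minor < MINIMUM_MINOR:
        raise UnsupportedVersion(f"minor version {minor} is too old")
```

Notice that the function has exactly two outcomes: carry on, or raise. There is no third branch that does something helpful.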

Aborting is safe, but not useful.

Disagree and abort

Fundamentally, these sorts of version checks are a safeguard against a disagreement about what protocol you are talking.

A disagreement about the protocol is quite bad. Any disagreement is highly likely to lead to bugs. There’s a good chance those are going to be security-relevant bugs.

If you are especially unlucky, things will continue to work for a lot of people. That might hide the problem for a while.

If you are managing the evolution of a protocol, having a peer abort when there is a disagreement about which protocol is in use is sort of the minimum-viable protection.

The possibility of an abort is all that a version field like this can deliver. It does not help you evolve the protocol.

Aborting is not a migration plan

Fast-forward to the point where you have a new version to roll out. Version 2.0 adds a bunch of shiny, new, and incompatible features.

So you tell your server to start talking the new version:

{
  "version": "2.0",
  "some": {"new": "protocol", "stuff": ["..."]}
}

Congratulations, you avoided the bugs and security nightmare. Also, all your existing clients have stopped working.

That’s not a migration plan, that’s a plan for future headaches.

The only way to avoid those headaches is to design a migration strategy into the initial version. That means having a way to get off that initial version onto the next version.

Incremental additions don’t need versions

So the first step is acknowledging that – especially with JSON – you probably already have a simple way to add features.

One of the greatest JSON features was never formalized. It is the ability to add to objects/structs/dictionaries. It’s rarely written down, but the fact that most software ignores anything it doesn’t understand is amazingly powerful^[1].

This is the best and easiest way to evolve a JSON format. This doesn’t require any version signaling. You don’t need to update the minor version number for new features, you just add the new things you need. As long as old software that ignores the new stuff continues to work, you can add as much as you like.

That approach is exactly like minor versioning, except that you don’t need to signal the minor version. It also lets implementations use the presence or absence of new members, rather than the version number, to decide if things are OK, which can lead to things working more often than otherwise.
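Here is a small Python sketch of what "use the presence or absence of new members" can look like in practice. The `priority` member and the `describe` function are made up for this example; the only thing borrowed from the format above is the `some` member.

```python
import json


def describe(raw: str) -> str:
    """Summarize a message, using a newer member only if it is present."""
    message = json.loads(raw)
    # Read only the members this code knows about. Anything else in the
    # object is never looked at, which is how unknown members get ignored.
    summary = f"{len(message['some'])} items"
    # "priority" is a hypothetical member added after the first release.
    # Its presence is the signal; no version number is consulted.
    if "priority" in message:
        summary += f", priority {message['priority']}"
    return summary
```

Old senders that never heard of `priority` keep working with this code, and old recipients that never look for it keep working with new senders.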

Signaling of a minor version number therefore becomes utterly pointless.

Big changes don’t need versions either

Major changes that might need to be rejected by old software are best avoided. Still, there can come a time when incremental feature additions have stretched the format – and the code that handles it – so much that you need a clean break.

At that point, you might consider leaving older software behind and starting over with a completely redesigned format.

A version field inside the format can stop the old software from trying to use your new stuff. Well, that assumes that old software bothers to check that version field that was lying around unused for years. Some won’t and that will be fun.

Either way, the best case for that is a future where you come up with increasingly complex methods for managing how many interactions abort.

A better approach is pretty straightforward: make the version switch at a higher layer. In many cases, the ability to switch is already part of the systems you are using.

Higher-level switches work

In a lot of cases, putting the new format at a new URL is the best option. It’s easy, cheap, and gives you a bunch of really interesting options for managing the evolution of implementations and deployments.

If new clients can be configured with the new URL, that’s going to be much easier for all involved.

In cases where the location of endpoints is part of protocols, a new field can be added to include alternative URLs. For example:

{
  "url": "https://example.com/the/old/location",
  "urlv2": "https://example.com/the/new/location",
  "...": {}
}

This moves the version migration problem so that it uses a well-established method for adding features. What was a breaking change for the format is now a minor feature addition in a different part of the system. The hard problem has transformed into an easy one.
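From the client side, that might look something like the following Python sketch. The `pick_endpoint` function and its `supports_v2` flag are hypothetical; `url` and `urlv2` are the members from the example above.

```python
def pick_endpoint(description: dict, supports_v2: bool) -> tuple[str, str]:
    """Return a (format, url) pair for the endpoint this client should use."""
    # A new client prefers the new location, but only if it was advertised.
    if supports_v2 and "urlv2" in description:
        return ("v2", description["urlv2"])
    # Old clients never look at "urlv2"; new clients fall back to the old
    # location when talking to a deployment that has not been upgraded.
    return ("v1", description["url"])
```

Every combination of old and new client with old and new deployment ends up somewhere that works, and nothing has to abort.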

Prefer extension points that are already in use for other purposes. Making use of fewer, well-tested extension points is a major lesson of RFC 9170.

In HTTP

Just for completeness, here are some ways you can do higher-level switching with HTTP. After all, a lot of these cases involve HTTP at some level.

The high-level switching pattern can be used in HTTP header fields:

My-App: "https://example.com/the/old/location"
My-App-v2: "https://example.com/the/new/location"

Or maybe:

My-App: url="...old/location", url-v2="...new/location"

The same applies to anywhere you can make that switch, but it especially applies to places where you have easy and well-used extension points already.

HTTP also offers content negotiation, which has something of an uneven recognition by practitioners. Still, it can be a place to use that higher-level switching practice.

The advantage of content negotiation is that you can use the same URL as before.

To use content negotiation, your format is given a media type. Your new format is given a new and different media type. The HTTP Accept header field is populated by clients and the server chooses the format it prefers from that set. The choice of format is conveyed using Content-Type in the response.
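As a rough illustration, here is a Python sketch of the server-side choice. The two media type names are invented for the example, and so are the function names. It also bakes in two assumptions of mine: a request with no Accept header gets the old format, and wildcards such as `*/*` count as a request for the old format, so that clients that predate the new format keep working. A real server framework may well do this part for you.

```python
from typing import Dict, Optional

# Hypothetical media types for the old and new formats.
OLD = "application/example+json"
NEW = "application/example-v2+json"


def parse_accept(header: str) -> Dict[str, float]:
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    ranges: Dict[str, float] = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable q-value as "not acceptable"
        ranges[media_range] = q
    return ranges


def old_quality(ranges: Dict[str, float]) -> float:
    """Quality for the old format; the most specific matching range wins."""
    for candidate in (OLD, "application/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate]
    return 0.0


def choose_format(accept_header: Optional[str]) -> Optional[str]:
    """Return the media type for Content-Type, or None if nothing is acceptable."""
    if not accept_header:
        return OLD
    ranges = parse_accept(accept_header)
    new_q = ranges.get(NEW, 0.0)  # only an explicit mention opts in to NEW
    old_q = old_quality(ranges)
    if new_q > 0.0 and new_q >= old_q:
        return NEW
    if old_q > 0.0:
        return OLD
    return None
```

A new client would send something like `Accept: application/example-v2+json, application/example+json;q=0.5` and get the new format from an upgraded server. A `None` result is where a server would typically answer 406 (Not Acceptable).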

Requests can also use content negotiation, though this costs a round trip if you guess wrong. Switching URLs is a better way to manage migrating the formats that clients produce.

Don’t be tempted to put a version attribute on the media type; just define an entirely new one. It’s far easier that way. Content negotiation works best by selecting from a list; attributes require special handling that won’t be automatically managed by servers. Also, attributes are often stripped or ignored, so they tend to fail exactly when you need them.

Fun times with IPv6

The IPv6 migration is a great object lesson here. IP uses an in-band version indicator: the first four bits of the IP packet.

The hope during IPv6 development was that this version indication would be enough. Routers would drop IPv6 packets until they were taught IPv6.

In practice, that failed. Ethernet now has a distinct code (or EtherType) for IPv6^[2]. That transformed a hard migration – teaching routers not to choke on IPv6 packets – into one they already managed gracefully.
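For the curious, the two signals can be compared in a few lines of Python. This is only a sketch: it assumes an untagged Ethernet II frame (no VLAN tag), and the function names are mine. The EtherType values are the registered ones, 0x0800 for IPv4 and 0x86DD for IPv6.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def ip_version_in_band(packet: bytes) -> int:
    """Read the in-band version: the top four bits of the first byte."""
    return packet[0] >> 4


def dispatch_by_ethertype(frame: bytes) -> str:
    """Name the handler for a frame based on its EtherType alone."""
    # In an untagged Ethernet II frame, two 6-byte addresses come first,
    # so the EtherType occupies bytes 12 and 13.
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype == ETHERTYPE_IPV4:
        return "ipv4"
    if ethertype == ETHERTYPE_IPV6:
        return "ipv6"
    # Unknown EtherTypes are simply not handed to the IP code at all.
    return "ignore"
```

With the second function, an IPv4-only stack never parses an IPv6 packet in the first place, so it never has to look at those four bits.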

[1] The web platform design principles do say something. We recently updated this language: “Dictionaries, because of how they are treated by user agents, are also relatively future-proof. Dictionary members that are not understood by an implementation are ignored. New members therefore can be added without breaking older code.”
[2] I realize that I’m potentially getting higher and lower confused when talking about higher layers conceptually or in networking stacks.