Meaningful data interchange in JSON

ECMA-404, the official JSON standard, keeps repeating the following point:

The JSON syntax is not a specification of a complete data interchange. Meaningful data interchange requires agreement between a producer and consumer on the semantics attached to a particular use of the JSON syntax.

In practice, the meaning of data encoded and exchanged via JSON has to be specified somehow, so that the data can be interpreted according to that meaning. That involves specifying data types with a particular shape and constraints, which can then be used when validating, deserializing, or otherwise mapping the JSON data for further processing.

Such a specification can be expressed less formally in documentation, or more formally in application code or in a schema.

To illustrate the point, take the following piece of JSON data:

{
  "id": "10765432100123456789",
  "validity": {
    "from": "2021-12-26",
    "thru": "2022-12-26"
  },
  "emails": [
    "one@example.com",
    "zwei@example.com",
    "three@example.com"
  ]
}

It might be accompanied by a specification which defines the following semantics and constraints:

We haven’t defined the semantics for the entire object itself, so we don’t know what it means. This illustrates the point in a negative way: with no defined semantics or constraints, we can only guess what the object represents, whether the properties included are mandatory, and whether there are any unspecified optional properties. Just encoding the data in syntax is not enough for successful interchange.
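
As a rough illustration of what such an agreement could look like in application code, the sketch below checks property names against an agreed set. The split into required and optional properties is purely an assumption made for this example (nothing above states it), and `check_properties`, `REQUIRED`, and `OPTIONAL` are hypothetical names.

```python
import json

# Hypothetical agreement between producer and consumer (an assumption for
# illustration only): which properties must be present and which may be.
REQUIRED = {"id", "validity"}
OPTIONAL = {"emails"}


def check_properties(document: str) -> dict:
    """Parse a JSON object and enforce the agreed set of property names."""
    data = json.loads(document)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    present = set(data)
    missing = REQUIRED - present
    unknown = present - REQUIRED - OPTIONAL

    if missing:
        raise ValueError(f"missing required properties: {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    return data
```

Without an agreement of this kind, a consumer cannot tell whether a missing "emails" property is an error or simply an omitted optional value.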

Another thing we see here is that we used JSON strings to represent data types which are not built into JSON, such as dates (“from”, “thru”) and e-mail addresses (“emails”).
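
A consumer then has to turn those strings back into the intended types. The following is a minimal sketch of that step, assuming the dates are ISO 8601 calendar dates and that "from" must not be later than "thru"; both are assumptions for this example, as are the helper names `parse_validity` and `check_email`. The e-mail check is deliberately loose and is not a full address validator.

```python
from datetime import date


def parse_validity(validity: dict) -> tuple[date, date]:
    """Map the "from"/"thru" strings to date objects (ISO 8601 assumed)."""
    start = date.fromisoformat(validity["from"])
    end = date.fromisoformat(validity["thru"])
    # Assumed constraint: the period must not end before it starts.
    if start > end:
        raise ValueError("'from' must not be later than 'thru'")
    return start, end


def check_email(value: str) -> str:
    """Loose plausibility check: exactly one '@' with text on both sides."""
    local, sep, domain = value.partition("@")
    if not (sep and local and domain and "@" not in domain):
        raise ValueError(f"not a plausible e-mail address: {value!r}")
    return value
```

A string such as "26/12/2021" is perfectly valid JSON, yet `date.fromisoformat` rejects it, which is exactly the kind of mismatch an agreed specification is meant to surface.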

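We also chose to represent a 64-bit integer (“id”) with a string rather than a JSON number, because JSON numbers can cause interoperability problems. Many JSON implementations read every number as an IEEE 754 double-precision value, which represents integers exactly only up to 2^53, so an integer of this magnitude could be silently altered on the way from producer to consumer.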
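
The effect can be reproduced with a short experiment. Python's own `json` module keeps integers exact, so this sketch passes `parse_int=float` to imitate a parser that stores every number as a double; that imitation is an assumption of the example, not a statement about any particular implementation.

```python
import json

ID = 10765432100123456789  # the "id" value from the example above

# Imitate a parser that reads every JSON number as an IEEE 754 double.
as_number = json.loads('{"id": 10765432100123456789}', parse_int=float)["id"]

# The string form passes through any JSON parser unchanged.
as_string = json.loads('{"id": "10765432100123456789"}')["id"]

number_survived = int(as_number) == ID
string_survived = int(as_string) == ID

print(number_survived, string_survived)  # prints: False True
```

The value sent as a number comes back as a nearby but different integer, while the value sent as a string can be converted to an exact integer by the consumer.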


© 2022 Darius J Chuck