Restrictions of JSON numbers

As described previously, there are some practical interoperability issues associated with JSON numbers which can lead to serious bugs ([1], [2], [3]).

One of these issues, the inability to express Infinity and NaN, stems from the extreme restrictiveness of the JSON number grammar.

Let’s see which other common number notations that grammar excludes, and how to work around the restriction.

JSON number definition

ECMA-404 defines a number strictly as:

a sequence of decimal digits with no superfluous leading zero

An equivalent definition is provided by RFC 8259:

A number is represented in base 10 using decimal digits.

This definition is, however, preceded by the following statement:

The representation of numbers is similar to that used in most programming languages.

which is somewhat similar to the description used at JSON.org:

A number is very much like a C or Java number, except that the octal and hexadecimal formats are not used.

Framing the definition like this is akin to saying that the set of sequences of lowercase letters is very much like the set of C or Java identifiers, except that it does not include a significant subset of identifiers.
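To make the restriction concrete, here is a minimal sketch that transcribes the RFC 8259 number grammar into a regular expression and tries it against a few spellings. The names `JSON_NUMBER` and `is_json_number` are made up for this illustration, and Python is used only as a convenient neutral language.

```python
import re

# RFC 8259 number grammar as a regular expression: optional minus,
# integer part without superfluous leading zeros, optional fraction,
# optional exponent.
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(text: str) -> bool:
    """Return True if the whole string is a valid JSON number token."""
    return JSON_NUMBER.fullmatch(text) is not None


candidates = ["0", "-12.5e+3", "007", "+1", ".5", "1.", "0x1F", "1_000", "Infinity", "NaN"]
for text in candidates:
    print(f"{text!r:12} {is_json_number(text)}")
```

Only the first two candidates are accepted. Everything else is rejected, even though most of it is a perfectly ordinary number spelling in C, Java, or JavaScript.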

Let’s examine some common numeric values and formats used in programming languages like C, Java, or JavaScript which are impossible to represent in JSON.

Infinity and NaN

Flawed or not, IEEE 754 is currently the most interoperable standard for floating-point numbers.

Most programming languages, including C and Java, support it.

JSON numbers, however, are explicitly incompatible with it: they exclude any representation of the standard special values Infinity and NaN.
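The gap is easy to observe in practice. As a sketch, Python's standard `json` module emits the non-standard tokens `Infinity` and `NaN` by default and only refuses them when asked to with `allow_nan=False`. Other libraries behave differently, which is exactly the interoperability problem. The helper name `dumps_strict` is an assumption of this example.

```python
import json
import math

# By default Python's json module writes tokens that are not valid JSON.
print(json.dumps(math.inf))  # Infinity
print(json.dumps(math.nan))  # NaN


def dumps_strict(value) -> str:
    """Serialize to standards-compliant JSON, rejecting Infinity and NaN."""
    return json.dumps(value, allow_nan=False)


try:
    dumps_strict({"limit": math.inf})
except ValueError as error:
    print("rejected:", error)
```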

Nondecimal number formats

Most programming languages have syntax for nondecimal numbers, typically some subset of hexadecimal, binary, and octal. JavaScript, for example, supports all three.

Programming languages support nondecimal bases because in certain contexts numbers are more human-readable that way.

JSON explicitly excludes them.
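A short sketch of the contrast, using Python literals as a stand-in for the prefixed notations found in most languages. The helper `parses_as_json` is hypothetical and simply reports whether Python's `json` parser accepts a document.

```python
import json

# The same value written in the bases most languages accept as literals.
assert 0xFF == 0b11111111 == 0o377 == 255


def parses_as_json(text: str) -> bool:
    """Return True if the text is accepted as a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json("255"))         # True
print(parses_as_json("0xFF"))        # False
print(parses_as_json("0b11111111"))  # False
print(parses_as_json("0o377"))       # False
```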

Numeric separators

Numeric separator syntax is an old readability aid that more and more languages are adopting.

Java has supported it since 2011 (Java SE 7). JavaScript introduced it in ES2021. It is even included in an upcoming C standard revision.

JSON, being derived from JavaScript’s syntax as it was around the year 2000, naturally does not include this facility.
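Python is another language with this facility (underscores in numeric literals since version 3.6), so it can illustrate the mismatch. The field name `budget` below is invented for the example.

```python
import json

# Underscores are accepted as digit separators in a Python numeric literal.
budget = 1_000_000
assert budget == 1000000

# The JSON grammar has no separator, so the same spelling is rejected.
try:
    json.loads('{"budget": 1_000_000}')
    accepted = True
except json.JSONDecodeError:
    accepted = False

print(accepted)  # False
```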

Solution

What should we do, then, if we need any of the above in JSON?

The answer is always the same for every data type and format not built into JSON: use strings, and tell your interchange parties how to parse them correctly.
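As a minimal sketch of that approach, assume the parties have agreed that certain fields carry numbers as strings written in a documented literal syntax. The field names, the two field sets, and the `decode` function are all hypothetical. The literal syntax chosen here is simply what Python's `int` and `float` constructors accept.

```python
import json
import math

# Hypothetical convention shared by both parties: these fields are
# strings holding numbers in an agreed literal syntax.
STRING_ENCODED_INT_FIELDS = {"mask", "budget"}
STRING_ENCODED_FLOAT_FIELDS = {"limit"}


def decode(document: str) -> dict:
    """Parse a JSON object and convert the string-encoded numeric fields."""
    data = json.loads(document)
    for key in data.keys() & STRING_ENCODED_INT_FIELDS:
        # Base 0 honours 0x/0o/0b prefixes and underscore separators.
        data[key] = int(data[key], 0)
    for key in data.keys() & STRING_ENCODED_FLOAT_FIELDS:
        # float() accepts "Infinity", "-Infinity" and "NaN".
        data[key] = float(data[key])
    return data


message = '{"mask": "0xFF", "budget": "1_000_000", "limit": "Infinity", "count": 3}'
decoded = decode(message)
print(decoded)
```

The message itself is plain, standards-compliant JSON, so any parser can read it. Only the agreed fields need the extra conversion step, and a receiver in another language would have to implement the same convention with its own parsing functions.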

Appendix: number grammars vs Infinity and NaN

The lack of Infinity and NaN in JSON is sometimes justified by observing that these values are not part of the JavaScript number grammar and/or that they can be altered. This is not a valid justification.

If it were valid, then by the same logic JSON should forbid the leading minus sign, effectively forbidding negative numbers. Why? Because the JavaScript number grammar does not include the minus sign outside of the ExponentPart. Instead, it’s an operator, defined separately.

From this point of view, the JSON number grammar is not a subset of the JavaScript number grammar. Additionally, the result of the operator’s application can be altered by a custom implementation of the valueOf method.

The fact that JavaScript allows doing dangerous things like these is an issue with JavaScript, and should not be relevant from the perspective of JSON which is supposed to be a language-independent syntax.

In fact, Infinity and NaN are absent from the number grammar not only in JavaScript but in other programming languages as well, in particular C and Java. Instead, these special values are accessed via a variable or a macro. This is, however, a technicality that is irrelevant from the point of view of syntax, which would remain identical if the variable or macro identifiers were made part of the number grammar.
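The same two observations can be checked in yet another language. The sketch below uses Python's `ast` module as an analogy only. It says nothing about the JavaScript specification itself, but it shows that treating the minus sign as an operator and infinity as a named value is a common design.

```python
import ast
import math

# "-1" is not a single numeric literal: it parses as the unary minus
# operator applied to the literal 1.
node = ast.parse("-1", mode="eval").body
print(type(node).__name__, type(node.op).__name__, node.operand.value)

# Infinity is reached through a name (an attribute lookup), not a literal.
inf_node = ast.parse("math.inf", mode="eval").body
print(type(inf_node).__name__, math.isinf(math.inf))
```

This prints `UnaryOp USub 1` and `Attribute True`.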


© 2022 Darius J Chuck