In most programming languages, the comparison operators (<, <=, >, >=, ==) are defined to return false when one of the operands is NaN. In my opinion, this is a poor choice.

It breaks all kinds of fundamental properties that we expect from comparison operators. For example, we expect x == x to be true for all x, but this is not the case when x is NaN. We also expect x < y to be true if y <= x is false, and vice versa, but this is not the case when x or y is NaN.
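A minimal Python sketch makes both broken properties concrete. The variable names are only illustrative, and `float("nan")` is used to obtain a NaN:

```python
nan = float("nan")

# Reflexivity: we expect x == x for every x.
reflexive = (nan == nan)            # False for NaN

# Trichotomy: if y <= x is false, we expect x < y to be true.
x, y = nan, 1.0
y_le_x = (y <= x)                   # False
x_lt_y = (x < y)                    # also False, so neither relation holds

print(reflexive, y_le_x, x_lt_y)
```

With a NaN operand, all three comparisons come out false at once, which no ordinary ordering allows.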

You can’t test for NaN using ==. This is only a minor problem and can be solved by using isNaN instead, which may be a good idea anyway if you can have multiple different kinds of NaN values (we’ll get to that later). Still, it’s confusing for beginners.
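As a small illustration in Python, where the standard library's `math.isnan` plays the role of isNaN:

```python
import math

x = float("nan")

equals_nan = (x == float("nan"))    # False: == can never detect NaN
is_nan = math.isnan(x)              # True: the dedicated test works
differs_from_itself = (x != x)      # True: only NaN is unequal to itself

print(equals_nan, is_nan, differs_from_itself)
```

The `x != x` idiom works precisely because of the behaviour criticised here, which is part of why it confuses newcomers.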

Sorting arrays containing NaN values is broken. If you sort an array containing NaN values, the result depends on the internals of the sorting algorithm, and even the non-NaN values can end up in strange positions, e.g. in Python:

>>> from math import nan
>>> sorted([6, 0, nan, 1, 2, 0, 9, 2])
[0, 0, 1, 2, 6, nan, 2, 9]


In Haskell, if you make a set containing NaN, then you can’t test for membership of the NaN value or delete it from the set. Furthermore, you can have multiple duplicate NaN values in your set, even though the point of a set is to have unique elements.
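The same symptoms can be reproduced with Python's built-in `set`. This is only an analogy: Python sets are hash-based, while Haskell's sets rely on ordering, so the mechanism differs even though the outcome is similar. The sketch assumes that each call to `float("nan")` yields a separate NaN object:

```python
nan_a = float("nan")
nan_b = float("nan")                # a second, distinct NaN object

s = {nan_a, nan_b}
duplicates_kept = len(s)            # both NaNs are stored

# Lookup and removal with a fresh NaN fail, because NaN != NaN.
found_fresh_nan = float("nan") in s
s.discard(float("nan"))             # removes nothing
size_after_discard = len(s)

# Python checks object identity before equality, so the very same
# object is still found; this is a Python-specific shortcut.
found_same_object = nan_a in s

print(duplicates_kept, found_fresh_nan, size_after_discard, found_same_object)
```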

NaN is not the only problem in this regard: +0.0 == -0.0 is true, but 1.0 / (+0.0) == 1.0 / (-0.0) is false, because the two quotients are +Infinity and -Infinity. This creates a similar problem: if you build the set {+0.0, -0.0}, then one of the zeroes gets removed, but you can’t tell which one. Furthermore, if you map the function f(x) = 1.0 / x over such a set, the resulting set contains fewer elements than {f(+0.0), f(-0.0)}, which makes no sense.
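Here is a rough Python sketch of the signed-zero problem. Python raises `ZeroDivisionError` for float division by zero, so the hypothetical helper `reciprocal` below emulates the IEEE 754 result by hand:

```python
import math

def reciprocal(x: float) -> float:
    """Hypothetical helper: 1.0 / x with IEEE 754 behaviour at zero."""
    if x == 0.0:                          # true for both +0.0 and -0.0
        return math.copysign(math.inf, x) # infinity carrying the zero's sign
    return 1.0 / x

zeros = {0.0, -0.0}                       # the two zeros compare equal: one survives
mapped_from_set = {reciprocal(x) for x in zeros}
mapped_directly = {reciprocal(0.0), reciprocal(-0.0)}

print(len(zeros), len(mapped_from_set), len(mapped_directly))
```

Mapping over the set yields one element, while applying the function to both zeros first yields two.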

## What should we do instead?

The IEEE 754 standard also defines a total order on floating point numbers (the totalOrder predicate). Programming languages should use this order for their comparison operators, and bitwise equality for their equality operator. This gets rid of all the problems mentioned above. For 32-bit floats, the total order from smallest to largest is:

| Bit pattern | Meaning |
| --- | --- |
| `1 11111111 1yyyyyyyyyyyyyyyyyyyyyy` | Negative quiet NaN |
| `1 11111111 0yyyyyyyyyyyyyyyyyyyyyy` | Negative signaling NaN |
| `1 11111111 00000000000000000000000` | -Infinity |
| `1 xxxxxxxx yyyyyyyyyyyyyyyyyyyyyyy` | Negative number |
| `1 00000000 yyyyyyyyyyyyyyyyyyyyyyy` | Negative denormal |
| `1 00000000 00000000000000000000000` | -0 |
| `0 00000000 00000000000000000000000` | +0 |
| `0 00000000 yyyyyyyyyyyyyyyyyyyyyyy` | Positive denormal |
| `0 xxxxxxxx yyyyyyyyyyyyyyyyyyyyyyy` | Positive number |
| `0 11111111 00000000000000000000000` | +Infinity |
| `0 11111111 0yyyyyyyyyyyyyyyyyyyyyy` | Positive signaling NaN |
| `0 11111111 1yyyyyyyyyyyyyyyyyyyyyy` | Positive quiet NaN |
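One way to get this order in practice is to turn each float's bit pattern into an integer that sorts the same way as the table above. The sketch below is an illustration, not a reference implementation. The names `total_order_key` and `total_eq` are made up for this example. It uses Python's 64-bit floats, which have the same layout as the 32-bit patterns shown, only with wider exponent and mantissa fields:

```python
import math
import struct

def total_order_key(x: float) -> int:
    """Map a float to an integer that sorts in the table's total order."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", float(x)))
    if bits >> 63:
        # Sign bit set: flip every bit, so larger magnitudes sort first.
        return bits ^ 0xFFFF_FFFF_FFFF_FFFF
    # Sign bit clear: set the top bit, placing the value above all negatives.
    return bits | (1 << 63)

def total_eq(a: float, b: float) -> bool:
    """Bitwise equality: identical bit patterns, nothing else."""
    return total_order_key(a) == total_order_key(b)

data = [6, 0, math.nan, 1, 2, 0, 9, 2]
result = sorted(data, key=total_order_key)
print(result)
```

Sorted with this key, the non-NaN values always come out in ascending order, and each NaN lands at the end determined by its sign bit. Under `total_eq`, a NaN equals itself and +0.0 differs from -0.0.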

It is important to remember that operations on floats generally return the best possible result given the constraints of the floating point format. For example, 1.0 / 3.0 returns the best possible approximation of 1/3 that can be represented as a float.
And sure, 3/10 can’t be exactly represented as a float, just like 1/3 can’t be exactly represented as a finite-length decimal. That’s just a consequence of floats being based on binary.
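Both claims can be checked with exact rational arithmetic. This is a small sketch that assumes Python 3.9 or later for `math.nextafter`; it compares the computed quotient against its two neighbouring floats:

```python
import math
from fractions import Fraction

q = 1.0 / 3.0
third = Fraction(1, 3)

# Exact distance from 1/3 for q and for the floats just below and above it.
error = abs(Fraction(q) - third)
error_below = abs(Fraction(math.nextafter(q, 0.0)) - third)
error_above = abs(Fraction(math.nextafter(q, 1.0)) - third)
is_nearest = error < error_below and error < error_above

# 3/10 has no finite binary expansion, so the float 0.3 is not exactly 3/10.
three_tenths_exact = (Fraction(0.3) == Fraction(3, 10))

print(is_nearest, three_tenths_exact)
```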