John Fremlin's blog: Nothing was a billion dollar mistake

Posted 2018-01-19 23:00:00 GMT

Tony Hoare, inventor of quicksort and many other foundational computer science concepts, called the null reference, which he invented in 1965, his billion-dollar mistake. A special value, typically zero, is used to indicate that a reference points nowhere. This is very efficient. But it's not a mistake. The mistake is to try to ban missing values. They often really are missing!

Java programmers particularly are afflicted by the NullPointerException, with millions of hits for this on Google. They abbreviate it to NPE, and bemoan it. They'll actually check whether they would throw it and then throw another exception instead — though Java code that distinguishes exception types is rare.

At a superficial level, the null pointer exception is the result of a mismatch of expectations: someone wrote some code that wanted a value, and someone called that code without giving it the value. Without looking deeper, an immediate obvious response is to systematically try to annotate or check that the values are really there when they are required.

The default behaviour in Objective-C or SQL, where null (or nil) just combines to form another null, usually makes more sense for the output of the program, especially for programs which should not crash. Still, missing values disrupt the normal computation, and thinking harder about what to do is often worth it.
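To make "null combines to form another null" concrete, here is a minimal C++ sketch of SQL-style propagation. It uses std::optional purely as a convenient stand-in for a nullable value, and the names add_nullable and example_propagation are made up for illustration; neither Objective-C nor SQL works through this exact mechanism.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: adds two possibly-missing integers the way SQL treats
// NULL in arithmetic. If either input is missing, the result is missing too.
std::optional<int> add_nullable(std::optional<int> a, std::optional<int> b) {
    if (!a || !b) {
        return std::nullopt;  // missing in, missing out: no exception, no crash
    }
    return *a + *b;
}

// Usage sketch: a known total when both parts are known, a missing one otherwise.
void example_propagation() {
    assert(add_nullable(2, 3) == 5);
    assert(add_nullable(std::nullopt, 3) == std::nullopt);
}
```

The caller still has to decide what a missing total means for the output, which is the hard thinking that remains either way.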

There have been countless quixotic attempts to avoid this hard thinking: to ban null, to make complicated ways to declare that things are not nullable, and define away nullness. There is a pedantic tendency to try to harness punctuation elements like ? or define Option types — to try to force people to spend effort to admit that a value could be missing.

This misses the point about missing values: the hard problem isn't when someone is deliberately, maliciously trying to withhold information from the code they're calling. Instead, it's when they just don't have the information and are passing down what they have. The systematic response should generally not be to insist and demand the information. That would be convenient for one party in the exchange, the programmer, who can now write the code without figuring out what to do when there is incomplete information.

Unfortunately, incomplete information is the default state of the world. If the programmer won't deal with it, then the poor users of the program will have to, by entering bogus values. It's better to unambiguously indicate that something is missing rather than demand a lie!

Thousands of immigrants to the US have an invented first name, Fnu, which is very inconvenient for them, because systems are set up to discourage a missing name. This is a bureaucratic consequence of a non-null annotation. Fnu is an acronym for First Name Unknown and, in a bizarre twist of fate, it is given to people who only have a first name as their first name, so that the last name is not null.

In the real world, incomplete information is the default. A convenient unambiguous representation is the null pointer, which doesn't take up space in terms of its binary representation or syntax, unlike the verbose imposition of Option types such as Maybe in Haskell.

Incomplete information is the default with computers too. We need to learn to accept that. Recent language standardizations try to pretend it's not the case, forcing painful complexity on users.

The C++ variant, a type-safe union which was proposed in the last few years, misses this lesson. In an opinionated attempt to ban null, a variant has no empty value by default. But a variant can still end up empty, a state reported by valueless_by_exception(), and so code has to deal with it anyway. This paradox afflicts non-nullable class fields in Kotlin too, which can be null before they're initialized. Requiring a null-check on something marked non-nullable is just silly.
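The empty variant state is easy to reach in practice. The sketch below is one way to provoke it, assuming a C++17 compiler; Fragile and the two function names are made up for illustration. Note that the standard permits, but does not require, the variant to become valueless when an emplace throws, so the result depends on the standard library; the std::string member is there because some implementations avoid the empty state for trivially copyable types.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical alternative type whose constructor always fails. The
// std::string member keeps it from being trivially copyable, so the library
// has to construct it in place rather than building it off to the side.
struct Fragile {
    std::string payload;
    explicit Fragile(int) { throw std::runtime_error("construction failed"); }
};

// Returns true if the variant is left holding nothing after a failed emplace.
bool becomes_valueless_after_failed_emplace() {
    std::variant<int, Fragile> v = 42;  // starts out holding an int
    try {
        v.emplace<Fragile>(0);  // typically destroys the int first, then the constructor throws
    } catch (const std::runtime_error&) {
        // The old value is gone and the new one was never built.
    }
    return v.valueless_by_exception();
}

// Usage sketch: callers end up checking for the empty state after all.
void example_valueless() {
    assert(becomes_valueless_after_failed_emplace());
}
```

Calling std::visit on a variant in this state throws std::bad_variant_access, so robust code has to check for the very emptiness the type was designed to rule out.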

Let's accept that information is incomplete, and be kinder when that's the case!
