Nov
04
2022
--

PostgreSQL: Are All NULLs the Same?

PostgreSQL Are All NULLs the Same

I was recently reading the PostgreSQL 15 release notes and stopped at “Allow unique constraints and indexes to treat NULL values as not distinct”. At first glance, it seems to be a minor change, but this feature may change many designs and make many lives a lot easier.

The first thing here is to understand what is and how a NULL works. Many authors describe NULL as how missing or unknown values are represented, and some authors dedicate a good amount of time to explaining and describing NULL, like Date and Darwen’s “A Guide to the SQL Standard“. Also, checking Wikipedia we see that “In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database”. 

Okay, NULL indicates that there is an unknown value in a column, but how does it work? Can we operate with NULLs? Well, we can operate with NULLs but we’ll probably have an unexpected result. For example, arithmetic operations involving NULL will return… NULL! If we have a SQL operation that adds 10 to NULL the result will be NULL. If we do “NULL – NULL” the result will be NULL, not zero! Also, a boolean comparison between two values involving NULL may not return TRUE or FALSE. What about comparisons, are there NULLs larger than others? Are there NULLs equal to others? See how tricky this NULL thing can get?

This is why that change caught my attention. If we compare NULLs they will be in essence all different, and it’s because we don’t know what they are. If I compare an unknown value with another unknown value, even though the result is unknown, in the database world they will be different, making NULLs unique by nature. This is how most of the databases operate (if not all of them). This brings a problem to many applications that need to operate on NULLs, especially the ones that try to have UNIQUE constraints using NULL columns. Because ALL NULLs are different, there will never be a collision and the UNIQUE constraint will be bogus. Many application designers have used the ZERO or EMPTY workaround, putting a zero or an empty value ‘’ to represent the absence of value .” Remember, empty isn’t NULL!

This feature changes the behavior here. It now allows NULLs to be treated equally for the sake of distinct comparison using “UNIQUE NULLS NOT DISTINCT”. No more workarounds are needed! But be careful as it’s still an unknown value and the operations are kept the same. For a much better explanation of how NULLs work, check the book mentioned above. You may see databases from another perspective after reading it!

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com