The Perils of Multi-valued logic

Perils of NULL usage:

I know there is a religious war about the use of NULLs and I don’t want to go down that path. But about 10 years ago I worked for another one of Bill Gate’s companies and I had to prepare a sales report for him from our database. To cut to the chase – I almost sent a report that was missing a large amount of the sales because I forgot about nulls. And I have remembered it ever since.

And that is the point I want to make. When you allow nulls in your database design, you run the risk of developers or report writers forgetting about them and producing and making business decisions on incorrect data. I don’t consider this to be an acceptable risk for the most part – as confidence in your data is critical.

And we just don’t think about them. For example:

_________________________________________________________

declare @nvar1 nchar(1)

declare @nvar2 nchar(1)

set @nvar1 = 'A'

set @nvar2 = 'B'

select @nvar1 + @nvar2

_________________________________________________________

We know this should return ‘AB’

But what does it return if we do this:

_________________________________________________________

set @nvar2 = NULL

_________________________________________________________

The answer is: it depends. 

It depends on the CONCAT_NULL_YIELDS_NULL setting. If this setting is set ON for the session it will return NULL – as you cannot compare something with a null- which is the ANSI standard. If the setting is OFF it will return ‘A’, which is the older behavior. Also, please note that the CONCAT_NULL_YIELDS_NULL setting on the connection takes precedence over the database setting.

We need both options – one for the ANSI standard, and one for backward compatibility. But it means that you as a developer must be thinking about nulls all the time.

So let’s take a look at 3-valued logic in general.

When we write SQL expressions we are very aware of expected behavior when these expressions evaluate to TRUE or FALSE. However, we tend to forget about the third value – UNKNOWN. This can lead to results that we did not anticipated; they are not incorrect, but we forgot to test for them.

So, here is a little exam just to point out how easy it is to be confused about 3-valued logic (cover the answers before you take the test J)  

Complete each box for two expressions that are ANDed. (i.e., if exp1 evaluates to TRUE AND exp2 evaluates to TRUE, the Boolean AND of both is TRUE)

AND

TRUE

FALSE

UNK

TRUE

 

 

 

FALSE

 

 

 

UNK

 

 

 

Now complete each box for two expressions that are ORed. (i.e., if exp1 evaluates to true OR exp2 evaluates to TRUE, the Boolean OR of both is TRUE)

OR

TRUE

FALSE

UNK

TRUE

 

 

 

FALSE

 

 

 

UNK

 

 

 

 

 

These are the correct answers:

AND

TRUE

FALSE

UNK

TRUE

TRUE   

 FALSE

UNK

FALSE

FALSE

 FALSE

 FALSE

UNK

UNK

 FALSE

UNK

 

OR

TRUE

FALSE

UNK

TRUE

TRUE

TRUE

TRUE

FALSE

TRUE

FALSE

UNK

UNK

TRUE

UNK

UNK

If you didn't get it correct, then you need to consider the others in your team that won't get it correct as well. So best practice is to take an approach for null usage in all your code – and then publicize it and ensure compliance.