The ‘NULL’ Debate, and a few other interesting facts

This is for all my developer friends out there. I recently had a very interesting discussion with a friend of mine on the enigma called NULL and how it’s different from, say, an empty string. This is something that’s been under debate for as long as i can remember, and not just in the realm of RDBMS.

So what is NULL? A NULL is an undefined value, and is not equivalent to a space or an empty string. Let me illustrate with an example:

create table t1 (id int, name varchar(20))      --create a table with two fields

go

insert into t1(id) values(1)                           -- insert a row containing the value for the first field only

go

select * from t1            

id    name
1    NULL

Here, because we did not insert anything for the second field, the field was populated with a default value of NULL. Let’s see what happens if we insert a blank string for the second field:

insert into t1 values(2,'')   --just two single quotes, with nothing between  them
go

select * from t1

id    name
1    NULL
2   

In this case, because we specified an empty string, the value does not amount to NULL.

Similarly, if you insert a string containing only spaces in a cell, and then apply the trim functions (ltrim and rtrim) on it, the resultant value will not amount to NULL:

Insert into t1 values(3,'    ')
go

select id, ltrim(rtrim(name)) from t1

id    (No column name)
1    NULL
2   
3   

The Len function

Another interesting thing I discovered was w.r.t the Len function, used to find the length of a character expression. For example, the statement select Len ('Harsh') returns an output of 5. Also, Select Len(‘’) returns 0. Both of these outputs are as expected. However, what if run Select Len (‘ ‘) (this has about 5 whitespaces) ? The expected output is 5 right? Wrong. The output is 0.

Another twist is if you add a character to the end of the string, after the whitespaces, i.e., Select Len (‘ a’) will return an output of 5. Try the following cases as well, just for fun:

Select Len(‘ a ‘) --the character a enclosed by 2 whitespaces on each side

Select Len(‘h ‘) -- the character h followed by 4 whitespaces

For the first one, the output is 3, and not 5 as I expected. This is because the Len function, by design, ignores trailing spaces. In other words, you could say that it does an implicit rtrim on the string. This is also the reason why the second statement will return a length of 1, not 5 as expected.

In case your application is such that the presence of whitespaces in the data matters and you need them to be counted in the string length (this can be especially true if you’re writing code to move the data as-is to a table/database/application), then a suitable alternative would be the Datalength function. The Datalength function counts whitespaces, both preceding and trailing, when calculating the length. As a simple example, select datalength(' a ') (a enclosed by 2 whitespaces on each side) will return 5 as against 3 returned by Len.

Hope this helps a few of my developer friends out there.Any comments/suggestions/feedback are welcome.