How can a NULL terminated string be misinterpreted in a UNICODE_STRING?

03/01/2006

A NULL terminated string can be misinterpreted if the Length field includes the terminating NULL. Let's look at the definition of DECLARE_CONST_UNICODE_STRING again before I go into how it can be misinterpreted.

#define DECLARE_CONST_UNICODE_STRING(_var, _string) \
const WCHAR _var ## _buffer[] = _string; \
const UNICODE_STRING _var = { sizeof(_string) - sizeof(WCHAR), sizeof(_string), (PWCH) _var ## _buffer }

Note the - sizeof(WCHAR) in the initialization of the Length field:

const UNICODE_STRING _var = { sizeof(_string) - sizeof(WCHAR), ...

For clarity's sake, it should really be sizeof(UNICODE_NULL), but they are functionally the same. sizeof(_string) will return the size in bytes of the string including the terminating NULL. By subtracting off the size in bytes of one unicode character, we initialize the Length field to the size in bytes of the NULL terminated string without counting the NULL itself.
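The arithmetic above can be checked outside the kernel with a small sketch. It is not the DDK definition: WCHAR here is a stand-in typedef that assumes 16-bit code units, the u"..." literal stands in for the L"..." literal you would use on Windows, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Windows WCHAR type (assumption: 2 bytes per code unit). */
typedef uint_least16_t WCHAR;

/* Size in bytes of the literal, including the terminating NULL. */
size_t my_value_maximum_length(void)
{
    return sizeof(u"MyValue");
}

/* Size in bytes of the characters only, as the macro computes Length. */
size_t my_value_length(void)
{
    return sizeof(u"MyValue") - sizeof(WCHAR);
}

/* Self-check: 7 characters plus the NULL is 8 code units of 2 bytes each. */
void check_my_value_lengths(void)
{
    assert(my_value_maximum_length() == 16);
    assert(my_value_length() == 14);
}
```

The two values differ by exactly one code unit, which is the NULL that Length must not count.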

So, how can the misinterpretation occur? Since the UNICODE_STRING carries the length of its Buffer, code that compares two UNICODE_STRING Buffers has no need to stop at a terminating NULL. This means that by including the NULL in the Length, you change how much of the Buffer is evaluated (*). For instance, the registry code stores the name of a Value as a UNICODE_STRING (e.g. "MyValue"). If you passed in a UNICODE_STRING for the value name which was initialized like this:

UNICODE_STRING name = { sizeof(L"MyValue"), sizeof(L"MyValue"), L"MyValue" };

to retrieve the value contained by "MyValue", the query would fail because the registry API would be comparing "MyValue" (the value name in the registry) against "MyValue\0" (the name you passed in). I know from personal experience that you have to be exact with this, especially when creating a UNICODE_STRING by hand that is not based on a constant string. I have been bitten by this issue and have spent a lot of time figuring out what I did wrong.
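Here is a minimal user-mode sketch of that failure. Everything in it is a stand-in chosen for illustration: COUNTED_STRING mimics the layout of UNICODE_STRING, WCHAR assumes 16-bit code units, and counted_string_equal is a hypothetical helper, not the comparison the registry actually uses. The only property it models is that the compare is driven by Length and never looks for a NULL.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the Windows WCHAR type (assumption: 2 bytes per code unit). */
typedef uint_least16_t WCHAR;

/* Simplified stand-in for UNICODE_STRING; both lengths are in bytes. */
typedef struct {
    unsigned short Length;
    unsigned short MaximumLength;
    const WCHAR *Buffer;
} COUNTED_STRING;

/* Hypothetical compare: uses Length only, never stops at a NULL. */
int counted_string_equal(const COUNTED_STRING *a, const COUNTED_STRING *b)
{
    return a->Length == b->Length &&
           memcmp(a->Buffer, b->Buffer, a->Length) == 0;
}

const WCHAR name_buffer[] = u"MyValue";

/* The name as stored: Length excludes the terminating NULL. */
const COUNTED_STRING stored_name = {
    sizeof(name_buffer) - sizeof(WCHAR), sizeof(name_buffer), name_buffer
};

/* The mistaken query: Length counts the NULL, so it means "MyValue\0". */
const COUNTED_STRING wrong_name = {
    sizeof(name_buffer), sizeof(name_buffer), name_buffer
};

/* The corrected query: same bytes, Length without the NULL. */
const COUNTED_STRING right_name = {
    sizeof(name_buffer) - sizeof(WCHAR), sizeof(name_buffer), name_buffer
};

/* Self-check: only the string whose Length excludes the NULL matches. */
void check_name_lookup(void)
{
    assert(!counted_string_equal(&stored_name, &wrong_name));
    assert(counted_string_equal(&stored_name, &right_name));
}
```

All three strings point at the same characters. The only difference is the Length field, and that alone decides whether the names match.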

(*) A side effect of not stopping the evaluation of the Buffer when a NULL is encountered is that you can create a multi-sz value without having to compute the length of each substring to find the total length of the multi-sz.
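One way to read that footnote, sketched under the same assumptions as before (a stand-in 16-bit WCHAR, u"..." in place of L"...", hypothetical names): when the whole multi-sz is a single literal with embedded NULLs, sizeof already gives the total size in bytes, and the walk over the substrings becomes unnecessary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Windows WCHAR type (assumption: 2 bytes per code unit). */
typedef uint_least16_t WCHAR;

/* Two NULL terminated substrings; the compiler appends the final NULL
   that terminates the list as a whole. */
const WCHAR multi_sz[] = u"first\0second\0";

/* Total size in bytes taken directly from the literal. */
size_t multi_sz_size_by_sizeof(void)
{
    return sizeof(multi_sz);
}

/* The same total found the long way, by walking every substring. */
size_t multi_sz_size_by_walking(const WCHAR *list)
{
    const WCHAR *p = list;
    while (*p != 0) {          /* an empty substring ends the list */
        while (*p != 0) {
            p++;               /* skip the characters of one substring */
        }
        p++;                   /* skip that substring's terminator */
    }
    p++;                       /* count the final list terminator */
    return (size_t)(p - list) * sizeof(WCHAR);
}

/* Self-check: both approaches agree on the total size. */
void check_multi_sz_size(void)
{
    assert(multi_sz_size_by_sizeof() == multi_sz_size_by_walking(multi_sz));
}
```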
