LoadString can load strings with embedded nulls, but your wrapper function might not


Whenever somebody reports that the SHFileOperation function or the lpstrFilter member of the OPENFILENAME structure is not working, my psychic powers tell me that they failed to manage the double-null-terminated strings.

String resources are stored as counted strings, so the null character is not the string terminator and a string resource can contain embedded null characters. The LoadString function knows about this, but other functions might not.
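To make "counted string" concrete, here is a minimal, portable sketch. It assumes a simplified layout of one 16-bit length followed by that many UTF-16 code units, and ReadCountedString is a hypothetical helper, not a Windows API. Because the length is stored up front, a null inside the text is just another character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reads one counted string from a block laid out as
// [length][length code units]. No terminator is consulted, so embedded
// nulls survive.
std::u16string ReadCountedString(const char16_t* data)
{
    std::size_t length = data[0];              // the count comes first
    return std::u16string(data + 1, length);   // copy exactly that many units
}
```

For example, a block holding the count 4 followed by 'a', null, 'b', null yields a four-character string, not a one-character one.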

Here's one example:

TCHAR szFilters[80];
strcpy_s(szFilters, 80, "Text files\0*.txt\0All files\0*.*\0");
// ... or ...
strlcpy(szFilters, "Text files\0*.txt\0All files\0*.*\0", 80);

The problem is that you're using a function which operates on null-terminated strings but you're giving it a double-null-terminated string. Of course, it will stop copying at the first null terminator, and the result is that szFilters is not a valid double-null-terminated string.
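One way to avoid the truncation is to copy by byte count rather than by terminator. The sketch below uses plain char for portability (the original uses TCHAR), and kFilters and CopyFilters are names made up for illustration. It relies on the fact that sizeof on a string literal array counts every byte, embedded nulls and the compiler-supplied final null included.

```cpp
#include <cstddef>
#include <cstring>

// The literal ends with an explicit \0 and the compiler appends one more,
// so the array is already a valid double-null-terminated string.
static const char kFilters[] = "Text files\0*.txt\0All files\0*.*\0";

// Hypothetical helper: copies the whole block, embedded nulls included.
// Returns false if the destination is too small.
bool CopyFilters(char* dest, std::size_t destSize)
{
    if (destSize < sizeof(kFilters)) {
        return false;
    }
    std::memcpy(dest, kFilters, sizeof(kFilters)); // count-based, not strlen-based
    return true;
}
```

If the filter never changes, pointing at the static array directly avoids the copy altogether.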

Here's another example:

sprintf_s(szFilters, 80, "%s\0*.txt\0%s\0*.*\0", "Text files", "All files");

Same thing here. Functions from the sprintf family take a null-terminated string as the format string. If you "embed" a null character into the format string, the sprintf function will treat it as the end of the format string and stop processing.
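When the pieces are only known at run time, one option is to append each piece and its terminator separately, so that no null ever has to pass through a format string. This is a sketch using std::string, and MakeDoubleNullString is a hypothetical helper.

```cpp
#include <initializer_list>
#include <string>

// Hypothetical helper: appends each piece followed by '\0', then one more
// '\0' to terminate the list.
std::string MakeDoubleNullString(std::initializer_list<const char*> pieces)
{
    std::string result;
    for (const char* piece : pieces) {
        result += piece;        // copies up to the piece's own terminator
        result.push_back('\0'); // terminate this piece explicitly
    }
    result.push_back('\0');     // the extra null that ends the list
    return result;
}
```

Calling MakeDoubleNullString({"Text files", "*.txt", "All files", "*.*"}) produces a buffer whose data() can be handed to code that expects a double-null-terminated string. Note that an empty piece would end the list early, since an empty string is how the list is terminated.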

Here's a more subtle example:

CString strFilter;
strFilter.LoadString(g_hinst, IDS_FILE_FILTER);

There is no obvious double-null-termination bug here, but there is if you look deeper.

BOOL CString::LoadString(UINT nID)
{
    // try fixed buffer first (to avoid wasting space in the heap)
    TCHAR szTemp[256];
    int nCount = sizeof(szTemp) / sizeof(szTemp[0]);
    int nLen = _LoadString(nID, szTemp, nCount);
    if (nCount - nLen > CHAR_FUDGE)
    {
        *this = szTemp;
        return nLen > 0;
    }

    // try buffer size of 512, then larger size until entire string is retrieved
    int nSize = 256;
    do
    {
        nSize += 256;
        nLen = _LoadString(nID, GetBuffer(nSize - 1), nSize);
    } while (nSize - nLen <= CHAR_FUDGE);
    ReleaseBuffer();

    return nLen > 0;
}

Observe that this function loads the string into a temporary buffer, and then if it succeeds, stores the result via the operator= operator, which assumes a null-terminated string. If your string resource contains embedded nulls, the operator= operator will stop at the first null.

The mistake here was taking a class designed for null-terminated strings and using it for something that isn't a null-terminated string. After all, it's called a CString and not a CDoubleNullTerminatedString.
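The difference between assigning from a bare pointer and assigning with an explicit length can be shown portably. The sketch below uses std::string as a stand-in for CString, so it illustrates the general pitfall rather than the behavior of any particular MFC version, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Assignment from a bare pointer has to find the end by scanning for the
// first null, so everything after an embedded null is lost.
std::string AssignByPointer(const char* buffer)
{
    std::string s;
    s = buffer;
    return s;
}

// Passing the length that the loader reported keeps the embedded nulls.
std::string AssignWithLength(const char* buffer, std::size_t length)
{
    return std::string(buffer, length);
}
```

Given a 17-character buffer holding "Text files", a null, "*.txt", and a null, the first function returns a 10-character string and the second returns all 17 characters.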

Comments (14)
  1. Raymundo Chennai says:

    As an occasional critic of your approach to blogging, I would like to sincerely compliment you on this post and the one that preceded it in the series.  This interesting material is presented clearly without any express or implied denunciation of the idiots whose errors led you to make the posts.  

  2. Joshua Drake says:

    Personally I enjoy it when you denounce idiots, even when I am the idiot in question, or at least one of their kin.

  3. string? says:

    What does "String" in LoadString refer to? An arbitrary byte buffer?

  4. Joe says:

    it refers to a NULL terminated string

  5. vobject says:

    This means that the ::LoadString() can handle embedded nulls, but MFC CString::LoadString() cannot?

    Thanks for telling :)

  6. Lawrence says:

    I just can’t agree with your last sentence. By that logic, why is it called LoadString and not LoadDoubleNullTerminatedString?

  7. someone else says:

    Well, it’s not called LoadCString either.

  8. Ben says:

    Well, maybe it should have been called LoadStrings.

  9. Gabest says:

    CString was handling 0s fine until vs2008 or sp1. It was a breaking change that made me fix up my code here and there.

  10. Dave says:

    I have had pretty good luck with ATL CComBSTR using .AppendBSTR() and .AppendBytes() to build double-null-terminated strings. Both of those methods seem to be null-friendly.

  11. acq says:

    In MSVC 6 the CString class certainly properly handles the strings with zeroes inside. If newer versions don’t then it’s a new bug.

    In MSVC 6 this:

    <code>

    char b[] = "abc";

    CString s( b, sizeof( b ) );

    printf( "%d\n", s.GetLength() );

    CString t;

    t = s;

    printf( "%d\n", t.GetLength() );

    </code>

    writes 7 and 7 (as it should) and t = s; line goes to the const CString& CString::operator=(const CString& stringSrc) which either just copies the pointer and increments the reference count or does memcpy without doing any strlen equivalent since it has proper length in CStringData member nDataLength.

  12. joe says:

    @acq

    that’s not the point. the point is some parts of the system rely on strings of the form "abc";

    the question is all around the support for the last "" because if it is not there, those parts of the system will think the string isn’t terminated.

  13. Josh says:

    @Dave: BSTRs are designed to be arbitrary binary blobs, thus their tolerance for nuls.  They hide the length just before the string starts.  Of course, this only works as long as you pass them to BSTR functions; functions that want LPWSTRs or misuse the BSTR identifier (or the COM wrapper) will not work properly if they start looking for nuls.  And many functions start assuming they can truncate the string with a nul, or rewrite the string as they would a C style string, which means the next BSTR-aware function that gets such a mangled string will copy all the garbage left behind.

    In summary: Only write APIs for BSTRs if you know how they work.  C may think of them as equivalent to LPWSTRs, but using them as such is a recipe for disaster.

  14. acq says:

    to joe: I was referring to Raymond’s: "Observe that this function loads the string into a temporary buffer, and then if it succeeds, stores the result via the operator= operator, which assumes a null-terminated string. If your string resource contains embedded nulls, the operator= operator will stop at the first null."

    In MSVC 6 CString class both in MFC and in ATL doesn’t assume null-terminated string in operator=. The original authors of CString tried to implement the class to work properly with strings which are not zero terminated, making possible processing of the strings with as many zeroes inside as anybody needs. But I see now, the problem is in lines:

     TCHAR szTemp[256];

    (…)

     *this = szTemp;

    instead the assignment line should have been

    CString r( szTemp, nLen );

    *this = r;

