Psychic debugging: Why does FormatMessage say the resource couldn’t be found?


Solving this next problem should be a snap with your nascent psychic powers:

I'm trying to use FormatMessage to load a resource string with one insertion in it, and this doesn't work for some reason. The string is "Blah blah blah %1. Blah blah blah." The call to FormatMessage fails, and GetLastError() returns ERROR_RESOURCE_TYPE_NOT_FOUND. What am I doing wrong?

LPTSTR pszInsertion = TEXT("Sample");
LPTSTR pszResult;
FormatMessage(
        FORMAT_MESSAGE_ALLOCATE_BUFFER |
        FORMAT_MESSAGE_FROM_HMODULE |
        FORMAT_MESSAGE_ARGUMENT_ARRAY,
        //I also tried an instance handle and NULL.
        GetModuleHandle(NULL),
        IDS_MY_CUSTOM_MESSAGE,
        MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), // default language
        (LPTSTR) &pszResult,
        0,
        (va_list*) &pszInsertion); 

Hint: Take a closer look at the parameter IDS_MY_CUSTOM_MESSAGE.

Hint 2: What does "IDS_" tell you?

Resource identifiers that begin with "IDS_" are typically string resource identifiers, not message resource identifiers. There is no strong consensus on the naming convention for message resource identifiers, although I've seen "MSG_". Part of the reason why there is no strong consensus on the naming convention for message resource identifiers is that almost nobody uses message resources! I don't understand why they were added to Win32, since there was already a way of embedding strings in resources, namely, string resources.

That's why you're getting ERROR_RESOURCE_TYPE_NOT_FOUND. There is no message resource in your module. If you're not going to use a message resource, you'll have to use the FORMAT_MESSAGE_FROM_STRING flag and pass the format string explicitly.

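// With FORMAT_MESSAGE_ARGUMENT_ARRAY, insertions are an array of DWORD_PTRs.
DWORD_PTR rgdwInsertions[1] = { (DWORD_PTR)TEXT("Sample") };
TCHAR szFormat[256];
// Fetch the format string from the string table (returns 0 on failure).
LoadString(hInstance, IDS_MY_CUSTOM_MESSAGE, szFormat, 256);
LPTSTR pszResult;
FormatMessage(
        FORMAT_MESSAGE_ALLOCATE_BUFFER |
        FORMAT_MESSAGE_FROM_STRING |
        FORMAT_MESSAGE_ARGUMENT_ARRAY,
        szFormat, // the format string itself, not a module handle
        0,        // message ID is ignored with FROM_STRING
        0,        // language ID is ignored with FROM_STRING
        (LPTSTR) &pszResult,
        0,
        (va_list*) &rgdwInsertions);
// On success, free pszResult with LocalFree when you are done with it.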

I also made a slight change to the final parameter. When you use FORMAT_MESSAGE_ARGUMENT_ARRAY, the last parameter must be an array of DWORD_PTRs. (The parameter must be cast to va_list* to keep the compiler happy.) It so happens that the original code got away with this mistake since sizeof(DWORD_PTR) == sizeof(LPTSTR) and they both have the same alignment requirements. On the other hand, if the insertion were a DWORD, passing (va_list*)&dwValue is definitely wrong and can crash if you're sufficiently unlucky. (Determining the conditions under which your luck runs out is left as an exercise.)
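To make the slot-size point concrete, here is a minimal sketch in portable C rather than real Win32 code. MODEL_DWORD, MODEL_DWORD_PTR, model_format, and format_count are hypothetical names invented for this illustration; model_format merely imitates the way FormatMessage with FORMAT_MESSAGE_ARGUMENT_ARRAY reads one pointer-sized slot per insertion. The safe pattern is to widen each value into its own slot instead of passing the address of a narrower variable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the Win32 types (hypothetical names, for illustration only). */
typedef uint32_t  MODEL_DWORD;     /* like DWORD: always 32 bits */
typedef uintptr_t MODEL_DWORD_PTR; /* like DWORD_PTR: pointer-sized */

/* Hypothetical consumer that reads exactly one pointer-sized slot per
   insertion: slot 0 is a string pointer, slot 1 is a number. */
static int model_format(char *out, size_t cch, const MODEL_DWORD_PTR *slots)
{
    assert(out != NULL && cch > 0);
    return snprintf(out, cch, "%s has %lu items",
                    (const char *)slots[0], (unsigned long)slots[1]);
}

/* Safe pattern: widen every insertion into its own pointer-sized slot,
   so the consumer never reads past a 32-bit variable. */
static int format_count(char *out, size_t cch,
                        const char *name, MODEL_DWORD count)
{
    MODEL_DWORD_PTR slots[2] = {
        (MODEL_DWORD_PTR)name,
        (MODEL_DWORD_PTR)count
    };
    return model_format(out, cch, slots);
}
```

With the real API, the same shape would be DWORD_PTR rgdwInsertions[2] = { (DWORD_PTR)pszName, (DWORD_PTR)dwCount }, where pszName and dwCount are placeholder names.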

Comments (25)
  1. dave says:

    > I don’t understand why they were added to Win32, since there was already a way of embedding strings in resources, namely, string resources.

    The kernel guys use structured 32-bit values for all status codes, and there is a strong need for a (localizable) message for every status code.

    "Win32 resources" are not an exact fit for this requirement, for a number of reasons. A more natural (IMO) approach is to have a single source file that defines the numeric codes and the message strings associated with same.

    I suspect that the best reasons for this, though, are

    1) The kernel guys came from the VMS culture where use of a message source file/message compiler was the system norm.

    2) The kernel, and subsystem, use of same was in existence long before there was a Win32, i.e. back when the OS was "NT OS/2".

    This doesn’t of course give any direct reason why Win32 had to adopt the same convention as the kernel.

  2. Tom says:

    Yeah…What dave said.

  3. Nathan says:

    Why is the last arg explicitly typed versus the old “…” indicator for a variable argument list?

    Is it a code-safety related change (so you don’t pass garbage, or forget a few params that cause garbage to be read off the stack, giving a potential attack vector), a MS preferred implementation manner, a language standard change or __ ?

    [Because if it did that, you would be complaining “Why does it take a … instead of a va_list?” It would be impossible to write wrapper functions. Better inconvenient than impossible. -Raymond]
  4. KJK::Hyperion says:

    Strings in message tables are assumed to be in the "current ANSI codepage", whatever that means, and for me that alone is almost incentive enough to never use them.

    Strings in string tables are UTF-16, but they are, curiously, packed in bundles of 16 per resource, they are not NUL-terminated and have, instead, a prefix USHORT with the character count (basically making them wide-character Pascal strings). Whoever was responsible for that design sure had to be very proud of it.

    Win32 doesn’t have a standard counted string type, so they cannot be used directly from memory and need to be copied instead. Word on the street is LoadString lets you access the original pointer+size in the resource section as an undocumented feature – figure it out by yourself. Personally, I prefer a combination of FindResource/LoadResource/LockResource and manually parsing the string bundle.

    For some other curious reason, LoadString and wvsprintf are implemented in user32.dll rather than kernel32.dll, pulling in user32.dll (and gdi32.dll, and related kernel mode overhead) unnecessarily from many otherwise minimalistic libraries.

    Raymond: I believe DWORD_PTRs are supposed to be equivalent to pointers in everything (size, alignment) but semantics.
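
The bundle layout described in this comment can be sketched in portable C. This is an illustration only, not the LoadString implementation: find_in_bundle and bundle_resource_id are hypothetical helpers, the bundle is assumed to be well-formed (no bounds checking), and on Windows the pointer would come from FindResource/LoadResource/LockResource on an RT_STRING resource.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Strings are stored 16 per RT_STRING resource; the resource that holds
   a given string ID is numbered (id / 16) + 1. */
static unsigned bundle_resource_id(unsigned id)
{
    return (id >> 4) + 1;
}

/* Walks one bundle to the entry for `id`. Each entry is a 16-bit length
   followed by that many UTF-16 code units, with no NUL terminator.
   Returns a pointer to the first code unit and stores the length in *cch.
   A length of 0 means the ID has no string. */
static const uint16_t *find_in_bundle(const uint16_t *bundle,
                                      unsigned id, size_t *cch)
{
    const uint16_t *p = bundle;
    unsigned i;

    assert(bundle != NULL && cch != NULL);
    for (i = 0; i < (id & 15u); i++) {
        p += 1 + *p; /* skip the length prefix and the characters */
    }
    *cch = *p;
    return p + 1;
}
```

Because the result is a pointer plus a count rather than a NUL-terminated string, a caller still has to copy it before handing it to APIs that expect a terminator.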

  5. Mihai says:

    <<Strings in message tables are assumed to be in the "current ANSI codepage">>

    No, they are not.

    Just compile a .mc file (using mc.exe) and take a look inside the resulting .bin file.

  6. BryanK says:

    > Better inconvenient than impossible. -Raymond

    The “standard” solution to that is to provide two functions, one that takes a … and another that takes a va_list.  See, for instance, fprintf and vfprintf.

    (But since the … version merely forwards to the va_list version, with some va_start/va_end wrapping, it’s not like it’d be hard for users to write their own wrapper for FormatMessage.  It might be nice if there was a standard wrapper, but oh well.)

    [No matter what you do, people will complain that you didn’t do more. Sometimes you have to accept meeting halfway. Inconvenient is better than impossible. -Raymond]
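
The two-function pattern discussed in this exchange can be sketched with the C standard library alone. The names format_v and format are hypothetical, and vsnprintf stands in for FormatMessage so that the sketch runs anywhere; the point is only that the "..." version is a thin wrapper over the va_list version.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* The va_list version does the real work, so other wrappers can call it. */
static int format_v(char *out, size_t cch, const char *fmt, va_list args)
{
    assert(out != NULL && fmt != NULL);
    return vsnprintf(out, cch, fmt, args);
}

/* The "..." version only captures its arguments and forwards them. */
static int format(char *out, size_t cch, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = format_v(out, cch, fmt, args);
    va_end(args);
    return n;
}
```

A wrapper around FormatMessage would presumably follow the same shape: leave out FORMAT_MESSAGE_ARGUMENT_ARRAY and pass &args as the final va_list* parameter.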
  7. dave says:

    Re:

    >Strings in message tables are assumed to be in the "current ANSI codepage"

    No, they are not.


    Maybe he meant that, by default, the .mc source file is assumed to be in the current ANSI code page.  Indeed it is.

    But, actually, mc gives you "Unicode or ANSI" control over both input (-a/-u) and output (-A/-U), with -a -U being the default.

    Possibly such control is a relatively new feature; I dunno.

  8. Alexei Lebedev says:

    I think the question is valid.

    Given that the function name is FormatMessage, you kind of expect the word MESSAGE to be in some of the flags that are passed to it. In particular, FORMAT_MESSAGE_FROM_HMODULE sounds like a request to load the "message to be formatted" from a resource. The person who asked the question probably didn’t know about the existence of "message resources". Similarly, FORMAT_MESSAGE_FROM_STRING, when compared to FORMAT_MESSAGE_FROM_HMODULE, sounds like the function takes a pointer instead of a resource ID.

    Better names would be

    FORMAT_MESSAGE_FROM_STRING_RESOURCE and

    FORMAT_MESSAGE_FROM_MESSAGE_RESOURCE

    The bigger problem with FormatMessage is that it takes too many options. In the number of lines the guy spent calling FormatMessage, he could open a text file, scan down to the line containing his message ID, and rip out the string in question. Localizing the file in question would then be a job for the installer, which seems like the right level at which to solve the problem. Such text files can also be worked on by the less technically inclined localization guys (as compared to .rc files, which can break the build).

    And again, in order to make life easier for the programmer, why deal with these IDs at all? String IDs just tend to repeat, in a mangled form, the contents of the message. A solution I’ve seen successfully used involved simply wrapping each string in the source that’s supposed to be localized with a function call:

    LocalizeString("some text")

    A command-line tool then scanned the entire source tree for occurrences of the pattern LocalizeString(c++ string). All these strings were saved to the messages file, to which the localization engineer would add translations:

    en: some text

    ru: рыба

    LocalizeString then looked things up in this file. The engineers got readability out of this, and saved one identifier per message.

    Orphan messages were also eliminated. Isn’t this a big problem? Any given Windows program that uses string resources probably has 20% of "garbage" strings (strings that are no longer used).

  9. J says:

    "I think the question is valid."

    Who said it wasn’t?

  10. Alexei Lebedev says:

    “Who said it wasn’t?”

    The sarcastic mention of psychic powers required to solve the problem is another way of saying “it doesn’t take a wizard to figure this out”. So the person asking the question must be a step below plebeian.

    If you asked me a question and I invited you to engage your “nascent psychic powers”, wouldn’t you feel put down?

    [No, that’s not what “psychic powers” means. “Psychic powers” = “debugging with incomplete information”. “Nascent psychic powers” = “another trick to add to your arsenal”. “Don’t be helpless” = “It doesn’t take a wizard to figure this out.” -Raymond]
  11. Alexei Lebedev says:

    "As indeed it does"

    Right… So basically FormatMessage can’t access string resources, for which LoadString should be used instead.

  12. KJK::Hyperion says:

    Mihai: turns out we are both right. Some tables contain ANSI strings (see: ntoskrnl.exe) but most have UTF-16 strings. It makes sense that the symbolic names for bugcheck codes (which is what ntoskrnl.exe’s message table contains) would be ANSI, since the kernel debugging API doesn’t support Unicode. I guess there’s a flag for it in the resource format.

  13. dave says:

    > Similarly, FORMAT_MESSAGE_FROM_STRING, when compared to FORMAT_MESSAGE_FROM_HMODULE, sounds like the function takes a pointer instead of a resource ID.

    As indeed it does.

    The doc:

    The lpSource parameter is a pointer to a null-terminated message definition. The message definition may contain insert sequences, just as the message text in a message table resource may.

    /* Here, "message definition" means "array of characters".  */

    and:

    dwMessageId

    [in] Message identifier for the requested message. This parameter is ignored if dwFlags includes FORMAT_MESSAGE_FROM_STRING.

  14. SvenGroot says:

    Alexei: in my experience psychic debugging involves someone making a mistake which you are able to guess because it’s a common mistake, and maybe a mistake you’ve made yourself in the past.

    It doesn’t mean that the person asking the question is an idiot for not knowing the answer.

  15. dave says:

    > That sure looks like lpSource should point to a message resource.  Even though a message is a string, MSDN calls for a message definition.  If someone didn’t guess the difference between FORMAT_MESSAGE_FROM_HMODULE and FORMAT_MESSAGE_FROM_STRING, your blog is the only way they can find out.

    Well, there’s always "try it and see".  But that’s perhaps too difficult?

    I don’t understand why you think this is such a big complicated deal. It’s obvious to me that "FROM_STRING" means "from a string". Perhaps I’m way too literal-minded.

  16. Norman Diamond says:

    > If you’re not going to use a message resource, you’ll have to use the FORMAT_MESSAGE_FROM_STRING flag and pass the format string explicitly.

    I believe you, but look at this:

    *  FORMAT_MESSAGE_FROM_STRING

    *  The lpSource parameter is a pointer to a null-terminated message definition.

    That sure looks like lpSource should point to a message resource.  Even though a message is a string, MSDN calls for a message definition.  If someone didn’t guess the difference between FORMAT_MESSAGE_FROM_HMODULE and FORMAT_MESSAGE_FROM_STRING, your blog is the only way they can find out.

    Tuesday, May 29, 2007 3:36 PM by Alexei Lebedev

    > In the number of lines the guy spent calling FormatMessage, he could open a text file, scan down to the line containing his message ID, and rip out the string in question.

    Thereby making execution very inefficient on every target machine every time it gets executed, instead of once while trying to figure out how to code the function call.  If all the lines of parameters were understandable then it would be better to accept the number of parameters.

    > String IDs just tend to repeat, in a mangled form, the contents of the message.

    Not always.  Thank you for helping provide a counterexample.

    Contents of the message:  рыба

    ID:  Not  ID_рыба

    > LocalizeString then looked things up in this file.

    And that’s why, for example, a vendor’s web page shows a list of products, in which the left hand column (except for the top row) is an integer starting at 1 and counting up, and the left hand column’s header (top row) is a word which means the opposite of "Yes".  The vendor started with a word that means the opposite of "Yes" in one or two languages, and localization took that meaning, instead of finding localizations of a different meaning of that word.

  17. KJK::Hyperion says:

    Norman Diamond: it’s called a "message definition" and not a "message" because you pass a formatting string with argument placeholders. The input string is a "message definition", the "message" is the final output

  18. JustMe says:

    At KJK::Hyperion:

    You wrote:

    "Win32 doesn’t have a standard counted string type,"

    Yes, that is right. But the kernel does have it: As ANSI_STRING as well as UNICODE_STRING. In fact, these are most often used for strings there.

  19. ac says:

    BSTR’s are a part of win32 and they are counted

  20. KJK::Hyperion says:

    ac: see what JustMe said. BSTRs have their own Very Special allocator (requiring, in this case, to copy the string anyway), you can’t use them to refer to a string in an arbitrary range of memory

  21. Jonathan says:

    > almost nobody uses message resources

    Except for those who write to the event log.

    [That’s what the word almost means. Sigh. Do you want a medal? -Raymond]
  22. Norman Diamond says:

    Thursday, May 31, 2007 8:17 AM by Jonathan

    > almost nobody uses message resources

    Except for those who write to the event log.

    Either that, or including those who write to the event log ^_^

    For a few years I was confused by a ton of event log messages talking about not having resources for remote computers.  Then one day I wanted to add some debugging traces to a program, but didn’t want to spend a few days figuring out how to obey MSDN’s rules just to record debugging traces, so I recorded strings the same way I used to do with printk.  Oh, so that’s where all those log messages came from, talking about not having resources for remote computers.  When a very small software company made <deleted> around 7 years ago, they were as lazy as I was.
