Error handling, part 5: an error infrastructure for Windows


<<Part 4  Part 6>>

Overview

The first error infrastructure I built was for my Triceps project; its description can be found in the Triceps manual. It has proven so convenient and useful that I wanted something similar on Windows, and that Windows implementation is what I want to describe here. Some of the Triceps features got dropped to save time and because of a somewhat different focus: Triceps is really a programming language, so it needs good support for error nesting in its compilation error reports, while other uses can live without it (though I'd still like to add it to the Windows implementation at some point). Some features for localization and IDs got added. Like I've said before, it's not everything I could think of but it's a good approximation, and it might get extended in the future.

I'd like to show other examples of Windows code as well, such as an easy way to write a Windows service. But that code uses the error reporting library, which makes me want even more to present the error handling library first.

Even though a lot of my posts here are about PowerShell, I really write most of the code in C/C++. The library I'm about to show is in C++. I've been experimenting with the better error reporting in PowerShell too but that's a separate subject to be discussed later.

The source code can be found in the attached file. Since it turned out that only one file can be attached to a post, I've combined both files, ErrorHelpers.hpp and ErrorHelpers.cpp, into ErrorHelpers.txt. You'd need to split them out manually if you download it.

Basic errors

Let's start with examples of how the library gets used. The most widely used class is Erref ("error reference"). It's really a C++11 counted reference with a few helper methods added:

class Erref: public std::shared_ptr<ErrorMsg>

One reason to add the methods to it is that the reference may contain a NULL (such as if there are no errors), and some methods are much more convenient when they can work on the NULL references too. Another reason is the methods that build the chains of errors. These methods may need to change the original reference, so they must apply to the reference and not to the error object stored within it. I'll tell more about the methods in a moment.
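To illustrate why NULL-tolerant methods belong on the reference rather than on the object, here is a minimal sketch. SketchMsg and SketchRef are hypothetical, simplified stand-ins and not the library's actual classes; only the method names are borrowed from the Erref helpers described later in this post.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for ErrorMsg.
struct SketchMsg {
    unsigned long code;
    std::wstring text;
    std::shared_ptr<SketchMsg> chain; // next message, or empty
};

// Hypothetical stand-in for Erref: a shared_ptr plus NULL-safe helpers.
class SketchRef : public std::shared_ptr<SketchMsg> {
public:
    SketchRef() = default;
    SketchRef(std::shared_ptr<SketchMsg> p)
        : std::shared_ptr<SketchMsg>(std::move(p)) {}

    // Safe on an empty reference: reports 0 instead of dereferencing NULL.
    unsigned long getCode() const { return *this ? (*this)->code : 0; }

    // True only if there is a message and its code is not 0.
    bool hasError() const { return getCode() != 0; }

    // Safe on an empty reference: returns another empty reference.
    SketchRef getChain() const
    {
        return *this ? SketchRef((*this)->chain) : SketchRef();
    }
};
```

Because the helpers live on the reference, the caller never has to test for NULL before asking for the code or the next message.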

The typical usage goes like this:

void SomeMethod(arg1, arg2, Erref &err);
...
Erref err;
SomeMethod(x, y, err);
if (err) {
  // handle or report the error ...
}

If the method experiences an error, it leaves a reference to it in the Erref.

The actual error is represented with the class ErrorMsg. It contains:

    const Source *source_; // The source of this error. NULL means "Windows NT errors."
    DWORD code_; // The error code, convenient for the machine checking.
    std::wstring msg_; // The error message in a human-readable format.
    std::shared_ptr<ErrorMsg> chain_; // The chained error, or NULL.

The contents represent the ideas I've described in the previous installments. msg_ describes the error in a human-readable way. code_ is the machine-readable error code. The code has a meaning within the code module that reported the error. Different modules may have overlapping code spaces. And source_ represents the code space (or if you prefer, "namespace") of the errors in a module. A module obviously can't cross the DLL boundaries, but nothing stops you from having multiple modules (i.e. error sources) in one DLL or one program. I'll tell more about the sources in a moment. Finally, the error messages can be chained, with the head of the chain providing the high-level description and more detail provided the deeper you go into the chain.

The ErrorMsg objects can be constructed directly but that's mostly used by the internals of the implementation. Normally the Source acts as a factory for the messages, and the copying of the message chains is normally done with the Erref method:

Erref err1, err2;
err2 = err1.copy();

Here we come back to the class ErrorMsg::Source. Each module that will report errors has a static ErrorMsg::Source object. Each source has a name, and optionally a GUID. You define it as:

ErrorMsg::Source MyErrorSource(L"MyModuleName", &MyModuleErrorGuid);

If you don't care about GUIDs, you can use NULL for the GUID pointer. Personally, I haven't found much use for the GUIDs, and the library doesn't use them in any way at the moment. They've been put there more as a placeholder that might become useful in the future, and so far they haven't. The basic Source lets you create the non-localized error messages, which is convenient for things like simple debugging messages or as a last resort if you can't get the localization. The basic method that creates an error message uses the printf()-like formatting:

Erref err = MyErrorSource.mkString(errorCode, L"some error message with numbers %d 0x%x", 1, 2);

It's also often necessary to create an error that details a system error code. And there is a special method for that:

Erref err = MyErrorSource.mkSystem(GetLastError(), INPUT_FILE_OPEN_FAILED, L"failed to open the input file \"%ls\"", fileName);

It takes two error codes: the system error returned by the Windows functions and the application-level code for the application-level explanation (that code is up to you to define). And as usual there is the explanatory string, created with the printf()-like formatting. Since there is space for only one error code in an ErrorMsg, you might wonder, where does the second error code go? The answer is that this method creates not a single ErrorMsg but a chain of two of them. The first one contains the application-level explanation and code as usual. The second, chained, one will contain the system error code along with the error string for that code extracted from Windows. That system error string will be localized, as it's normally done by Windows.

The ErrorMsg objects for the Windows system errors are a bit of a special case: they contain NULL in the source_ pointer. There is also a special case for the errno errors returned from the C stdlib, which I'll describe later.

Localized errors

Proper programs need to use localized error messages, and the subclass ErrorMsg::MuiSource helps with that. MUI is the Windows term for the subsystem of the localized messages. You start by defining the messages in an .mc file that looks like this:

;//
;//  Status values are 32 bit values layed out as follows:
;//
;//   3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
;//   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
;//  +---+-+-------------------------+-------------------------------+
;//  |Sev|C|       Facility          |               Code            |
;//  +---+-+-------------------------+-------------------------------+
;//
;//  where
;//
;//      Sev - is the severity code
;//
;//          00 - Success
;//          01 - Informational
;//          10 - Warning
;//          11 - Error
;//
;//      C - is the Customer code flag
;//
;//      Facility - is the facility code
;//
;//      Code - is the facility's status code
;//

MessageIdTypedef=ULONG

SeverityNames=(
    Success=0x0:STATUS_SEVERITY_SUCCESS
    Informational=0x1:STATUS_SEVERITY_INFORMATIONAL
    Warning=0x2:STATUS_SEVERITY_WARNING
    Error=0x3:STATUS_SEVERITY_ERROR
)

FacilityNames=(
    MyFacilityName1=0x0001:MY_FACILITY_Name1
    MyFacilityName2=0x0002:MY_FACILITY_Name2
)

MessageId=0x0001
Facility=MyFacilityName1
Severity=Error
SymbolicName=MUI_INPUT_FILE_OPEN_FAILED
Language=English
Failed to open the input file  "%1!ls!".
.

There you define every message for every language you plan to support.
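The bit layout from the comment at the top of the .mc file can be sketched in code as follows. packStatus(), unpackStatus() and StatusParts are hypothetical names used only for illustration; the field widths are taken from the diagram in the comment (2 bits of severity, 1 customer bit, facility in bits 28..16, code in bits 15..0).

```cpp
#include <cstdint>

// Field positions follow the layout comment in the .mc file.
struct StatusParts {
    unsigned severity;  // bits 31..30
    bool customer;      // bit 29
    unsigned facility;  // bits 28..16
    unsigned code;      // bits 15..0
};

// Combine the fields into one 32-bit status value.
constexpr std::uint32_t packStatus(unsigned severity, bool customer,
                                   unsigned facility, unsigned code)
{
    return (static_cast<std::uint32_t>(severity & 0x3u) << 30)
         | (static_cast<std::uint32_t>(customer ? 1u : 0u) << 29)
         | (static_cast<std::uint32_t>(facility & 0x1FFFu) << 16)
         | static_cast<std::uint32_t>(code & 0xFFFFu);
}

// Split a 32-bit status value back into its fields.
inline StatusParts unpackStatus(std::uint32_t status)
{
    return StatusParts{
        (status >> 30) & 0x3u,
        ((status >> 29) & 0x1u) != 0,
        (status >> 16) & 0x1FFFu,
        status & 0xFFFFu
    };
}
```

Under this layout, a message with Severity=Error (0x3), facility 0x0001 and MessageId 0x0001, with the customer bit clear, packs to 0xC0010001.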

Or alternatively you can define a .man file and define the error strings there along with the things like the ETW events that your program may send. Personally, I find the .mc files a lot more readable. Either way, then you compile the messages into a binary section:

mc -h DirectoryForGeneratedHeaders -r DirectoryForBinaries MyMessages.mc

It would compile an .mc file, or a .man file, or (a little-known fact) both together but no more than one of each:

mc -h DirectoryForGeneratedHeaders -r DirectoryForBinaries MyMessages.mc MyManifest.man

This will produce DirectoryForGeneratedHeaders\MyMessages.h (and/or DirectoryForGeneratedHeaders\MyManifest.h) with the macro definitions for the message codes, and in DirectoryForBinaries it will place MyMessages.rc and MSG*.bin. The .bin files will contain the actual messages for various languages, the .rc (file for the resource compiler) will contain the reference to the .bin files. And then the compiled resource file (and the binary messages referred by it) are fed into the linker to create the executable myprogram.exe and the localized files for it myprogram.exe.mui. I'm a bit fuzzy on how exactly it happens, the build system takes care of it for me, so I'll leave the details of that step up to the inquisitive reader.

Coming back from the detour into the creation of the message files, the MuiSource for reading the message files gets defined like this:

ErrorMsg::MuiSource MyMuiErrorSource(L"MyModuleName", &MyModuleErrorGuid);

Again, feel free to use NULL instead of the pointer to the GUID:

ErrorMsg::MuiSource MyMuiErrorSource(L"MyModuleName", NULL);

Then the ErrorMsg object can be created with:

Erref err = MyMuiErrorSource.mkMui(MUI_INPUT_FILE_OPEN_FAILED, fileName);

It's the same as working with plain strings, only the MUI message ID is used instead of the direct format string. The arguments follow it.

And the localized wrapper messages for the system errors are created like this:

Erref err = MyMuiErrorSource.mkMuiSystem(GetLastError(), MUI_INPUT_FILE_OPEN_FAILED, fileName);

The MUI error sources can't be used with mkString() to avoid the accidental misuse. You must always use the mkMui() methods with them.

Error chaining

How do you chain the errors together? Copying an example from above, if we have some function that receives an error report from some other function it calls, how would it add the extra information and return the enhanced error? The basic logic will go like this, only we need to fill in the bit for combining the errors:

void SomeOtherMethod(int x, int y, Erref &err)
{
  Erref nesterr;
  SomeMethod(x, y, nesterr);
  if (nesterr) {
    // enhance and report the error
    err = MyMuiErrorSource.mkMui(SOME_METHOD_FAILED, x, y);
    // here we need to chain nesterr to err
    return; // the error indication is in err
  }
  ...
}

The two Erref methods typically used to chain together two errors are wrap() and append(). Append() appends the second error to the first one:

err.append(nesterr);

Wrap() does the opposite, prepends the second error to the existing chain:

nesterr.wrap(MyMuiErrorSource.mkMui(SOME_METHOD_FAILED, x, y));
err = nesterr;

Note that this way you don't really need the extra variable nesterr: you can call SomeMethod(x, y, err) directly, and then add the wrapping directly to err.

Technically, the working of wrap() is a bit more complicated than append(): if the argument error is a chain itself, only its first error will be prepended, and the rest will be appended to the end of the chain. This is done because normally the argument of wrap() is expected to contain only one message, and if there are more messages, they would be explanations of some internal error, such as the library being unable to open the MUI file. Because of that these explanations of the internal errors get placed at the end of the chain. There is also the method splice() that splices in another object into the current one, but wrap() is more convenient to use. With splice() the same meaning can be achieved with:

err = MyMuiErrorSource.mkMui(SOME_METHOD_FAILED, x, y);
err.splice(nesterr);

All the methods append(), wrap(), and splice() can handle NULLs in both the argument and in the reference itself. A NULL argument will leave the current reference unchanged. A NULL in the current reference will have it changed to point to the same unchanged error chain as the argument.
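The chaining rules described above can be sketched on a simplified singly-linked chain. This is only an illustration of the described behavior, not the library's code: ChainMsg, ChainRef, appendChain(), wrapChain() and joinChain() are hypothetical names, and the messages carry only a text.

```cpp
#include <memory>
#include <string>

// Hypothetical, simplified message: just a text and the next link.
struct ChainMsg {
    std::wstring text;
    std::shared_ptr<ChainMsg> next;
};
using ChainRef = std::shared_ptr<ChainMsg>;

// Attach `other` after the last message of `head`.
// A NULL `other` changes nothing; a NULL `head` becomes `other`.
void appendChain(ChainRef &head, const ChainRef &other)
{
    if (!other) return;
    if (!head) { head = other; return; }
    ChainMsg *last = head.get();
    while (last->next) last = last->next.get();
    last->next = other;
}

// Put the first message of `wrapper` in front of `head`; any further
// messages of `wrapper` move to the very end of the resulting chain.
void wrapChain(ChainRef &head, const ChainRef &wrapper)
{
    if (!wrapper) return;
    ChainRef extra = wrapper->next; // e.g. internal error explanations
    wrapper->next = head;
    head = wrapper;
    appendChain(head, extra);
}

// Flatten the chain into "a/b/c" for inspection.
std::wstring joinChain(const ChainRef &head)
{
    std::wstring out;
    for (const ChainMsg *m = head.get(); m; m = m->next.get()) {
        if (!out.empty()) out += L'/';
        out += m->text;
    }
    return out;
}
```

Wrapping a chain "low" with a single message "high" gives "high/low"; wrapping that with a two-message chain "top/internal" gives "top/high/low/internal", which matches the rule that the extra explanations go to the end.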

Printing the errors

Now that you've built an error chain, what do you do with it? You can convert it to a string, and then do whatever you please with it:

wstring s = err->toString();

It works even if err is a NULL reference, then it will return an empty string. The string conversion adds a bit of indenting to the messages under the top one, making them easier to read. Each error will start with its source name and the error code in decimal and hex, followed by the text of the message. The errors are separated by \n. The source name for the system errors is printed as "NT".

Since the error chains can be pretty long, if you have to deal with writing to some limited-size buffers (such as constructing the ETW messages from the errors), you may need to break up a single error into multiple buffers. The method toLimitedString() helps with that:

Erref cont;
wstring s = err->toLimitedString(MY_LIMIT_IN_CHARS, cont);

It will take as many messages from the chain as fit into the limit and convert them to a string. If there are messages left, the reference cont will point somewhere in the middle of the chain, and it can be used in the next call of toLimitedString(), giving the continuation buffers. After the whole chain is converted, cont will be set to NULL.
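The continuation logic can be sketched as follows. This is a simplified illustration and not the library's implementation: PartMsg, PartRef and toLimited() are hypothetical names, each message is printed as just its text plus \n, and the choice to truncate a single over-long message (so that the caller always makes progress) is an assumption of this sketch.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical, simplified message: just a text and the next link.
struct PartMsg {
    std::wstring text;
    std::shared_ptr<PartMsg> next;
};
using PartRef = std::shared_ptr<PartMsg>;

// Convert messages starting at `head` until adding another one would
// exceed `limit` characters. `cont` receives the first unconverted
// message, or an empty reference when the chain is exhausted. At least
// one message is always consumed (truncated if needed).
std::wstring toLimited(const PartRef &head, std::size_t limit, PartRef &cont)
{
    std::wstring out;
    PartRef cur = head;
    while (cur) {
        std::wstring line = cur->text + L"\n";
        if (!out.empty() && out.size() + line.size() > limit) break;
        out += line;
        cur = cur->next;
    }
    if (out.size() > limit) out.resize(limit);
    cont = cur;
    return out;
}
```

A caller would loop, feeding each returned cont back in as the next head, until cont comes back empty.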

Other helper methods in Erref

Erref has a few more helper methods:

bool v = err.hasError();

Checks that the reference is not empty and the error code in it is not 0. The code 0 (ERROR_SUCCESS) can be used to indicate some warning or informational message without an error. In practice, this didn't work out too well. It's not flexible enough to indicate the warning or informational (or verbose or debug) level, and the presence of errors doesn't propagate all the way up through the chain (unlike the error handling in Triceps). And it just doesn't mesh with the concept of each message having its code, even if it's an informational message, so it doesn't coexist well with MUI. This is something to consider for the future; for now, when I need messages of different levels, I keep them in separate chains:

SomeFunction(x, y, Erref &err, Erref &warn, Erref &info);

The next helper method returns the code from the referenced error:

DWORD c = err.getCode();

If err is NULL, it returns 0 (ERROR_SUCCESS).

The next error in the chain can be read with:

Erref tail = err.getChain();

As usual, it's safe to call if err is NULL, and will return NULL in this case.

And if you know that the message contains a chained message of a known type in it (such as if it was created with mkSystem() or mkMuiSystem()), you can get the nested error code in one go:

DWORD c = err.getChainCode();

I've already mentioned the copying:

Erref err2 = err.copy();

Copying comes in handy if you want to use an error in two places, chaining it to two chains. Just chaining it to two chains will cause the two chains to merge, with all kinds of confusing side effects. Instead make a copy for one chain, and use the original error for the other one. The copying is full-depth: it copies the whole chain growing from the error message.
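A full-depth copy of a chain can be sketched like this. CopyMsg, CopyRef and deepCopy() are hypothetical names for illustration, not the library's API; the point is only that every node gets duplicated, so later chaining onto the copy cannot affect the original.

```cpp
#include <memory>
#include <string>

// Hypothetical, simplified message with a code, a text and the next link.
struct CopyMsg {
    unsigned long code;
    std::wstring text;
    std::shared_ptr<CopyMsg> next;
};
using CopyRef = std::shared_ptr<CopyMsg>;

// Duplicate every message in the chain so that the result shares no
// nodes with the original. An empty chain copies to an empty chain.
CopyRef deepCopy(const CopyRef &head)
{
    CopyRef result;
    CopyRef *tail = &result; // where the next duplicated node goes
    for (const CopyMsg *m = head.get(); m; m = m->next.get()) {
        *tail = std::make_shared<CopyMsg>(CopyMsg{m->code, m->text, nullptr});
        tail = &(*tail)->next;
    }
    return result;
}
```

With a shallow copy (just another reference to the same head), appending to one chain would silently extend the other; with the deep copy each chain grows independently.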

And for a simple test program, the following method comes in handy:

err.printAndExitOnError();

If the code in the error is not 0, this method converts it to string, prints to stdout, and exits with the value 1.

Errno errors

The errors from the C standard library (errno) are completely independent from the normal Windows errors. errno uses the same numeric values with completely different meanings. Its translation to strings is also different. This is where the concept of different namespaces as defined by the error sources gets useful. A special pre-defined source ErrnoSource (with name "Errno") can be used to report the errno errors. There are also a couple of helper static methods in the class ErrorMsg:

static std::shared_ptr<ErrorMsg> mkErrno(DWORD code);
static std::shared_ptr<ErrorMsg> mkErrno(); // calls _get_errno() to get the code

They create the errors from the ErrnoSource and include the human-readable text of the error. The function _get_errno() is the nicer thread-safe way to get the value of the errno for the current thread. The typical use goes like this:

if ( (f = fopen(filename, "rt")) == NULL) {
  err = ErrorMsg::mkErrno();
  err.wrap(MyMuiErrorSource.mkMui(MUI_INPUT_FILE_OPEN_FAILED, filename));
  return;
}
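For comparison, the same pattern can be sketched in portable C++ without the library. Everything here is an assumption made for illustration: ErrnoMsg, makeErrnoMsg() and checkOpen() are hypothetical names, the strings are narrow rather than wide, the application-level code 1 is made up, and the errno text comes from std::generic_category() rather than from whatever the library uses internally.

```cpp
#include <cerrno>
#include <cstdio>
#include <memory>
#include <string>
#include <system_error>

// Hypothetical, simplified message that also names its code space.
struct ErrnoMsg {
    std::string source;  // name of the code space, e.g. "Errno"
    int code;
    std::string text;
    std::shared_ptr<ErrnoMsg> next;
};

// Build a message for an errno value, with the text taken from the
// C++ standard library's generic (errno) category.
std::shared_ptr<ErrnoMsg> makeErrnoMsg(int code)
{
    auto msg = std::make_shared<ErrnoMsg>();
    msg->source = "Errno";
    msg->code = code;
    msg->text = std::generic_category().message(code);
    return msg;
}

// Try to open a file; on failure return a two-message chain with the
// application-level explanation first and the errno detail chained to it.
// Returns an empty pointer on success.
std::shared_ptr<ErrnoMsg> checkOpen(const char *path)
{
    errno = 0;
    std::FILE *f = std::fopen(path, "r");
    if (f != nullptr) {
        std::fclose(f);
        return nullptr;
    }
    int saved = errno; // capture before any other call can change it
    auto top = std::make_shared<ErrnoMsg>();
    top->source = "MyModuleName";
    top->code = 1; // hypothetical application-level code
    top->text = std::string("failed to open the input file \"") + path + "\"";
    top->next = makeErrnoMsg(saved);
    return top;
}
```

Note that fopen() reports failure by returning a null pointer, and that errno is saved right away, before building the messages.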

Internal errors

The error subsystem itself may experience errors. For example, if the MUI file is not present, it won't be able to print the MUI messages. In this case the original error will be left with the proper error and source but an empty text, and the explanation of the internal error will be chained to it. The internal errors are reported on the private error source, with the name "ErrorMsg".

Other methods of ErrorMsg

If you want to define some wrappers for the ErrorMsg creation, you would need to use varargs, and then pass the va_list to the low-level ErrorMsg factory methods. They are:

mkStringVa();
mkSystemVa();
mkMuiVa();
mkMuiSystemVa();

I won't describe them here in detail, just so that you know that they exist if you need them.

Helper functions for formatted printing to C++ strings

I'm very surprised that we don't have standard functions that work like printf(), only produce a C++ string. The <iostream> formatted output outright sucks. So at every place I end up writing a version of printf() that produces a C++ string. The error library uses them, so I've got them included, and you can as well use them directly:

// Like [v]sprintf() but returns a string with the result.
std::wstring __cdecl wstrprintf(
    _In_z_ _Printf_format_string_ const WCHAR *fmt,
    ...
    );
std::wstring __cdecl vwstrprintf(
    _In_z_ const WCHAR *fmt,
    __in va_list args);

// Like wstrprintf, only appends to an existing string object
// instead of making a new one.
void __cdecl wstrAppendF(
    __inout std::wstring &dest, // destination string to append to
    _In_z_ _Printf_format_string_ const WCHAR *fmt,
    ...
    );
void __cdecl vwstrAppendF(
    __inout std::wstring &dest, // destination string to append to
    _In_z_ const WCHAR *fmt,
    __in va_list args);

An interesting point of their implementation on Windows is that unlike the standard snprintf(), _vsnwprintf_s() doesn't return the needed buffer size if the string is longer than the available buffer. Instead the buffer size has to be pre-calculated with _vscwprintf().
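For readers without the attached file at hand, here is a portable sketch of the same idea. It is not the library's implementation: formatWide() and vformatWide() are hypothetical names, and instead of pre-calculating the size with _vscwprintf() this version uses the standard vswprintf() and retries with a bigger buffer, since vswprintf() also reports only a failure, not the needed size, when the buffer is too small. The 1M-character cap is an arbitrary assumption.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical portable stand-in for a va_list-based formatter:
// formats into a growing buffer until the result fits.
std::wstring vformatWide(const wchar_t *fmt, va_list args)
{
    std::vector<wchar_t> buf(64);
    for (;;) {
        va_list copy;
        va_copy(copy, args); // each formatting attempt consumes a va_list
        int n = std::vswprintf(buf.data(), buf.size(), fmt, copy);
        va_end(copy);
        if (n >= 0 && static_cast<std::size_t>(n) < buf.size())
            return std::wstring(buf.data(), static_cast<std::size_t>(n));
        if (buf.size() > (1u << 20))
            return std::wstring(); // give up: bad format or absurd size
        buf.resize(buf.size() * 2);
    }
}

// The variadic front end only collects the arguments and delegates,
// which is the same pattern the ...Va() factory methods are meant for.
std::wstring formatWide(const wchar_t *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    std::wstring res = vformatWide(fmt, args);
    va_end(args);
    return res;
}
```

A call such as formatWide(L"%d-%ls", 42, L"abc") produces the string "42-abc". The split into a variadic function and a va_list function is what lets other wrappers reuse the formatting without re-parsing their own arguments.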

The internals of reading the MUI strings

Reading the localized strings from the .mui file is another somewhat interesting subject. I haven't found a ready recipe anywhere and had to piece it together by myself.

It starts by getting the handle of the loadable module (DLL or EXE) where the strings are defined:

        if (!GetModuleHandleExW(
            GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS
            | GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
            anchor,
            &m)
        ) {
            // handle the error ... 
        }

        muiModule_ = m;

Here anchor is any address within the DLL's code or static data. I use the address of the MuiSource object for this purpose, since it obviously would be defined in the same module where the strings are defined.

Then this handle can be fed to FormatMessageW:

        DWORD res = FormatMessageW(
            FORMAT_MESSAGE_ALLOCATE_BUFFER
            | FORMAT_MESSAGE_FROM_HMODULE,
            source->muiModule_,
            code,
            MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
            (LPWSTR)&buf,
            0, &args);

By the way, there is another special kind of errors that could use their own handling, the WMI errors. The errors from the WMI subsystem can be translated by reading them from the module C:\Windows\System32\wbem\wmiutils.dll. The module handle for it can be obtained with LoadMUILibraryW(). But I haven't gotten around yet to adding the support for it, similar to errno.


ErrorHelpers.txt
