Should I check the parameters to my function?


I just had an interesting discussion with one of the testers in my group.

He had just finished filing a series of bugs against our components because they weren’t failing when he passed bogus pointers to the API.  Instead, they raised a 0xC0000005 exception and crashed his application.

The APIs did fail if he passed a null pointer in, with E_POINTER. 

But he felt that the API should check all the bogus pointers passed in and fail with E_POINTER if the pointer passed in didn’t point to valid memory.

This has been a subject of a lot of ongoing discussion over the years internally here at Microsoft.  There are two schools of thought:

School one says “We shouldn’t crash the application on bogus data.  Crashing is bad.  So we should check our parameters and return error codes if they’re bogus”.

School two says “GIGO – if the app hands us garbage, then big deal if it crashes”.

I’m firmly in the second camp (not surprisingly, if you know me).  There are a lot of reasons for this.  The biggest one is security.  The way you check for bad pointers on Win32 is by calling the IsBadReadPtr and IsBadWritePtr API.  Michael Howard calls these APIs “CrashMyApplication” and “CorruptMemoryAndCrashMySystem” respectively.  The problem with IsBadReadPtr/IsBadWritePtr is that they do exactly what they’re advertised as doing:  They read and/or write to the memory location specified, with an exception handler wrapped around the read/write.  If an exception is thrown, they fail, if not, they succeed.

There are two problems with this.  The only thing that IsBadReadPtr/IsBadWritePtr verifies is that at the instant that the API is called, there was valid memory at that location.  There’s nothing to prevent another thread in the application from unmapping the virtual address passed into IsBadReadPtr immediately after the call is made.  Which means that any error checks you made based on the results of this API aren’t valid (this is called out in the documentation for IsBadWritePtr/IsBadReadPtr).

The other one is worse.  What happens if the memory address passed into IsBadReadPtr is a stack guard page (a guard page is a page kept at the bottom of the stack – when the system top level exception handler sees a fault on a guard page, it will grow the threads stack (up to the threads stack limit))?  Well, the IsBadReadPtr will catch the guard page exception and will handle it (because IsBadReadPtr handles all exceptions).  So the system exception handler doesn’t see the exception.  Which means that when that thread later runs, its stack won’t grow past the current limit.  By calling IsBadReadPtr in your API, you’ve turned an easily identifiable application bug into a really subtle stack overflow bug that may not be encountered for many minutes (or hours) later.

The other problem with aggressively checking for bad parameters on an API is that what happens if the app doesn’t check the return code from the API?  This means that they could easily have a bug in their code that passes a bogus pointer into IsBadWritePtr, thus corrupting memory.  But, since they didn’t check the return code, they don’t know about their bug.  And, again, much later the heap corruption bug that’s caused by the call to IsBadWritePtr shows up.  If the API had crashed, then they’d find the problem right away.

Now, having said all this, if you go with school two, you’ve still got a problem – you can’t trust the user’s buffers.  At all.  This means you’ve got to be careful when touching those buffers to ensure that you’re not going to deadlock the process by (for instance holding onto a critical section while writing to the user’s buffer).

The other thing to keep in mind is that there are some situations where it’s NOT a good idea to crash the user’s app.  For example, if you’re using RPC, then RPC uses structured exception handling to communicate RPC errors back to the application (as opposed to API return codes).  So sometimes you have no choice but to catch the exceptions and return them.  The other case is if someone has written and shipped an existing API that uses IsBadReadPtr to check for bad pointers on input, it may not be possible to remove this because there may be applications that depend on this behavior.

So in general, it’s a bad idea to use IsBadXxxPtr on your input parameters to check for correctness.  Your users may curse you for crashing their app when they screw up, but in the long term, it’s a better idea.

Comments (25)

  1. Anonymous says:

    IMHO, the user shouldn’t be able to crash the app. The app should verify all information from any untrustworthy source (user input, file, network, etc.) and provide feedback when the data is corrupt. This makes it a "firewall" of sorts. The app itself is trustworthy, or at least it should be once it’s debugged.

    The application is better equipped to deal with errors than API’s, because (1) it knows where the data comes from, and (2) it has a feedback mechanism.

  2. Anonymous says:

    Maybe I wasn’t clear (actually I wasn’t clear).

    I’m referring to an API function in this case, not a function within a larger module.

    And it’s critical that an app validate ALL of it’s input, regardless of where it comes from (command line, network socket, file read from disk, web form).

    If it doesn’t, then it leaves itself open to security holes.

  3. Anonymous says:

    I guess what you’re really asking (or what you seem to be asking) is not "should I check my input parameters" but rather "should I throw exceptions that could potentially crash my caller, or should I return error codes which might be ignored." The pros and cons of these approaches have been discussed recently. I like to throw exceptions, as it seems you also do.

  4. Anonymous says:

    Why doesn’t IsBadReadPtr() detect stack guard pages and return a result indicating it as bad?

  5. Anonymous says:

    Joseph: Exactly. It goes to how much verification of an input is done in an API.

    rentzsch: How would IsBadReadPtr() detect them?

    I guess it could filter out guard pages exceptions and pass them on, but I’m not sure that’s enough. And it doesn’t change the fact that IsBadWritePtr works by corrupting memory (it reads the value, writes a new value, and then writes the original value back).

  6. Anonymous says:

    Calling IsBadWritePtr() indiscriminately can also break multithreaded code, as it is not thread-safe — the XP implementation simply does the equivalent of x=*p; *p=x;. A thread-safe implementation would use InterlockedXor(p, 0). That having been said, it’s hard to imagine a possible case that wouldn’t also be a race condition.

    My response to requests for debug checks like this is that debugging features are resources, like memory and disk space. You can only put so many checks, asserts, and debug print statements before the costs of increased maintenance and lost productivity due to unwieldy builds start to add up. That means that such features to be carefully allocated for maximum benefit and minimum cost. Indiscriminately peppering code with checks where errors are unlikely simply wastes development resources.

  7. Anonymous says:

    I look at it this way. The implementation of the API is obligated to check for any well formed but invalid parameters. Examples of this are enumerations out of range, undefined flags being set, a pointer to a required block of memory which is left as NULL.

    But you can’t protect yourself against all invalid parameters. Just because it points to a block of memory, is it really a string? Do I /have/ to deal with it being changed out from under me?

    Where does it end?

    The #1 most important reason to validate inputs is to prevent your code from being tricked into making an improper state transition.

    The #2 most important reason to validate parameters is to prevent people from writing code which depends on your laxness in validation.

    A good anecdote here is that the vast majority of Win9x -> NT compatibility bugs in games was because the Win9x DirectX implementation (at least some large portions of it) did not validate input parameters under the guise of "it was too slow". Lo and behold, on NT, the drivers did have to validate parameters because it wasn’t cool for the games to actually you know crash the system. There’s a big fat app compat shim that the app compat folks wrote that basically does nothing but convert invalid parameter error codes from the DX APIs to success and then a lot of games mysteriously start working.

    I’m told that most of the problems are actually around uninitialized memory/values being passed, so it’s not necessarily the fact that they’re trying to probe for whether a hardware accelerated feature is there or not.

    Re: IsBadReadPtr:

    To really make it reliable and correct would almost certainly involve a kernel transition and synchronization with all the page protection mechanisms.

    In practice, it’s a lot of work and it still doesn’t really get you to the idyllic world where there are no malformed pointers or data structures passed around. Arguably a system like the JVM or CLR gets you there but pointer/reference validity is really the smallest of problems.

    Finally, actually letting the real access violation occur, unless someone is a super genius rocket science and writes something like "__try { … } __except (EXCEPTION_EXECUTE_HANDLER) { return E_POINTER; }", it’ll actually hit the unhandled exception filter and let an OCA minidump get reported or the JIT debugger invoked.

    Some problems are very attractive to try to "solve" but it turns out that almost all the "solutions" make the situation worse than before you started trying.

  8. Anonymous says:

    It’s a toughie, to be sure.

    My own personal mantra for error handling is this:

    Either fail loudly, or not at all. Except in debug mode – where you should do basic validation on all function parameters.

    For example, what happens if either of these fail? What do you do?

    CloseHandle (not much you can do if the file didn’t close correctly…)

    DeleteObject() (not much you can do if your font or brush can’t be deleted).

    Although, to be frank, error handling still bugs the heck out of me. It’s a never ending job. I have the feeling that like optimization, it’s a halting problem that one can only ever approximate a solution to, but never get a full and complete one.

  9. Anonymous says:

    It is because of this point exactly that I prefer to pass by reference

    ie

    bool func( cSomeClass &inClass );

    rather than

    bool func( cSomeClass *inClass );

    I can’t be passed bogus data without some fancy C++ fanangling.

    Gregor

  10. Anonymous says:

    An interesting point Gregor. But you can’t do that in C.

    Which kinda leaves it out when you’re writing APIs 🙁

    For managed code, it’s more reasonable (since IL exposes reference parameters as a top level construct) but in managed code, there’s a strong exception handling framework, and you can pretty safely assume you’re not being handed a bad reference.

  11. Anonymous says:

    Do IsBadReadPtr() and IsBadWritePtr()actually need to read and write to memory ? Can’t they just check some attributes stored in the memory manager ? On x86 CPUs, it would be global/local descriptors. But you’re right, they could change right after checked them.

  12. Anonymous says:

    B.Y.: They could, but performance would suck bigtime.

    Basically instead of reading memory, IsBadXxxPtr would have to call VirtualQuery.

    Now consider the difference in performance between reading memory and calling VirtualQuery.

  13. Anonymous says:

    Well, that depends on the size of the memory. You could have a huge block of memory described by one descriptor.

  14. Anonymous says:

    I think APIs should also crash on NULL pointer parameters. Why return E_POINTER for one particular bad pointer but not others? I’ve seen seen some implementations of strlen() (and friends) that accept a NULL pointer and return string length 0. This can lead to subtle bugs because now strlen(NULL) == strlen("").

    Don’t bog down your API with extra checking. The caller should not be asking your API to do "NULL work". If it is a no-op, then the caller should not even call your API. However, out of convenience, I do think that free(NULL) should not crash. <:)

  15. Anonymous says:

    Re: not checking for NULL:

    the problem is that a lot of parameters aren’t even referenced in all situations so you might be able to go along for a very long time passing NULL to a "required" parameter because you don’t happen to hit the code path.

    I always write my code like:

    returntype foo(…) {

    // Initialize out-only parameters

    // Validate input parameters

    // do work

    }

    I consider it a bug in my code if I can cause an invalid parameter error to be returned by a layer below me. (Contrasted where this is commonly used as a version/capability check, or alternatively they are always just propagated out in which case your caller is left scratching their head wondering which parameters they passed in were bad.)

  16. Anonymous says:

    I had a test app that scanned a small block of memory and wrote out the content. It called IsBadReadPtr first and wrote out XX instead if it was unreadable. Using this, I found the edge of space allocated to the stack. On the second pass, the memory had been committed. (This was on 2000.)

    In retrospect, I was probably looking at one page below the guard, and IsBadReadPtr triggers a stack expansion and returns success.

    So I don’t think IsBadReadPtr does wreck stack guards, but it still can’t properly validate arbitrary pointers.

  17. Anonymous says:

    I believe that the pages that are marked as guard pages are physically mapped into memory, which is why it worked. But if you kept on going, the thread’s stack wouldn’t be able to expand further.

  18. Anonymous says:

    I have an implementation of IsBadReadPointer and IsBadWritePointer which is safe. Incredibly slow, but safe. (There is an obscure retail-build code path in the script engines which must check to see if a page is bad, the most likely case is that it IS bad, and we don’t want to risk either corruption or throwing first-chance exceptions that will confuse the ASP stress testers.)

    Of course, no check can determine that a page is valid now but is going to be invalid a few nanoseconds in the future. But my program doesn’t crash or corrupt memory or issue irritating first-chance exceptions.

    See the comments about assertions at

    http://weblogs.asp.net/ericlippert/archive/0001/01/01/105329.aspx

    for details.

  19. Anonymous says:

    > What happens if the memory address passed into IsBadReadPtr is a stack guard page

    There’s a version of that situation that’s even worse!

    What if the pointer is the LAST guard page, beyond which the stack cannot grow? The rule is that the first time it’s hit, you get an "out of stack" exception. You are then required to handle that exception and do some magic to tell the guard page that you took care of the problem.

    IsBadFooPtr does not reset the guard page when it catches the exception. Perhaps you see where this is going…

    The second time you hit the guard page — which will likely be soon if you’ve been passed a pointer to the end of the stack by some bozo — the operating system does not futz around. The operating system says "we told you once that you were out of stack! Fair warning!" It says that by immediately terminating the process, with no dialog boxes or anything. Just boom, gone.

    If you’ve ever been using an application that has a short pause and then completely disappears, a likely cause is that some unbounded recursion has hit the guard page twice in a row.

  20. Anonymous says:

    Guess that’s what was happening to Visual Studio 6 when it found the code too complex… It would just die peacefully and quietly, no crash’n’burn, no “Send my remnants home”, no “Restart and recover documents”. Save often.

    VS.NET is so much more stable.

  21. Anonymous says:

    In Windows Vista and IE7 we have changed the parameter validation code in WinInet to be more consistent…