Are you catching falling knives?

Article
09/06/2012

as I thumb through some people's code on Github, I see a fairly large number of "catch all" exception handling cases. It's difficult to blame folks for that, since there's generally (and sadly) very little discipline about exception contracts and exception masking, i.e. wrapping exceptions to avoid bubbling through failure conditions of underlying implementation details.

If you're calling a function and that sits on a mountain of dependencies and folks don't care about masking exceptions, there are many dozens of candidate exceptions that can bubble back up to you and there's little chance to deal with them all or even knowing them. Java has been trying to enforce more discipline in that regards, but people cheat there with "catch all" as well. There's also a question what the right way tot deal with most exceptions is. In many cases, folks implement "intercept, shrug and log" and mask the failure by telling users that something went wrong. In other common cases, folks implement retries. It's actually fairly rare to see deeply customized and careful reactions to particular exceptions. Again - things are complicated and exceptions are supposed to be exceptional (reminder: throwing exceptions as part of the regular happy path is horrifingly bad for performance and terrible from a style perspective), so these blanket strategies are typically an efficient way of dealing with things.

That all said ...

Never, never ever do this:

 try
{
    Work();
}
catch
{
}

And not even this:

 try
{
    Work();
}
catch(Exception e)
{
    Trace.TraceError(e.ToString());
}

Those examples are universally bad. (Yes, you will probably find examples of that type even in the archive of this blog and some of my public code. Just goes to show that I've learned some better coding practices here at Microsoft in the past 6 1/2 years.)

The problem with them is that they catch not only the benign stuff, but they also catch and suppress the C# runtime equivalent of the Zombie Apocalypse. If you get thread-abort, out-of-memory, or stack-overflow exceptions thrown back at you, you don't want to suppress those. Once you run into these, your code has ignored all the red flags and exhausted its resources and whatever it was that you called didn't get its job done and likely sits there as a zombie in an undefined state. That class of exceptions is raining down your call stack like a shower of knife blades. They can't happen. Your code must be defensively enough written to never run into that situation and overtax resources in that way; if it does without you knowing what the root cause is, this is an automatic "Priority 0", "drop-everything-you're-working-on" class bug. It certainly is if you're writing services that need to stay up 99.95%+.

What do we do? if we see any of those exceptions, it's an automatic death penalty for the process. Once you see an unsafe out-of-memory exception or stack overflow, you can't trust the state of the respective part of the system and likely not the stability of the system. Mind that there's also a "it depends" here; I would follow a different strategy if I was talking about software for an autonomous Mars Rover that can't crash even if its gravely ill. There I would likely spend a few months on the exception design and "what could go wrong here" before even thinking about functionality, so that's a different ballgame. In a cloud system, booting a cluster machine that has the memory flu is a good strategy.

Here's a variation of the helper we use:

 public static bool IsFatal(this Exception exception)
{
    while (exception != null)
    {
        if (exception as OutOfMemoryException != null && exception as InsufficientMemoryException == null || exception as ThreadAbortException != null || 
            exception as AccessViolationException != null || exception as SEHException != null || exception as StackOverflowException != null)
        {
            return true;
        }
        else
        {
            if (exception as TypeInitializationException == null && exception as TargetInvocationException == null)
            {
                break;
            }
            exception = exception.InnerException;
        }
    }
    return false;
}

If you put this into a static utility class, you can use this on any exception as an extension. And whenever you want to do a "catch all", you do this:

 try
{
    DoWork();
}
catch (Exception e)
{
    if (e.IsFatal())
    {
        throw;
    }
    Trace.TraceError(..., e);
}

If the exception is fatal, you simply throw it up as high as you can. Eventually it'll end up on the bottom of whatever thread they happen on (where you might log and rethrow) and will hopefully take the process with it. Threads marked as background threads don't do that, so it's actually not a good idea to use those. These exceptions are unhandled, process-terminating disasters with a resulting process crash-dump you want to force in a 24/7 system so that you can weed them out one by one.

(Update) As Richard Blewett pointed out after reading this post, the StackOverflowException can't be caught in .NET 2.0+, at all, and the ThreadAbortException automatically rethrows even if you try to suppress it. There are two reasons for them to be on the list: first, to shut up any code review debates about which of the .NET stock exceptions are fatal and ought to be there; second, because code might (ab-)use these exceptions as fail-fast exceptions and fake-throw them, or the exceptions might be blindly rethrown when marshaled from a terminated background thread where they were caught at the bottom of the thread. However they show up, it's always bad for them to show up.

If you catch a falling knife, rethrow.

Are you catching falling knives?

Additional resources