Crashes Are Bad, OK?

It's interesting to see what happens when you get slashdotted…

Let's go back and see what I said in the first place, and let me elaborate just a little – if the code crashes, we have roughly the following scenarios:

  1. It's exploitable, customers aren't happy, you ship a bulletin.
  2. Maybe it is exploitable, maybe not.
  3. It wouldn't be exploitable even if Halvar Flake and David Litchfield worked together on it, but it is a flaw.
  4. You decided that the best way to deal with unexpected input is to just take down the app.

To those of you who pointed out that crashes are always bad code – you're right, and I said so in my first post here. I've said it over and over. "Writing Secure Code" got its name because I have always admired "Writing Solid Code". A crash is a mistake, it gives people a lousy user experience, and it isn't the quality the customer expects. However, just because something crashes does NOT mean it constitutes an exploit.

I used to be in the exploit business, first by developing a network assessment tool, and second by having a job as an internal penetration tester here. IMNSHO, DoS attacks are for ankle biters, with one exception – if you can take out a server long enough to spoof it, and then use that to gain an escalation of privilege, that's a real exploit. Real exploits gain privilege, gain information, or deprive someone of a needed resource. Just causing a client application to fail is about as much an exploit as ringing someone's doorbell and then running away. It's a childish nuisance, not an attack.

To those of you who thought code should never crash, I'd suggest you read some of Watts Humphrey's work – that simply isn't reality. If you've never shipped a bug, you probably haven't shipped much code. The best programmer I've ever worked with had a bug one time because he failed to check the return value from printf, and yes, it was a real problem for that app. Code should never crash, but mistakes will happen. We'd all prefer to have code that robustly handles all inputs, but the reality is that flaws will happen, some will be crashes, and if something does crash, that's why we have mechanisms like Watson and document recovery.
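For anyone wondering what checking the return from printf even looks like, here is a minimal sketch. It is not the code from that app; write_line is a hypothetical helper made up for illustration. The printf family returns a negative value on an output error, and because stdio is usually buffered, a failure may only show up when the stream is flushed, so the sketch checks both.

```cpp
#include <cstdio>

// Hypothetical helper (not from the original app): writes one line to `out`
// and reports whether the write, including the flush, actually succeeded.
bool write_line(std::FILE* out, const char* text) {
    // fprintf returns the number of characters written, or a negative
    // value if an output or encoding error occurred.
    int written = std::fprintf(out, "%s\n", text);
    if (written < 0) {
        return false;
    }
    // Buffered streams can defer the real write, so flush and check that too.
    return std::fflush(out) == 0;
}
```

A caller can then decide what a failed write means for it (retry, log somewhere else, or stop) instead of carrying on as if the output had landed.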

In fact, I have a funny story about things handling errors in interesting ways – just before we shipped the first version of the Internet Scanner for NT, we had a problem where the app would just disappear. It didn't repro consistently, but we could make it happen. After a lot of investigation, we found that the X11 library would just call exit() if things got too bad, which of course would take our whole app down. I did my best to make a solid fix, but that code is a mess, and it just wasn't practical. We ended up replacing exit() with a special exception, and if we caught it, we knew what happened.
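The exit() replacement can be sketched roughly like this. This is an illustration under stated assumptions, not the actual Internet Scanner code: ExitRequested, library_exit, library_draw and run_guarded are all hypothetical names, and the sketch assumes the library source is built as part of the same C++ program so that an exception can safely unwind through it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical "special exception" that carries the status the library
// would have passed to exit().
struct ExitRequested : std::runtime_error {
    int status;
    explicit ExitRequested(int s)
        : std::runtime_error("library requested exit with status " +
                             std::to_string(s)),
          status(s) {}
};

// Hypothetical stand-in for the call that replaces exit() inside the library.
[[noreturn]] void library_exit(int status) {
    throw ExitRequested(status);
}

// Hypothetical library routine that gives up when things get too bad.
void library_draw(bool things_got_too_bad) {
    if (things_got_too_bad) {
        library_exit(3);  // previously this would have been exit(3)
    }
}

// Application-side wrapper: returns 0 on success, or the status the
// library tried to exit with, so the app knows what happened.
int run_guarded(bool things_got_too_bad) {
    try {
        library_draw(things_got_too_bad);
        return 0;
    } catch (const ExitRequested& e) {
        return e.status;
    }
}
```

The point of the design is that the application, not the library, decides whether the process goes away. Whether the library is left in a usable state after being interrupted like this is a separate question the sketch does not answer.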

More on this later…