Watson's revenge

My dear readers, I have a terrible admission to make.  But
it's time to come clean with you.  The fact is, of our hundreds of thousands
of users, a small number encounter crashing bugs in Visual Studio.  They
are working happily along, and *boom* some terrible crash will occur.  Then,
a much-dreaded dialog will come up, saying:

“Microsoft Development Environment has encountered
a problem and needs to close. We
are sorry for the inconvenience.”

And then a little farther down it says…

“Please tell Microsoft about this problem.

We have created an error report that you can
send to help us improve Microsoft Development Environment. We
will treat this report as confidential and anonymous.

To see what data this error report contains, click here.”

Then you see three buttons: Debug, Send Error Report, and Don’t Send.

[It occurs to me that the lack of the word “The” in front of “Microsoft Development
Environment” above makes it look like English is our second language.  I’ll
have to bug somebody about that…]

Obviously, in a perfect world users would never see this dialog.  It
means that our users had VS crash, possibly losing data.  During
product development we treat any bug that could make VS crash or otherwise lose data
as extremely serious, and few such bugs make it into the shipping product.  But
it does, alas, happen.

So, what does this dialog do, and what does it all mean?  We
call this the “Watson” dialog (not to be confused with the Dr. Watson
tool, which is related but different).  The
dialog exists so that when a user hits a crash, if they choose to hit the “Send Error
Report” button then a condensed stack dump, which we call a minidump,
gets sent to a server at Microsoft.  If
you’re curious or paranoid, click where it says “click here” and you can see exactly
what’s going to be sent to our server.

Once that happens, you can hopefully go on with your work.  On
our side, however, the work is just beginning.  We
have people who go through the reported crash dumps and then open bugs in the main
product bug databases so that our developers can look at the minidumps and try to
figure out why the product crashed.  This
is an extremely unpleasant job, because the information in the minidump is so scant,
and sometimes we can’t piece together the cause of the crash.  Often,
however, we can figure out what went wrong from the call stack and make a fix.  If
a lot of people are hitting a crash in an already-shipped product then we consider
rolling the fix into a service pack.  If
the report is against a version under development (e.g. an alpha or beta release),
or if only a tiny number of people are hitting a crash in an already-shipped version,
we generally roll the fix into the version under development.

There is, of course, a certain amount of “cosmic justice” in making VS developers
suffer through the investigation of these Watson reports, since the pain of having
the product crash from under you is much worse.  And
it’s a great motivation to code carefully so as to avoid code paths that could lead
to crashes.  (Which we do already through
many mechanisms, but I figure every extra drop of motivation is a good thing.)

If you are ever unfortunate enough to see this dialog, please accept my apologies
in advance.  But if it does ever happen,
I hope you now understand what it means and I hope you will hit the “Send Error Report”
button so that we at Microsoft can get your crash report and investigate it.

That’s all for now! -Chris