Watson’s revenge


My dear readers, I have a terrible admission to make.  But
it’s time to come clean with you.  The fact is, of our hundreds of thousands
of users, a small number encounter crashing bugs in Visual Studio.  They
are working happily along, and *boom* some terrible crash will occur.  Then,
a much-dreaded dialog will come up, saying:

“Microsoft Development Environment has encountered
a problem and needs to close.  We
are sorry for the inconvenience.”

And then a little farther down it says…

“Please tell Microsoft about this problem.

We have created an error report that you can
send to help us improve Microsoft Development Environment.  We
will treat this report as confidential and anonymous.

To see what data this error report contains, click
here.”

Then you see three buttons: Debug, Send Error Report, and Don’t Send.

[It occurs to me that the lack of the word “The” in front of “Microsoft Development
Environment” above makes it look like English is our second language.  I’ll
have to bug somebody about that…]

Obviously, in a perfect world users would never see this dialog.  It
means that our users had VS crash, possibly losing data.  During
product development we treat any bug that could make VS crash or otherwise lose data
as extremely serious, and few such bugs make it into the shipping product.  But
it does, alas, happen.

So, what does this dialog do, and what does it all mean?  We
call this the “Watson” dialog (not to be confused with the Dr. Watson
tool, which is related but different).  The
dialog exists so that when a user hits a crash, if they choose to hit the “Send Error
Report” button then a condensed stack dump, which we call a minidump,
gets sent to a server at Microsoft.  If
you’re curious or paranoid, click where it says “click here” and you can see exactly
what’s going to be sent to our server.

Once that happens, you can hopefully go on with your work.  On
our side, however, the work is just beginning.  We
have people who go through the reported crash dumps and then open bugs in the main
product bug databases so that our developers can look at the minidumps and try to
figure out why the product crashed.  This
is an extremely unpleasant job, because the information in the minidump is so scant,
and sometimes we can’t piece together the cause of the crash.  Often,
however, we can figure out what went wrong from the callstack and make a fix.  If
a lot of people are hitting a crash in an already-shipped product then we consider
rolling the fix into a service pack.  If
the report is against a version under development (e.g. an alpha or beta release),
or if a very tiny number of people are hitting a crash in an already-shipped version,
we generally roll the fix into the version under development.
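
For the curious, here is a rough sketch of that triage rule in Python.  It is
only an illustration, not our actual tooling: the names (CrashReport, bucket_key,
triage), the idea of grouping reports by their top three stack frames, and the
threshold of 100 are all assumptions made up for the example.  The post itself
only distinguishes “a lot of people” from “a very tiny number”.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple

# Hypothetical cutoff; the real policy is a judgment call, not a fixed number.
SERVICE_PACK_THRESHOLD = 100


@dataclass(frozen=True)
class CrashReport:
    callstack: Tuple[str, ...]  # frames recovered from the minidump, innermost first
    shipped: bool               # True if the crash came from an already-shipped release


def bucket_key(report, depth=3):
    """Group reports by their top few stack frames (an assumed bucketing rule)."""
    return report.callstack[:depth]


def triage(reports, threshold=SERVICE_PACK_THRESHOLD):
    """Return a dict mapping each bucket to where its fix would likely go."""
    shipped_hits = Counter()
    buckets = set()
    for report in reports:
        key = bucket_key(report)
        buckets.add(key)
        if report.shipped:
            shipped_hits[key] += 1
    return {
        key: ("consider service pack"
              if shipped_hits[key] >= threshold
              else "version under development")
        for key in buckets
    }
```

Crashes reported only against an alpha or beta, and rare crashes in a shipped
release, both fall through to “version under development”, which matches the
two cases described above.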

There is, of course, a certain amount of “cosmic justice” in making VS developers
suffer through the investigation of these Watson reports, since the pain of having
the product crash from under you is much worse.  And
it’s a great motivation to code carefully so as to avoid code paths that could lead
to crashes.  (Which we do already through
many mechanisms, but I figure every extra drop of motivation is a good thing.)

If you are ever unfortunate enough to see this dialog, please accept my apologies
in advance.  But if it does ever happen,
I hope you now understand what it means and I hope you will hit the “Send Error Report”
button so that we at Microsoft can get your crash report and investigate it.

That’s all for now! -Chris

Comments (12)

  1. MartinJ says:

    <quote>sometimes we can’t piece together the cause of the crash</quote>

    What about if you provided the option for the end user to be able to track the problem? Something similar to what happens when Windows crashes. That way, if the cause can’t be determined, the end user could supply this information (either in a comment added to the problem, or by phone).

  2. Chris Flaat says:

    Good point — actually we already do this, I just failed to mention it. Thanks. -Chris

  3. MartinJ says:

    I guess that I’ve just been lucky and haven’t crashed the IDE, yet. (I liked the un-implemented (or badly implemented) features that were in beta 2. You got the great "LAME!" message.)

    Now, I’ve had to kill VS because it wasn’t happy with some XML schema I was making. I think the IDE went into an infinite loop somewhere. Yeah, my memory usage started to climb. But, with the insane amount in my machine, I had to kill it before it could run out or cause page thrashing…

  4. pUnk says:

    Are you kidding me? I see this dialog between 5-10 time a day! Every time I close the environment (VS2003) I get this, and the next time I launch all my toolbars and prefs are trashed. On another machine, it works fine. Yes, I’ve sent the error report multiple times.

  5. dan says:

    pUnk – can you post the bucket number for the crash you keep getting? It’s in the event log, the application one – for each crash there will be two Error entries, the second one should have an 8(?) digit bucket number. It’s possible I can track down whether this has been fixed yet if I have the bucket. I can also see if many people are hitting it, or just you 🙂 (not that that would be any consolation, I know) — Dan [ms]

  6. pUnk says:

    Thanks Dan! I haven’t sent the error for a while, but the most recent bucket number is 49178620.

  7. diana says:

    ok,, this has been going on for over a year now and no one in ms has helped me…..
    just give me a clue how i can translate this so i can fix it myself….. kb is for s%&(&*^t when it comes to giving meaningful advise……all it does is tell you your error message may be caused by a conflict……reinstall or upgrade…. geeeeeeee. we want meaningful troubleshooting info to do it ourselves. just tell us how to identify the codes and where on the system we can find the addresses so we know what devise / driver is at that location and do what is neccessary from there……..
    please help me on this

    i fixed all the other problems without reinstalling or upgrading.


    Windows 98 Version 4.10.2222

    IEXPLORE caused an invalid page fault in

    module KERNEL32.DLL at 0187:bff9da1a.

    Registers:

    EAX=020700a8 CS=0187 EIP=bff9da1a EFLGS=00010202

    EBX=00000000 SS=018f ESP=0206ff8c EBP=020700ac

    ECX=02070058 DS=018f ESI=020700cc FS=1aef

    EDX=0207011c ES=018f EDI=0207011c GS=0000

    Bytes at CS:EIP:

    50 51 ff 75 08 8d 85 e0 fe ff ff 50 ff 75 0c e8

    Stack dump:

  10. Raymond Chen says:

    Artificial intelligence has not yet reached the point where crashes can be reliably diagnosed by a computer. The information above is insufficient to determine exactly what went wrong but I can tell that the proximate cause was a stack overflow. I suspect some plug-in has a recursion bug and is blowing its stack.

  11. Adi says:

    BUZZZ.EXE is a program written in Visual C++ 6.0

    BUZZZ caused an invalid page fault in
    module BUZZZ.EXE at 0187:00401732.
    Registers:
    EAX=00000001 CS=0187 EIP=00401732 EFLGS=00010202
    EBX=00560000 SS=018f ESP=0066f6f8 EBP=0066f74c
    ECX=00000002 DS=018f ESI=816e4860 FS=68d7
    EDX=00000000 ES=018f EDI=0066f74c GS=0000
    Bytes at CS:EIP:
    39 4a 04 75 2c 8b 45 f8 83 c0 01 8b 4d 08 39 41
    Stack dump:
    0066fdf8 816e4860 00560000 cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc cccccccc

    WHAT IS THAT ERROR,PLS???? THANKS!!!
