Windows Error Reporting and Online Crash Analysis are your friends.


I normally don’t do “me too” posts, since I figure that most of the people reading my blog are also looking at the main weblogs.asp.net/blogs.msdn.com feed, but I felt obliged to chime in on this one.

A lot of people on weblogs.msdn.com have been posting this, but I figured I’d toss in my own version.

When you get a “your application has crashed, do you want to let Microsoft know about it?” dialog, then yes, please send the crash report in.  We’ve learned a huge amount about where we need to improve our systems from these reports.  I know of at least three different bug fixes that I’ve made in the audio area that came directly from OCA (Online Crash Analysis) reports.  Even if the bugs are in drivers that we didn’t write (Jerry Pisk commented about Creative Labs’ drivers here, for example), we still pass the info on to the driver authors.

In addition, we do data mining to see if there are common mistakes made by different driver authors and we use these to improve the driver verifier – if a couple of driver authors make the same mistake, then it makes sense for us to add tests to ensure that the problems get fixed on the next go-round.

And we do let 3rd party vendors review their data.  There was a chat about this in August of 2002 where Greg Nichols and Alther Haleem discussed how it’s done.  The short answer is you go here and follow the instructions.  You have to have a Verisign Class 3 code-signing ID to participate, though.

Bottom line: Participate in WER/OCA – Windows gets orders of magnitude more stable because of it.  As Steve Ballmer said:

About 20 percent of the bugs cause 80 percent of all errors, and — this is stunning to me — one percent of bugs cause half of all errors.

Knowing where the bugs are in real-world situations allows us to catch the high visibility bugs that plague our users that we’d otherwise have no way of discovering.
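
To get a feel for what that kind of skew looks like, here is a minimal Python sketch. The bug names and hit counts are invented for the example and were chosen to match the proportions in the quote; they are not real OCA data, and `share_of_top` is a hypothetical helper, not part of any WER tooling. It sorts bugs by how often they were hit and reports what share of all crashes the top few percent account for.

```python
def share_of_top(hits_per_bug, top_percent):
    """Share of all crash hits caused by the top `top_percent` percent of bugs."""
    counts = sorted(hits_per_bug.values(), reverse=True)
    # Integer ceiling of len(counts) * top_percent / 100, with at least one bug.
    top_n = max(1, -(-len(counts) * top_percent // 100))
    return sum(counts[:top_n]) / sum(counts)

# Made-up hit counts for 100 hypothetical bugs (1000 crash reports in total).
hits_per_bug = {"bug_000": 500}
for i in range(1, 19):
    hits_per_bug[f"bug_{i:03d}"] = 16
hits_per_bug["bug_019"] = 12
for i in range(20, 60):
    hits_per_bug[f"bug_{i:03d}"] = 3
for i in range(60, 100):
    hits_per_bug[f"bug_{i:03d}"] = 2

for percent in (1, 20):
    share = share_of_top(hits_per_bug, percent)
    print(f"top {percent}% of bugs -> {share:.0%} of crashes")
```

With a distribution like this one, fixing the single most frequently hit bug removes half of the reported crashes, which is why knowing the hit counts matters so much.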


Comments (30)

  1. Mike Dimmick says:

    It still costs a bomb to get the code-signing ID that you need, but I suppose you don’t want to send users’ dumps to just anybody.

    We still need more entrants in the PKI marketplace. FreeSSL is by far the cheapest for SSL certificates, but isn’t supported on Pocket PC, for example (I was looking at Exchange ActiveSync last week – having just set up my employer’s first Exchange server – nothing like running before you can walk!)

  2. Could MS do some sort of stability manager

    reporting uptimes and signalling non-certified software/drivers?

  3. PocketPC doesn’t let you add a new CA? I didn’t know that.

    You’re right that the cost of entry is high, but that barrier to entry is there to ensure that the person getting the dumps for an application is the legitimate owner of the application. It’d be really bad if Sun could sign up for Oracle’s crash reports.

  4. Catatonic says:

    Exactly, if it gets too easy to obtain a certificate, then why should I trust them anymore?

  5. Stefan,

    How would we be able to do this for non-certified software? If "client.exe" crashes, which vendor’s "client.exe" is it? The executable isn’t signed, so we don’t know that it’s Asheron’s Call that crashed (instead of someone’s LOB application).

  6. carlos says:

    1. Create a self-signed certificate.

    2. Use this certificate to sign your application.

    3. Issue your application.

    4. Use the certificate to sign a request to access the Windows Error Reporting data for your application.

    You have proved that the entity requesting the crash reports is the same entity that created the application. Verisign wasn’t involved (or any other CA) so it didn’t cost $400.

    This doesn’t work because Microsoft insist on a Verisign certificate, but this seems to be a commercial requirement rather than a technical requirement.

    I suspect that the real reason is to suppress demand for the crash reports – running this service can’t be cheap for Microsoft.

  7. Could be Carlos, but I’m not 100% sure about it.

    How can Microsoft trust your root CA? If the cert is self-signed, there’s no way of knowing who you are. At least if you’ve signed up for a code signing cert with Verisign, we can prove that you (a) have $700 to pay for the certificate and (b) have proven (to Verisign) that you really are who you purport to be.

    I don’t believe that Microsoft gets ANY money from this program (I may be wrong), and if that’s the case, why would Microsoft do this simply to enrich Verisign (since they’re the ones getting the money)?

    Funny, when I went to https://winqual.microsoft.com, I got a warning about an "unknown Certificate Authority". :)

    The only question I have about the $400 VeriSign fine is why is there only 1 company that can offer this? Surely VeriSign isn’t the only company equipped for this service.

  9. Shannon, I don’t know why the GE CyberTrust Root CA’s not installed in your browser, that’s the root CA for that site.

    I don’t know why VeriSign’s the only company that Microsoft’s contracted with to provide code signing certs, there are probably complicated business reasons I don’t understand behind it.

  10. carlos says:

    You don’t need to know who I am, you just need to know that I’m the same person who signed the software. You don’t need to trust my root certificate to know this.

    On reflection, I think the reasons for requiring a Verisign certificate are more practical. Most software in the field is either unsigned (in which case Microsoft needs a Verisign certificate to identify the business requesting the crash reports) or signed with a Verisign certificate (so one exists anyway.) There’s probably no-one but me signing software with a self-signed certificate :)

  11. Shannon,

    I checked the cert out in Firefox and it also complained that it couldn’t find the issuing certificate.

    But it also didn’t allow me the option of adding the Microsoft CA. What’s weird is that it doesn’t trust the Microsoft CA at all: I thought that SSL cert verification was supposed to walk the chain of trust to the root CA and trust a child CA if the parent CA is trusted.

    And in this case, the root CA (GE CyberTrust) is trusted in Mozilla.

    Very weird.

  12. Cesar Eduardo Barros says:

    I looked in Mozilla, and the problem is probably that the chain is not there.

    Looking at the certificate hierarchy, I only see winqual.microsoft.com, which is signed by "Microsoft Secure Server Authority". Mozilla might know the root CA, but it can’t know how to get from that certificate to the root CA, unless the server tells it (I know there is a way to make the server send both its certificate and other certificates in the chain, at least in Apache. I don’t know how to do it in IIS).

    So, the fix is to make the server send the full certificate chain, and then I suppose Mozilla will trust the server.

  13. I wonder why IE can walk the chain to the parent CA but firefox/mozilla can’t.

    IE claims that the issuer of the Microsoft SSA is Microsoft Internet Authority, and the issuer of that cert is the GE CA.
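
    That chain can be modelled in a few lines of Python. This is only a sketch: certificates are reduced to (subject, issuer) name pairs, no signatures are checked, and `build_chain` is a made-up helper for illustration, not how IE or Mozilla actually do it. It shows that a client can only link the server certificate to a trusted root if every issuer along the way is either sent by the server or already known locally.

```python
def build_chain(leaf, sent_by_server, trusted_roots, local_intermediates=()):
    """Walk issuer links from the leaf up to a trusted root.

    Certificates are (subject, issuer) name pairs. Returns the list of
    names on the path, or None if some issuer cannot be found.
    """
    known = {subject: issuer for subject, issuer in sent_by_server}
    known.update({subject: issuer for subject, issuer in local_intermediates})

    subject, issuer = leaf
    path = [subject]
    seen = {subject}
    while issuer not in trusted_roots:
        # Stop if the issuer's certificate is unavailable, or on a loop.
        if issuer not in known or issuer in seen:
            return None
        path.append(issuer)
        seen.add(issuer)
        issuer = known[issuer]
    path.append(issuer)
    return path

roots = {"GE CyberTrust Root CA"}
leaf = ("winqual.microsoft.com", "Microsoft Secure Server Authority")
intermediates = [
    ("Microsoft Secure Server Authority", "Microsoft Internet Authority"),
    ("Microsoft Internet Authority", "GE CyberTrust Root CA"),
]

# Server sends only its own certificate: the walk stops at the first issuer.
print(build_chain(leaf, [leaf], roots))
# Server sends the full chain: the walk reaches the trusted root.
print(build_chain(leaf, [leaf] + intermediates, roots))
```

    The same walk also succeeds if the intermediates come from a local store instead of the server (the `local_intermediates` argument), which is one way two clients could disagree about the same site.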

  14. carlos says:

    IE can walk the chain to the parent CA because IE ships with knowledge of "Microsoft Secure Server Authority" and "Microsoft Internet Authority".

    Look under "Intermediate Certification Authorities" in the certificate manager.

  15. carlos says:

    Whoops, I should think before I post…

    The two Microsoft CA certificates don’t appear in "Intermediate Certification Authorities" until you’ve visited a site that uses them.

  16. Mike Dimmick says:

    Pocket PC has no UI to let you add a new CA. It only has UI to let you view or delete CAs on Pocket PC 2003 (it’s also part of the supplied SHELL component in CE 4.x).

    You can add a new CA programmatically. See http://support.microsoft.com/default.aspx?scid=kb;en-us;Q322956. This obviously turns it into a deployment problem.

  17. Cesar Eduardo Barros says:

    Well, I don’t know how MSIE can walk the chain to the parent, but it isn’t there. I just checked with OpenSSL:

    $ openssl s_client -connect winqual.microsoft.com:443 -showcerts

    depth=0 /C=US/ST=Washington/L=Redmond/O=WHDC (Old WHQL)/OU=Microsoft/CN=winqual.microsoft.com

    verify error:num=20:unable to get local issuer certificate

    verify return:1

    depth=0 /C=US/ST=Washington/L=Redmond/O=WHDC (Old WHQL)/OU=Microsoft/CN=winqual.microsoft.com

    verify error:num=27:certificate not trusted

    verify return:1

    depth=0 /C=US/ST=Washington/L=Redmond/O=WHDC (Old WHQL)/OU=Microsoft/CN=winqual.microsoft.com

    verify error:num=21:unable to verify the first certificate

    verify return:1

    CONNECTED(00000003)



    Certificate chain

    0 s:/C=US/ST=Washington/L=Redmond/O=WHDC (Old WHQL)/OU=Microsoft/CN=winqual.microsoft.com

    i:/DC=com/DC=microsoft/DC=corp/DC=redmond/CN=Microsoft Secure Server Authority

    -----BEGIN CERTIFICATE-----

    […]

    -----END CERTIFICATE-----



    Server certificate

    subject=/C=US/ST=Washington/L=Redmond/O=WHDC (Old WHQL)/OU=Microsoft/CN=winqual.microsoft.com

    issuer=/DC=com/DC=microsoft/DC=corp/DC=redmond/CN=Microsoft Secure Server Authority



    No client certificate CA names sent



    SSL handshake has read 1444 bytes and written 324 bytes



    New, TLSv1/SSLv3, Cipher is RC4-MD5

    Server public key is 1024 bit

    SSL-Session:

    Protocol : TLSv1

    Cipher : RC4-MD5

    Session-ID: […]

    Session-ID-ctx:

    Master-Key: […]

    Key-Arg : None

    Start Time: […]

    Timeout : 300 (sec)

    Verify return code: 21 (unable to verify the first certificate)



    DONE

    Decoding the certificate it gave me above (openssl x509 -text) I get the same information Mozilla gives me and a bit more, but no copy of the issuer. The only suspicious thing in there is:

    Authority Information Access:

    CA Issuers - URI:http://www.microsoft.com/pki/mscorp/msssa1(1).crt

    CA Issuers - URI:http://corppki/aia/msssa1(1).crt

    Getting that URI gives me a blank HTML page with a 0.1 second redirect to itself. (The CRL one seems valid, however.)

    So, nothing either in the SSL/TLS protocol or in the certificate itself gives me (or my browser) any clues as to where the intermediate certificates are. Windows probably has them built-in, and that’s why MSIE can do the verification.
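
    For anyone who wants to repeat the check, a small Python sketch for pulling the "CA Issuers" entries out of `openssl x509 -text` output. The `ca_issuer_uris` helper is a made-up name, and the regular expression assumes the text layout shown above; it does no real ASN.1 parsing.

```python
import re
from urllib.parse import urlsplit

def ca_issuer_uris(x509_text):
    """Return the 'CA Issuers' URIs found in `openssl x509 -text` style output."""
    return re.findall(r"CA Issuers\s*-\s*URI:(\S+)", x509_text)

# The Authority Information Access lines from the decoded certificate.
sample = """\
Authority Information Access:
    CA Issuers - URI:http://www.microsoft.com/pki/mscorp/msssa1(1).crt
    CA Issuers - URI:http://corppki/aia/msssa1(1).crt
"""

uris = ca_issuer_uris(sample)
for uri in uris:
    host = urlsplit(uri).hostname
    # A single-label host name is assumed here to be an intranet-only name.
    scope = "public-looking" if "." in host else "intranet-looking"
    print(f"{uri} ({scope} host: {host})")
```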

    BTW, this site really needs a preview mode. Or at least a larger textarea.

  18. Weird, Cesar. I’ve forwarded this to the MS.COM people, I don’t know if I’ll get an answer but…

    Thanks for all that work, I’m actually pretty darned impressed.

  19. Fascinating. I just finished a lengthy email thread with the PM in charge of Windows Security.

    According to him, IE doesn’t hard code the Windows CA in its cert validation logic. The only reason Windows trusts the cert is that it trusts the root CA (GE CyberTrust CA).

    According to him, there’s something wrong in OpenSSL/Mozilla’s certificate validation logic.

    Unfortunately I wouldn’t even begin to be able to figure out what is wrong with it.

  20. Cesar Eduardo Barros says:

    Tried again with GnuTLS (which is a completely different code base from OpenSSL and Mozilla PSM). Same results: it cannot find the issuer.

    I also can’t even begin to figure out what is wrong with it. Everything I try sees only one certificate, the one for the server itself.

    GnuTLS (-d 3):

    […]

    - Certificate type: X.509

    - Got a certificate list of 1 certificates.

    […]

    OpenSSL (-debug) shows a lengthy dump, in which I can clearly see only one certificate sent by the server (it shows the hex dump of all the packets, and the text inside the certificate is visible).

    The only possibility I can imagine (besides IE caching the certificate from somewhere else) is it using an extension to ask for more certificates than the server sends by default.

    I tried OpenSSL, GnuTLS and Mozilla on a site I knew I had seen a certificate chain on (www2.bancobrasil.com.br) and, as I expected, it returned 3 certificates.

    I think the way to debug it would be to turn off all encryption in IE (making it use the NULL cipher) and capture the traffic with Ethereal, to see if it’s really getting the intermediate certificates from the server. However, I don’t have a Windows machine at home, and in the places where I can use Windows I don’t have either Administrator or root.

  21. around n cold says:

    computer keeps cutting off

  22. vince says:

    The server is down, yo.

    Or at least, IE doesn’t want to let me connect to https://winqual.microsoft.com

  23. Of course you also lose out on Windows Error Reporting.

  24. It’s up now, don’t know what happened.

  25. Norman Diamond says:

    Mr. Osterman, your employer does not agree with your opinion about WER/OCA and improving Windows reliability. This posting will be long because your employer’s practices have compounded the situation. This example is not unique, but it is frequent and is 100% reproducible.

    A USB floppy drive can be connected to a computer running Windows 2000, XP, or 2003. In each case the driver’s manufacturer is Microsoft and Windows automatically installs the driver. In Windows 2000 or 2003, the user can click on the icon in the notification area for permission to disconnect hardware, obtain permission to remove the USB floppy, pull out the cable, and continue working. In Windows XP, following the same procedure, Windows XP blue-screens when the cable is pulled out. When Windows XP reboots, ordinarily it does not offer to send a WER. In my experience a change of procedure yielded a WER: one time, as an experiment, I did not click the icon for permission, but simply pulled out the cable. Windows XP blue-screened the same as always. But this time when it rebooted, it offered to send a WER, and I let it.

    During the sending of the WER, a message box said that I could inspect the status for the next 180 days, but did not say how, did not say what program to run or what web site or anything.

    Windows XP includes a Help and Support tool that contains some pretence of troubleshooters but does not seem to have any connection to WER. Though when the Help and Support tool crashed, it offered to send a WER about its own crash, and I let it. No connection between that and the USB floppy driver blue screening though.

    A few months later, by some lucky accident, I discovered a Microsoft web page relating to OCA, and signed on with my passport and found my WER. Microsoft had posted an answer: the problem was in a driver but they didn’t have enough information and I should contact the driver’s manufacturer.

    Well, how do I contact the driver’s manufacturer when the driver is Microsoft? I did a bunch of hunting, there are phone numbers to call if I want to pay 4,200 yen or US$35 in order to inform Microsoft that Microsoft was the maker of the broken driver, but no I don’t want to pay for that. Some more hunting and there was a link that is supposed to provide a way to submit a generic question (not specifically WER/OCA but I was going to specify it in my submission). But clicking that link went to an error page in Microsoft Japan’s site, saying that the link in Microsoft US’s site pointed to a non-existent page in Microsoft Japan’s site.

    OK, more searching, and I found a way to provide feedback about the broken link. Microsoft eventually replied to that, telling me to clear all my cookies and history and make a few other settings in Internet Explorer and try again. I tried again, but the broken link was still broken. I replied to Microsoft’s reply, suggesting ways that they might be able to reproduce the broken link problem that they impose on customers who either have Japanese Windows systems or who are located in Japan, but Microsoft did not reply further and I don’t think they’re going to try to fix their broken links. All of these communications have been with Microsoft US, because I speak English better than Japanese (though I can use Japanese Windows systems in Japan).

    Meanwhile the original problem with the broken Windows XP USB floppy driver isn’t even being addressed. How can victims obey Microsoft’s instructions to contact the driver’s manufacturer when the driver’s manufacturer is Microsoft?

    If you think you want WER and OCA to accomplish anything, then you’ve got a huge job ahead of you to educate the rest of your employer.

  26. When a Windows program crashes, Windows XP gives you the opportunity to send an error report to Microsoft. The process is called Online Crash Analysis. My advice: Do it. Here’s a perfect example of why it’s good for you and for your fellow PC users. For years, I’ve encountered a sporadic…

  27. Fox TV here in the US has a show called "House".  Valorie and I started watching it sometime towards…