DebugDiag – The Saga to RTM


Ok, I am going to try something a little more dicey with this blog entry, but since I am very PO’d about the subject right now, here goes the fireworks… 😉


Some Context, Please…


For the past several months, some of you have probably been hearing murmurs of a cool IIS utility called Debug Diagnostics Tool, DebugDiag for short. It is one of those classic internal skunk works tools, two years in the making, and built by engineers for engineers to help solve real-world user problems on IIS.


As many of you should know, IIS is basically a really thin but powerful asynchronous IO completion layer sitting on top of HTTP which can execute arbitrary DLLs or EXEs in response to HTTP requests. Ok, the request/response does not even have to be HTTP, and IIS even allows you to insert arbitrary ISAPI DLLs into the request processing pipeline to alter server behavior, but you get the general picture. :-)


Anyways, it is a common occurrence that when you integrate several applications (either custom written or purchased) hosted on top of IIS, SOME sort of conflict is going to happen. The end result is that web applications on your server will either run very slowly, hang/not-respond, crash, or even run for some period of time before crashing/hanging. Now what? How do you troubleshoot this?



I know, I know, most of you think that your developers are heavenly angels that always write perfectly performant and scalable code and do so without defects, while the Evil Empire cannot write any software and as a prime example, IIS just crashes/hangs on its own all the time… but this is my blog entry, so hear me out. :-)


When you see your server hanging, not-responding, or periodically crashing, that is precisely the time when you want to run DebugDiag – to either wait and catch the crash as it happens, or take an inventory of what is going on inside of IIS processes to figure out what is hanging or leaking memory and causing a crash later. This information is often a smoking gun pointing towards the code that is the actual problem… and PSS statistics show that over 90% of their “IIS is crashing/hanging” cases resolve to 3rd party code doing the crashing/hanging. Just something for you to think about…


Now, how does any of this have anything to do with me being PO’d and ranting about something related to DebugDiag? Alright… I’m getting to it already; I just need to set some context.


Chris and Kamal demo’d DebugDiag at TechEd 2005 and will continue to demo it at various MS events because we really want to drive home two points: that you can actually help yourself determine what code is the actual culprit, with full evidence, and engage with that code’s product support, and that most user problems on IIS are not really problems with IIS (thank you, thank you, I really appreciated the phone calls 😉 ).


You see, this is basically goodness all around…


So, Where is DebugDiag?


That’s the million dollar question (literally). Where is DebugDiag? I know that there are various URLs pointing to beta downloads of pre-release versions of DebugDiag, but when is it finally releasing?


Well, I can tell you that the DebugDiag MSI that I built has literally sat on my machine the past 45 days while we hopelessly wrestled with the behemoth that is Windows, trying to jump through all their hoops to release this tool to you… because Microsoft Download Center won’t let us put the MSI up there without a checkbox from someone in Windows, and unfortunately those folks apparently do not want us to ship tools like DebugDiag in between OS releases. We now have many more administrivia hoops to jump through, all of which raise our cost of providing such tools to you, and this is all supposedly in the name of “improving product quality” and “improving customer satisfaction” of Windows.



Yeah, let’s raise customer satisfaction by not giving users any troubleshooting tools so that they cannot figure out what went wrong, and hopefully they will blame someone else. Uh… yeah… right… uh huh, you gotcha. As Raymond Chen recently noted, who do people blame for a 3rd party driver BSOD, and where is their customer satisfaction?


For example, we have to pay another party some non-trivial amount of money just to compile our own source code using the exact same system that developers use, just with slightly different command-line parameters.


Huh? I’m sorry for being old fashioned, but when I pay a non-trivial amount of money, I expect a non-trivial amount of work to be performed – like writing the MSI installer for DebugDiag or generating the EULA – and not just for someone to sit and push some “build” button on the same compilation system that the same folks in Windows wrote.



I know, we should charge them a royalty of the same amount for using the system… or better yet, pay me, and I’ll gladly push the button myself. :-)


Anyways, I think the more fearsome net effect is that from now on, you will find far fewer Windows tools from Microsoft for Windows Vista and beyond – no more Power Toys, no more DebugDiag, no more Resource Kit/Support Tools – because these tools are all technically “illegal” unless they go through the same ship cycle as Windows and achieve “Windows Quality”.



Yeah, like silly administrivia will EVER increase product quality and customer satisfaction.


Oh, this is not the end of this saga by far… I am not sitting in a cubicle, and I am not writing TPS Reports… :-)


//David

Comments (10)

  1. Carlos Terrón says:

    It’s a great tool for troubleshooting IIS problems. I have used it to debug a lot of problems with IIS, and it usually points to the code that has caused the problem :). I hope that the RTM version is published soon.

  2. Nicholas says:

    On an upgraded installation of Windows 2003 Server, my IIS crashes on bootup over and over (I get a w3wp exception over and over, in a window on my desktop), until I view/visit a website hosted by the server. This makes no sense to me. I’ve tried removing PHP and Perl, and it doesn’t fix the problem. I’m not running any other "add-ons", so I’m totally confused.

    Maybe this new program will help me figure out the problem?

  3. David Wang says:

    Nicholas – yes, what you need to do is use a tool like DebugDiag or IIS State ( http://www.iisfaq.com/default.aspx?view=P197 ) to capture the crash in w3wp.exe and report it to the microsoft.public.inetserver.iis NNTP newsgroup on msnews.microsoft.com so that people can analyze it and recommend further actions.

    Crashes are pretty much instances of arbitrary faulty logic, so it is normal that they do not make any sense at all. The only way to make sense of a crash is to debug it to figure out the cause of the faulty logic. I do not think you have debugged it, so I expect that it should totally confuse you and make no sense at all.

    Thus, my best advice when you find a crash is to find a way to just debug the problem (or capture the problem so that someone else can debug it) and NOT try to GUESS and do anything drastic like uninstalling programs, resetting ACLs, changing IIS configuration, etc. – because those actions may be totally unrelated to the actual issue and may cause more damage in other areas because you are randomly misconfiguring things. That’s how some problems start.

    Why do I suggest this? Because frankly, a crash/hang/leak is the easiest thing for anyone to debug and figure out what is wrong – so do not make an easy task harder by introducing other system configuration changes. :-)

    //David

  4. Ed says:

    I don’t think there’s an x64 PowerToys yet.

    I guess not everyone there understands what your core business really is :(

  5. David Wang says:

    Well, 64bit is really not related to this.

    I think the issue with 64bit is that it is really a tiny minority right now, so it is just going to get ignored until it gets bigger. Only the hardcore, bleeding edge fanatics should be on 64bit right now, and this stuff is usual for them to deal with (having the latest and greatest usually means incompatibilities and missing functionality initially). If you cannot deal with it, then you belong in the mainstream and should wait to move to 64bit.

    Same sort of thing happened in the move from DOS to Windows 3.1, then again to Windows 95, then again to Windows NT, and now to x64 Windows XP Pro. Just wait a while for critical mass and then things will happen.

    But to expect all the bells and whistles of the past decade to magically appear on a completely new architecture and platform – totally unreasonable.

    At this point, I think that PowerToys is the least of your concerns. Where is the massive choice of devices with optimized drivers, native 64bit productivity software, security software, etc.? Yup, not even there yet, so you just have to deal with the bleeding edge…

    Enjoy :-)

    //David

  6. Alicia says:

    I have to say that this is THE tool for any IIS Admin’s bag of tricks. I’ve been playing with the beta for a couple of months and it’s already pointed me to quite a few problem areas. It’s nice being able to go back to our third party vendors and say "it’s not IIS, it’s your code. This is what you have to fix".

    The delay in releasing this is beyond belief.

    The first time I ran it I just sat there thinking "thankyou, thankyou, thankyou".

    so THANKYOU :)

  7. David Wang says:

    Alicia – Yeah, DebugDiag is a great tool… especially if you treat it as a source of followup information. Treat it as our attempt to give you the logic/reasoning of some of our best debugging advice so that you can take some sort of action.

    It finally gives you a view of the situation that has always been there but may have been unapproachable by most users. I mean, as Microsoft developers, we use the exact same tools that are freely available to you in the Microsoft Debugging Toolkit to do this troubleshooting. In fact, DebugDiag uses the debugging toolkit’s Debug Engine to do the grunt work. And we still have a ways to go to improve its auto diagnosis.

    Just realize that tools like DebugDiag make it easier for you, as a user, to see that most of the time when something "funny" is going on with your application on IIS, it is not IIS at fault. :-)

    For certain, we are working hard to push this tool to release soon.

    //David

  8. JeremyT says:

    Great work on DebugDiag! It is a fantastic tool and we hope you continue to pursue an official release.

    Don’t let MS strike back with their build button cos we want to see return of the Debug Diag

    Thanks

  9. TobiasN says:

    I agree totally with Jeremy, don’t let ’em get ya down Dave. Keep ya chin up and keep clicking Rebuild Solution!!! :) but seriously this IS a great tool.

    We use it constantly, its ability to quickly make sense of "IIS" style dumps is awesome.

    This tool single handedly fixes problems that nothing else seems to catch.

    Thank you, from Tobias

  10. David Wang says:

    JeremyT, TobiasN – thanks for the encouragement. DebugDiag is getting on track to be released; I have some details to follow up on.

    //David