I recently picked up John Robbins’ excellent Debugging Microsoft .NET 2.0 Applications and was flipping through it to discover just what new things I would learn and/or remind myself of. His brilliant description of setting up a local symbol server could not have been better timed: literally the next day I spoke with a customer for whom a symbol server was the perfect solution. Rather than having them wade through the documentation to understand exactly how to do this, I could just lift up the book (as I said, it was literally the next day, so it was still in my bag) and say “get this and go to this chapter.” That saved me a lot of time and solved a serious customer issue.
However, I did notice one thing that may cause some grief on Windows Vista due to Session 0 isolation. In Windows Vista, you see, only services run in Session 0. This provides defense in depth against elevation of privilege attacks (supported also by MIC, UIPI, etc.). You can no longer send a window message to a service from your interactive desktop. The boundary of a window message, you see, is the desktop. Each desktop lives in a window station, each window station lives in a session, and if you aren’t even in the same session, you certainly can’t send a window message.
How is this relevant? Well, here John talks about using gflags.exe to configure startup options for the service. He sets everything up correctly, and launches WinDBG when the service is launched. Except he is launching WinDBG directly. In the same desktop as the service. The desktop you are no longer looking at, because you are in Session 1 or higher, and not Session 0. An easy assumption to make, and one that service debuggers have been using successfully for quite a long time. Not a huge deal, but you can set up an even better solution using the implementation of remote debugging in our debugging tools.
Here is how you can set up and debug a service on Vista and have everything appear in your default desktop:
- Configure the service control manager to be a bit more patient. It likes to detect hung services and restart them, and if you are debugging you are intentionally hanging the service in the debugger and you don’t want it restarted. So you can add a new DWORD value called ServicesPipeTimeout to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control, and set its value to the number of milliseconds you want the timeout to be. Set this guy to 28,800,000 and you’ll get 8 hours of interruption-free debugging. (Obviously, you probably want to set this back or remove it when you are done, because it’s nice not to have hung services just stay hung for 8 hours.)
- Tell the SCM to start the debugger. But, instead of launching WinDBG, launch NTSD and set up a remote. In the registry again, configure the Debugger value under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\<name of the service exe>. On my machine, I have my debugging tools installed to c:\debuggers, so I provide the value of C:\Debuggers\ntsd.exe -server tcp:port=1234 -noio. (Any unused TCP port will do. Pick your favorite.)
- Reboot the machine, so the SCM will pick up the new information.
- Attach WinDBG to the remote you now have running using windbg.exe -remote tcp:server=localhost,port=1234
- Start debugging!
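The registry values from the steps above can be sketched in Python. This is only an illustration, not part of the original steps: the helper names are hypothetical, the port and install path are the examples from this post, and `apply_settings` assumes a 64-bit Python running elevated on Windows. The pure helpers just build the values, so you can check them before touching the registry.

```python
# Sketch of the two registry settings described above. Helper names are
# hypothetical; nothing is written unless apply_settings() is called.

CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control"
IFEO_KEY = (r"SOFTWARE\Microsoft\Windows NT\CurrentVersion"
            r"\Image File Execution Options")


def hours_to_ms(hours):
    """Convert hours to the milliseconds that ServicesPipeTimeout expects."""
    return int(hours * 60 * 60 * 1000)


def ntsd_server_command(ntsd_path, port):
    """Build the Debugger value that starts NTSD as a remote server."""
    return f"{ntsd_path} -server tcp:port={port} -noio"


def windbg_client_command(port, host="localhost"):
    """Build the command line that attaches WinDBG to that remote."""
    return f"windbg.exe -remote tcp:server={host},port={port}"


def apply_settings(service_exe, ntsd_path, port, hours):
    """Write both values under HKEY_LOCAL_MACHINE (Windows only, needs admin).

    Assumes a 64-bit Python so the Image File Execution Options key is not
    redirected to the 32-bit registry view.
    """
    import winreg  # imported lazily: the module only exists on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CONTROL_KEY, 0,
                        winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "ServicesPipeTimeout", 0,
                          winreg.REG_DWORD, hours_to_ms(hours))

    with winreg.CreateKey(winreg.HKEY_LOCAL_MACHINE,
                          IFEO_KEY + "\\" + service_exe) as key:
        winreg.SetValueEx(key, "Debugger", 0, winreg.REG_SZ,
                          ntsd_server_command(ntsd_path, port))


if __name__ == "__main__":
    # Print the values instead of writing them, so this is safe to run anywhere.
    print(hours_to_ms(8))
    print(ntsd_server_command(r"C:\Debuggers\ntsd.exe", 1234))
    print(windbg_client_command(1234))
```

You still need the reboot afterwards, and you will want to remove both values when you are done debugging.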
Of course, you can also use the gflags approach John recommends. The only difference here is setting up a remote rather than launching WinDBG directly. Now, don’t let this dissuade you from picking up John’s book (or attending one of his classes in person). Obviously none of you write applications with bugs, but I bet your co-workers do, and you wouldn’t want to miss out on all of the great tips in here!
I have heard some feedback from folks who ran into problems while using this technique to debug services. If the service you are attempting to debug is an auto-start service (I recommend you change this configuration for the service you are trying to debug), you will block any demand start services. They will queue up behind the auto-start services that the service control manager is attempting to start.
As an example, this will block UAC from working correctly, as the AppInfo service (which is the back end for elevation) is a demand start service.
Other demand start services will see similar behavior. Just wanted to make sure you know what to expect.
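One way to make that configuration change is with sc.exe, the service control tool that ships with Windows. Here is a minimal Python sketch of it; using sc.exe is my assumption about how you might do this (the Services snap-in works just as well), the function names are hypothetical, and `MyService` is a placeholder for your service's name.

```python
import subprocess


def demand_start_command(service_name):
    """Build the sc.exe arguments that switch a service to demand start."""
    # sc.exe takes "start=" and its value as two separate arguments.
    return ["sc", "config", service_name, "start=", "demand"]


def set_demand_start(service_name):
    """Run the command (Windows only, from an elevated prompt)."""
    subprocess.run(demand_start_command(service_name), check=True)


if __name__ == "__main__":
    # Show the command rather than running it; MyService is a placeholder.
    print(" ".join(demand_start_command("MyService")))
```

When you are finished debugging, the same command with `auto` in place of `demand` puts the service back to auto-start.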
(I also updated the timeout to actually be 8 hours instead of 0.8 hours – I neglected a 0.)