A Little Scripting Saved My Day (;-])

I have mentioned in previous blog posts that I tend to write many of my blog posts and walkthroughs for IIS.NET based on code that I’ve written for myself, and today’s blog post is the story of how one of my samples saved my rear over this past weekend.

One of the servers that I manage is used to host web sites for several friends of mine. (It’s their hobby to have a web site and it’s my hobby to host it for them.) Anyway, sometime on Sunday someone let me know that one of my sites didn’t seem to be behaving correctly, so I browsed it with Internet Explorer and saw that I was getting an HTTP 503 error. I’ve seen this error when an application pool goes offline for some reason, so I didn’t panic – yet – because I knew that the web site was in a separate application pool. With that in mind, I browsed to a web site that is in a different application pool. Same thing – HTTP 503 error. This was beginning to concern me.

I logged into the web server and ran iisreset from a command-line – this threw the following error - and now I was really starting to become agitated:

I knew that the cause of the error should be in the Windows Event Viewer, so I opened the System log in Event Viewer and saw the following error:

Log Name: System
Source: Microsoft-Windows-WAS
Date: 7/26/2009 10:59:52 AM
Event ID: 5172
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: MYSERVER
Description: The Windows Process Activation Service encountered an error trying to read configuration data from file '\\?\C:\Windows\system32\inetsrv\config\applicationHost.config', line number '308'. The error message is: 'Configuration file is not well-formed XML'. The data field contains the error number.
Event Xml:

Now that I was armed with the file name and line number of the failure in my configuration settings, I was able to go straight to the source of the problem. (I love IIS 7's descriptive error messages - don't you?) Once I opened the file and jumped to the correct location, I saw several lines of unintelligible garbage. For reasons that are still unknown to me – my applicationHost.config file had become corrupted and IIS was dead in the water until I fixed the problem. I looked through the file and removed most of the garbage and saved the edited file to IIS – this got the web sites working, but only partially. Some necessary settings had obviously been removed while I was clearing all of out the unintelligible garbage, and it might take me a long time to discover what those settings were.

The next thing that I did was to take a look in my two readily-accessible backup drives; I have two external hard drives that keep a backup of the web server - one hard drive is directly plugged into the web server via a USB cable, and the other hard drive is plugged into a physically separate server that rotates drives with off-site storage on a monthly basis. The problem is, my weekly backups had just run, so the copy in each backup location had been overwritten with the corrupted version. (I’m going to have to rethink my backup strategy after this – but that’s another story.) The backup copy in my off-site storage location should be intact, but that copy would be a few weeks old so I would be missing some settings, and I would have to drive an hour or so round-trip in order to pick up the drive. This wasn’t an ideal solution – but it was definitely a feasible strategy.

It was at this point that I remembered that I had written following blog post some time ago:

Automating IIS 7 Backups
https://blogs.msdn.com/robert_mcmurray/archive/2008/03/08/automating-iis-7-backups.aspx

I wrote the script in that blog post for the server that I was currently managing, and because of this preventative measure I had dozens of backups going back several weeks to choose from. So I was able to quickly find a copy with no corruption and I restored that copy to my IIS config directory. At this point all of my web sites came online with all of their functionality. Having fixed the major issues, I used WinDiff to verify any settings that might have been changed between the restored copy and the corrupted copy.

So in conclusion, this story had a happy ending, and it left me with a few lessons learned:

  • You can never have too many backups
  • I need to rethink how I roll out my backup strategy with regard to using external hard drives
  • Writing cool scripts to automate your backups can save your rear end

That sums it up for today’s post. ;-]