Plan for Diagnostics in Cloud Computing From the Git-Go

“Git-Go” is something we say in the South that means “right at the start”. I’ve seen several applications for on-premises systems that don’t include much in the way of diagnostics - the developers rely on a debugger, the event logs on the server and client workstation, and, most of all, the ability to watch the system from end to end.

This approach is a mistake even for an on-premises system, and it’s definitely a problem for a distributed architecture. You simply do not own all of the components from end to end in a cloud environment, nor are you always able to attach a debugger or other remote monitoring tools to the various areas within the code path. So you need to make sure, from the very outset of your design, that you build in diagnostics. My personal preference is to build the system so that a control file turns deeper information gathering on or off, with a minimal level as the default.
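As a rough sketch of that control-file approach, the snippet below reads a desired logging level from a plain text file (the file name `diag_level.cfg` and the function names are my own illustrations, not anything from a specific platform) and falls back to a minimal level when the file is absent. Operators can then change the level without redeploying the application:

```python
# Sketch of a control-file toggle for diagnostics. The file name and
# function names here are hypothetical examples, not a real platform API.
import logging
import os

CONTROL_FILE = "diag_level.cfg"  # hypothetical control file edited by operators

def apply_diagnostic_level(logger: logging.Logger) -> None:
    """Read the control file and set the logger's level accordingly.

    When the file is missing or unreadable, default to WARNING - the
    minimal level of information gathering.
    """
    level_name = "WARNING"
    if os.path.exists(CONTROL_FILE):
        with open(CONTROL_FILE) as f:
            level_name = f.read().strip().upper() or level_name
    # getattr falls back to WARNING if the file contains an unknown level
    logger.setLevel(getattr(logging, level_name, logging.WARNING))

log = logging.getLogger("myapp")
logging.basicConfig()
apply_diagnostic_level(log)
log.debug("only emitted when the control file says DEBUG")
```

In a cloud deployment the “control file” might instead be a configuration blob or a service setting, but the principle is the same: the knob lives outside the code.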

When I do that, I define three levels of logging: high, medium, and low. I normally use the deepest level of information during the testing and acceptance phases of the deployment, then switch to medium, and finally to the lowest level of information gathering. In my design I also often set an error condition to begin gathering the deeper information along with the exception, where possible.
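That error-triggered escalation can be sketched as a small wrapper: when a wrapped operation throws, the wrapper raises the logger to its deepest level and records the full exception before re-raising. The names (`with_deep_capture`, `myapp`) are illustrative assumptions, not part of the original post:

```python
# Sketch of escalating diagnostics on error: when an exception occurs,
# switch the logger to its deepest level and capture the traceback.
# Function and logger names are hypothetical examples.
import logging

log = logging.getLogger("myapp")

def with_deep_capture(func, *args, **kwargs):
    """Run func; on failure, begin deeper information gathering."""
    try:
        return func(*args, **kwargs)
    except Exception:
        log.setLevel(logging.DEBUG)  # begin gathering the deeper information
        # log.exception records the message plus the full traceback
        log.exception("error encountered; diagnostics escalated to DEBUG")
        raise  # let the caller's error handling proceed as usual
```

Whether you leave the level raised afterward or restore it once the incident is understood is a policy decision; the point is that the system starts collecting more detail the moment something goes wrong.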

There are decisions you need to make about where to store the diagnostics (many operations in the cloud cost money), how often you collect them, and so on. You can get a quick overview of the diagnostics that come with Windows Azure here; this is where you should start. More detail on that:
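Because every write to cloud storage can cost money, one common answer to the “how often you collect them” question is to buffer diagnostic records in memory and flush them to storage on a schedule, so each batch is a single storage operation. A minimal sketch, with an assumed flush interval and a caller-supplied sink (nothing here is specific to Windows Azure):

```python
# Sketch of batching diagnostic records to reduce per-write storage cost.
# The sink is any callable that persists a batch (e.g. a blob upload);
# the class and parameter names are hypothetical examples.
import time

class DiagnosticBuffer:
    def __init__(self, flush_interval_seconds: float, sink):
        self.interval = flush_interval_seconds
        self.sink = sink              # callable that persists one batch
        self.records = []
        self.last_flush = time.monotonic()

    def add(self, record: str) -> None:
        """Queue a record; flush when the interval has elapsed."""
        self.records.append(record)
        if time.monotonic() - self.last_flush >= self.interval:
            self.flush()

    def flush(self) -> None:
        """Write all pending records as one storage operation."""
        if self.records:
            self.sink(list(self.records))
            self.records.clear()
        self.last_flush = time.monotonic()
```

The trade-off is the usual one: a longer interval means fewer (cheaper) storage operations but more diagnostic data lost if the instance dies before a flush.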

My friend Dave Pallman has a great tool he’s released for free:

If the issue is in storage apps:

If you have System Center, this is the quickest and easiest way to implement the monitoring – really handy:
