Where is thy instance monitored, and how does that affect dependency monitor state?

11/14/2008

This area is rather complex, and multiple factors weigh in when the configuration service makes the final decision, but I will try to provide some simplified rules of thumb to help you make an educated guess while troubleshooting state issues. I will start with the dependency monitor.

Dependency monitor

A dependency monitor rolls up the health of other objects (contributors) across a relationship and changes the state of the target instance according to the chosen algorithm. (An MSDN article describes how to create this monitor using the Authoring section of the Operations console.)

Roll up is an important term here! You could visualize our monitoring topology as a kind of root system, with the RMS at the top, followed by a layer of management servers, and with agents at the bottom. That picture suggests that health state only rolls up from agent to server. That observation is correct, BUT one must add that rollup also works within a single health service.

When rollup works

In other words, a dependency monitor is able to roll up state when the contributor resides on (is monitored by) an agent while the target instance is monitored by a server. The monitor is also able to roll up state when both contributors and target are monitored by the same health service (regardless of whether it is an agent or a server).

When rollup doesn’t work

The statement above also means that rollup won’t work when the target is monitored by an agent while the contributor resides on a sibling agent or on a server! (Just imagine the “root system” again, please.)

NOTE: This statement holds at least for all releases of Operations Manager to date, including the upcoming OpsMgr 2007 R2 release.
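To make the rule of thumb concrete, here is a minimal Python sketch of the "root system" picture. It is a toy model, not anything Operations Manager exposes: the `HealthService` class, the `can_roll_up` function and the server and agent names are all hypothetical, and the model assumes state can flow only to the same health service or to one above it in the tree.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class HealthService:
    """A node in the simplified 'root system' (hypothetical model)."""
    name: str
    parent: Optional["HealthService"] = None  # None marks the top of the tree (the RMS)


def ancestors(hs: HealthService) -> Iterator[HealthService]:
    """Yield the health services above `hs`, walking toward the RMS."""
    while hs.parent is not None:
        hs = hs.parent
        yield hs


def can_roll_up(contributor: HealthService, target: HealthService) -> bool:
    """True when the target is the same health service as the contributor or sits above it."""
    return contributor == target or target in ancestors(contributor)


# Hypothetical topology: one RMS, one management server, two sibling agents.
rms = HealthService("RMS")
ms = HealthService("MS01", parent=rms)
agent_a = HealthService("AgentA", parent=ms)
agent_b = HealthService("AgentB", parent=ms)

print(can_roll_up(agent_a, ms))       # True: agent to server
print(can_roll_up(agent_a, agent_a))  # True: same health service
print(can_roll_up(agent_b, agent_a))  # False: sibling agent
print(can_roll_up(rms, agent_a))      # False: server down to agent
```

The sketch deliberately ignores everything else the configuration service weighs; it only encodes the direction of rollup described above.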

Where is this instance monitored?

The $1,000,000 question right there! If you are an operator, you should not need to worry. Unfortunately, once you become an MP author, answering this question may well explain some of your issues. As I tried to hint, many factors affect where an instance is ultimately monitored, but a few hints support an educated guess.

1. Hosting: By its nature, hosting is a relationship that binds the lifetime of the hosted instance to the lifetime of its host. For that reason, the health service that monitors the host instance will also monitor the hosted instance. As a real-life example, everything hosted by a computer instance will be monitored by the health service running on that computer.

2. Group and singleton: instances of these managed entity types are almost always monitored by the Root Management Server. This is true for all extensions of the original base class. A real-life example is the “All managed computers group”, which is monitored by the RMS.

3. Logical entity and some others: instances of these types (as well as of all types extending them, like System.ApplicationComponent and Microsoft.Windows.ApplicationComponent) are possibly monitored by the RMS. Instances of those ME types are “tricky” as long as they do not carry the “Hosted=’true’” attribute. What I mean is that if an agent discovers an instance of such an ME type, proxying must be enabled for the discovery to succeed and insert the instance into the operational DB (not a well-known fact, but a fact it is!). If proxying is enabled and no additional work is done, the instance is always monitored by the RMS.

NOTE: I will discuss how to make the discovering health service the one that performs the monitoring in a future article!
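The three hints above can be written down as a small Python sketch. This is only an educated-guess helper under the simplifications of this post: the `Instance` class, the `kind` labels and the function name are hypothetical, and the "almost always" and "possibly" cases are treated as if they always resolved to the RMS.

```python
from dataclasses import dataclass
from typing import Optional

RMS = "RMS"


@dataclass
class Instance:
    """A monitored instance in the toy model (all fields are illustrative)."""
    name: str
    kind: str                             # "computer", "hosted", "group", "singleton" or "unhosted"
    host: Optional["Instance"] = None     # set for hosted instances
    health_service: Optional[str] = None  # known placement, e.g. the agent on a computer


def guess_monitoring_location(instance: Instance) -> Optional[str]:
    """Rule-of-thumb guess at which health service monitors `instance`."""
    if instance.health_service is not None:
        return instance.health_service
    if instance.kind == "hosted" and instance.host is not None:
        # Hint 1: a hosted instance follows whatever monitors its host.
        return guess_monitoring_location(instance.host)
    if instance.kind in ("group", "singleton", "unhosted"):
        # Hints 2 and 3: groups, singletons and unhosted logical entities land on the RMS.
        return RMS
    return None  # not enough information to guess


computer = Instance("srv01", "computer", health_service="agent on srv01")
local_app = Instance("custom local application", "hosted", host=computer)
group = Instance("All managed computers group", "group")
watcher = Instance("health service watcher for srv01", "hosted", host=group)
component = Instance("custom application component", "unhosted")

for item in (local_app, group, watcher, component):
    print(item.name, "->", guess_monitoring_location(item))
```

Note how the watcher ends up on the RMS purely because its host is a group; the same chain of reasoning applies to any hosted type.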

Real life example of how to use this knowledge

Kevin Holman was asked by a customer to provide a state view that displays the state of the “Computer Down” monitor. (This was not the first time I saw a similar request, and in the vast majority of cases the same solution was tried and failed. It is the obvious one, especially when you are not aware of the dependency rollup limitation. The original plan probably was that you as an author should not have to be aware of that limitation, but it is here now and needs to be considered.)

Solution which failed

· Create a custom class derived from local application (Microsoft.Windows.LocalApplication) and discover its instance based on the registry while targeting the computer instance (Microsoft.Windows.Computer).

· Create a containment relationship between this class and health service watcher (the requested “Computer Unreachable” monitor is part of the HSW health state), then create a dependency monitor to roll up health, so that the health of the custom instance changes with the health of the requested monitor.

· Create a state view to visualize that particular monitor’s state (through the health of the custom class instance).

So why would this solution fail? Here is the recap.

· The custom class extends an ME type that is already hosted. It is hosted by the computer (that is why discovery required targeting the computer instance: to provide the PrincipalName value for one of the keys). Per the hosting rule in this post, the instance of the custom class will be monitored by the health service running on that computer.

· The contributor end of the relationship sits at the RMS. Instances of health service watcher are monitored by the RMS: the ME type used for HSW is hosted by a group and, as I said above, a group instance is almost always monitored by the RMS.

· The place where the contributor is monitored breaks the dependency monitor. The contributor is on the RMS while the target is on an agent; remember that state rolls up, not down!

· MP verification is currently UNABLE TO WARN you about this problem. Its implementation has no knowledge of instance space; it only understands the type space of the installation.
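The recap can be replayed as a small instance-space check in Python. This is only an illustration of the reasoning, not a feature of MP verification: the function names, the two-layer `PARENT` map and the instance labels are hypothetical, and the placements are typed in by hand from the recap above.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical, simplified topology: each health service maps to the one above it.
PARENT: Dict[str, Optional[str]] = {"agent on srv01": "RMS", "RMS": None}


def is_same_or_above(target_hs: str, contributor_hs: str) -> bool:
    """True when the target's health service is the contributor's or one of its ancestors."""
    hs: Optional[str] = contributor_hs
    while hs is not None:
        if hs == target_hs:
            return True
        hs = PARENT.get(hs)
    return False


def check_dependency_monitors(
    monitors: List[Tuple[str, str]], placement: Dict[str, str]
) -> List[str]:
    """Return one warning per (target, contributor) pair whose state would have to flow down or sideways."""
    warnings = []
    for target, contributor in monitors:
        t, c = placement[target], placement[contributor]
        if not is_same_or_above(t, c):
            warnings.append(f"{contributor} on {c} cannot roll up to {target} on {t}")
    return warnings


monitors = [("custom class instance", "health service watcher")]

# Placement from the failed design: target on the agent, contributor on the RMS.
failed = check_dependency_monitors(
    monitors,
    {"custom class instance": "agent on srv01", "health service watcher": "RMS"},
)

# Same monitor with both ends on one health service, which the rule of thumb allows.
same_service = check_dependency_monitors(
    monitors,
    {"custom class instance": "RMS", "health service watcher": "RMS"},
)

print(failed)
print(same_service)
```

The point of the sketch is that the check needs the placement map, which is instance-space knowledge; the type definitions alone do not contain it.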

Here are snapshots I received from Kevin. They display the problem as observed in health explorer.

[Screenshot: computer not reachable]

[Screenshot: monitor doesn't roll up]

How should this be solved?

Evaluating why the original proposal failed, we can see that instances of health service watcher are monitored by the RMS. This article also states that a dependency monitor can roll up health state when both contributor and target are monitored by the same health service. So we can in fact reuse the idea from the original approach. The only change required is to ensure that instances of the custom class are monitored by the RMS as well. There are a couple of ways to do that; I chose the following:

Class: extends ApplicationComponent. Such a managed entity is not hosted, so it is easily discovered by the RMS (while targeting the Health Service Watcher instance).

</ClassType>

Relationship: a containment relationship with HSW as the target. This is the cornerstone of the dependency monitor.

<Source>Microsoft.SystemCenter.Community.InstanceSpace.ComputerUnreachable.Holder</Source>

<Target>SCLibrary!Microsoft.SystemCenter.HealthServiceWatcher</Target>

</RelationshipType>

Dependency monitor: uses the requested monitor as the contributor. In this case it is the “Computer Not Reachable” monitor belonging to HealthServiceWatcher. This monitor also reports state Error when the contributing instance is in Maintenance Mode. There is no way to put the custom class into MM when the health service watcher enters maintenance mode, so leveraging this feature of the dependency monitor helps (at least in my opinion).

<Category>StateCollection</Category>

<Algorithm>WorstOf</Algorithm>

<MemberInMaintenance>Error</MemberInMaintenance>

</DependencyMonitor>
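As a rough illustration of what the WorstOf algorithm combined with a MemberInMaintenance value of Error means for the target, here is a small Python sketch. It models only the behavior described above; the `worst_of` function, the severity ordering and the tuple layout are assumptions made for the example, not the product's implementation.

```python
from typing import List, Optional, Tuple

# Assumed severity ordering for the three health states.
SEVERITY = {"Success": 0, "Warning": 1, "Error": 2}


def worst_of(
    contributors: List[Tuple[str, bool]], member_in_maintenance: str = "Error"
) -> Optional[str]:
    """Roll up contributor states with a worst-of policy.

    `contributors` is a list of (state, in_maintenance_mode) pairs. A contributor
    in maintenance mode counts as `member_in_maintenance` instead of its own state.
    Returns None when there are no contributors.
    """
    effective = [
        member_in_maintenance if in_mm else state for state, in_mm in contributors
    ]
    if not effective:
        return None
    return max(effective, key=lambda state: SEVERITY[state])


print(worst_of([("Success", False)]))                      # Success
print(worst_of([("Success", False), ("Warning", False)]))  # Warning
print(worst_of([("Success", True)]))                       # Error
```

With a single contributor per target, as in this solution, the target simply mirrors the watcher's monitor, except that a watcher in maintenance mode shows up as Error.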

Once the attached MP is imported, you can observe that it picks up the state of the “Computer Down” monitor and presents it in a separate view. Below is the state view with health explorer.

[Screenshot: initial rollup from HSW]

[Screenshot: rollup of computer not reachable]

WARNING: this solution may not scale well in a big environment!

The scalability issue stems from the fact that there are as many health service watcher instances as there are health services in the management group (except the health service for the RMS). The provided solution uses the HSW instance as the target for discovery of the custom class, which ultimately means that many workflows are loaded by the RMS. There is additional pressure from the dependency monitors (each dependency monitor contributes another, hidden, workflow).

PLEASE consider this a sample; a more complex and advanced solution may be needed for a real-life situation. (Please contact me and I will try to help if my bandwidth allows.)

DISCLAIMER:

Please evaluate in your test environment first! As expected, this solution is provided AS-IS, with no warranties and confers no rights. Use is subject to the terms specified at Microsoft.

Microsoft.SystemCenter.Community.InstanceSpace.xml