How Stuff Works

Article
10/26/2006

In reply to a comment I received, I wanted to put together a post about how things work in general, mostly with respect to the SDM as implemented in SCOM 2007.

For those of you familiar with MOM 2005, you'll know that things were very computer centric, which might make the 2007 concepts a bit foreign to you. In trying to bring SCOM closer to a service-oriented management product, the idea that the computer is central has somewhat been removed. I say somewhat, because much of the rest of the world of management has not moved entirely off the computer being of central importance, so we still have to honor that paradigm to make integration with other products easier. One example of this compromise is that on an Alert object you will see the properties NetbiosComputerName, NetbiosDomainName and PrincipalName; these are present for easier integration with ticketing systems that are still largely computer centric.

With this shift in thinking, we are able to model an enterprise to a much finer level of detail on any individual computer, as well as move above a single computer and aggregate services across machines in a single service-based representation. So what does this mean practically? For one, it means potentially a huge proliferation of discovered objects. Instead of just having a computer with a few roles discovered, we can go so far as to model each processor, each hard drive, every process, etc on every box as its own object. Now, instead of seeing a computer, you can actually see a more accurate representation of the things that are important to you in your enterprise. More importantly, however, is that this allows you to define each objects health separately from the health of the objects that depend on it and define how it's health affects the health of those things. For instance, just because one of the hard drives on a particular machine is full, doesn't necessarily mean the health of the machine is bad. Or maybe it it, and that can be modeled as well.

As was the case with MOM 2005, at its core SCOM 2007 is driven by management packs. Managements packs define the class hierarchy that enable this form of deep discovery and they define all the tools necessary to manage these objects.

Let's begin with discussing classes and the class hierarchy. We define a base class that all classes must derive from called System.Entity. Every class ever shipped in any management pack will derive at its core from this class. This class is abstract, meaning that there can never be an object discovered that is just a System.Entity and nothing else. We ship with an extensive abstract class hierarchy that we are still working on tweaking, but should allow for users to plug in their classes somewhere in the hierarchy that makes sense for them. You will be able to define your own abstract hierarchies as well as your own concrete classes. Concrete classes (i.e. non-abstract) are discoverable. Once you define a class as concrete, it's key properties (those that define the identity of an instance of that class) cannot be changed. For instance, if you want to specialize System.Computer, you can't define a new key property on it that would change it's identity, although you are free to add as many non-key properties as you like. In our system, the values of the key properties for a discovered instance are what uniquely identify that instance. In fact, the unique identifier (Guid) that is used internally to identity these instances is actually built off the key property values. Extending this computer example, if you do extend computer yourself, and someone else does as well, it is possible for a computer to be both of your classes at the same time, however, it will be identified as the same instance. You could imagine that Dell ships a management pack and some of your computers as both Dell Computers and Windows Computers, both of which would derive from the non-abstract Computer class that defines it's key properties. Thus an instance that is discovered as both would always we both in every context. The reason that the class of a discovered instance is important is that targeting of all workflows relies on this, but I'll talk more about this a bit later.

In addition to discovering individual instances, the system also allows you to define relationships between instances. The relationships also have a class hierarchy that works very similarly to the class hierarchy for individual instances. The most base class here is call System.Reference and all relationship types (classes) will derive from this. There are two abstract relationship types that are of great importance that I wanted to discuss here. First, there is System.Containment which derives directly from System.Reference. While reference defines a loose coupling of instances, Containment implies that the source object of the relationship in some way contains the target object. This is very important internally to use as containment is used to allow things to flow across a hierarchy. For instance, in the UI alert views that might look at a particular set of instances (say Computers) will also automatically include alerts for anything that is contained on that computer. So a computers alert view would show alerts for hard drives on that computer as well. This is something that is an option when making the direct SDK calls, but in the UI it is the method that is chosen. Even more important than this is the fact that security scopes flow across containment relationships. If a user is given access to a particular group, they are given access to everything contained in that group by the System.Containment relationship type (or any type that derives from it). An even more specialized version of containment that is also very important is System.Hosting. This indicates a hosting relationships exists between the source and target where the target's identity is dependant upon its host. For instance, a SQL Server is hosted by a computer, since it would not exist outside the context of that computer. Going back to what I said in the previous paragraph about using key properties of an instance to calculate the unique id of it, we actually also use the key properties of all its hosts to identify it as well. Taking the SQL Server as an example, I can have hundreds of Instance1 SQL Servers running in my enterprise. Are they all the same? Of course not, they are different based on the computer they are on. That's how we differentiate them. Even in the SDK, when you get MonitoringObjects back, the property values that are populated include not only the immediate properties of the instance, but also the key properties of the host(s).

All the examples I've mentioned thus far talk about drilling down on an individual computer, but we can also build up. I can define a service as being dependant on many components that span across physical boundaries. I can use the class hierarchy to create these new service types and extend the relationship type hierarchy to define the relationships between my service and its components.

Before we talk about how all these instances get discovered, let's talk about why being an instance of a particular type is important. SCOM actually uses the class information about a discovered instance to determine what should run on its behalf. Management pack objects, such as rules, tasks and monitors, are all authored with a specific class as their target. What this actually means is that the management pack wants the specified workflow to run for every instance of that class that is discovered. If I have a rule that monitors the transaction log for SQL, I want that rule deployed and executed on every machine that has a SQL server discovered. What our configuration service does is determine what rules need to be deployed where, based on the discovered instance space and on where those particular instance are managed (and really, if they are managed, although that is a discussion for another post). So another important attribute about instances, is where they are being managed; essentially every discovered instance is managed by some agent in your enterprise, and it's usually the agent on the machine where the instance was discovered. When the agent receives configuration from the configuration service, it instantiates all the workflows necessary for all the instances that it manages. This is when all the discovery rules, rules and monitors will start running. Tasks, Diagnostics and Recoveries are a bit different in that they run on demand, but when they are triggered, they will actually flow to the agent that manages the instance that workflow was launched against. Class targeting is important here as well, as Tasks, Diagnostics and Recoveries can only execute against instances of the class they are targeted to. It wouldn't make sense, for instance, to launch a "Restart Service" task against a hard drive.

Discovering instance and relationships is interesting. SCOM uses a "waterfall" approach to discovery. I will use SQL to illustrate. We'll begin by assuming we have a computer discovered. We create a discovery that discovers SQL servers and we'll target it the Computer. The system will then run this discovery rule on every Computer it knows about. When it runs on a computer that in fact has SQL installed, it will publish discovery data to our server and a new instance of SQL Server will be instantiated. Next, we have a rule targeted to SQL Server that discovers individual databases on the server, Once the configuration service gets notified of the new SQL instance, it will recalculate configuration and publish new configuration to the machine with the SQL server that includes this new discovery rule. This rule will then run and publish discovery information for the databases on the server. This allows deep discovery to occur without any user intervention, except for actually starting the waterfall. The first computer needs to be discovered, either by the discovery wizard, manual agent installation or programmatically via the SDK. For the latter, we support programmatic discovery of instances using the MCF portion of the SDK. Each connector is considered a discovery source and is able to submit discovery data on its behalf. When the connector goes away, all instances that were discovered solely by that connector also go away.

The last thing I wanted to talk about were Monitors. Monitors define the state of an instance. Monitors also come in a hierarchy to help better model the state of an instance. The base of the hierarchy is called System.Health.EntityState and it represents THE state of an instance. Whenever you see state in the UI, it is the state of this particular monitor for that instance, unless stated otherwise. This particular monitor is an AggregateMonitor that rolls up state for its child monitors. The roll up semantics for aggregates are BestOf, WorstOf and Percentage. At the end of a monitor hierarchy chain must exist either a UnitMonitor or a DependencyMonitor. A UnitMonitor defines some single state aspect of a particular instance. For example, it may be monitoring the value of a particular performance counter. The importance of this particular monitor to the overall state of the instance is expressed by the monitor hierarchy it is a part of. Dependency monitors allow the state of one instance to depend on the state of another. Essentially, a dependency monitor allows you to define the relationship that is important to the state of this instance and the particular monitor of the target instance of this relationship that should be considered. One cool thing about monitors, is that their definition is inherited based on the class hierarchy. So System.Health.EntityState is actually targeted to System.Entity and thus all instance automatically inherit this monitor and can roll up state to it. What this means practically is that if you want to specialize a class into a class of your own, you don't need to redefine the entire health model, you can simply augment it by deriving your class from the class you which to extend and adding your own monitors to your class. You can even simply add monitors to the existing health model by targeting them anywhere in the class hierarchy that makes sense for you particular monitor.

As always, let me know if there are any questions, anything you would like me to elaborate on or any ideas for future posts.

How Stuff Works

Additional resources