Sample Alert and State Change Insertion

Update: I have updated the management pack to work with the final RTM bits.

First, a disclaimer. Not everything I write here works on the Beta 2 bits that are currently out. I had to fix a few bugs in order to get all these samples working, so only the most recent builds will fully support the sample management pack. I will, however, provide at the end of the post a list of the things that don't work =).

I've attached to the post a sample management pack that should import successfully on Beta 2; please let me know if it doesn't and what errors you get. This management pack is for sample purposes only. We will be shipping, either as part of the product or as a web download, a sealed SDK/MCF management pack that will help with programmatic alert and state change insertion and that will support all the things I am demonstrating here.

What I would like to do is go through this management pack and talk about how each component works, and then include some sample code at the end that shows how to drive the management pack from SDK code.

The first thing you will notice in the management pack is a ConditionDetectionModuleType named System.Connectors.GenericAlertMapper. This module type takes any data type as input and outputs the proper data type for alert insertion into the database (System.Health.AlertUpdateData). It is marked as internal, meaning it cannot be referenced outside of this management pack, and simply provides some glue to make the whole process work.

Next, we have the System.Connectors.PublishAlert WriteActionModuleType, which takes the data produced by the aforementioned mapper and publishes it to the database. Regardless of where other parts of a workflow are running, this module type must run on a machine that has database access, under an account that has database access. This is controlled by targeting, as described in the previous post. This module type is also internal.

Now we have our first two public WriteActionModuleTypes, System.Connectors.GenerateAlertFromSdkEvent and System.Connectors.GenerateAlertFromSdkPerformanceData. These combine the aforementioned module types into a more usable composite. They take as input System.Event.LinkedData and System.Performance.LinkedData, respectively. Note that these are the two data types produced by the SDK/MCF operational data insertion API. Both module types have the same configuration, allowing you to specify the various properties of an alert.

The last of the type definitions is a simple UnitMonitorType, System.Connectors.TwoStateMonitorType. This monitor type represents two states, Red and Green, which can be driven by events. You'll notice that it defines two operational state types, RedEvent and GreenEvent, which correspond to the two expression filter definitions that match on $Config/RedEventId$ and $Config/GreenEventId$ to drive state. Essentially, this monitor type defines that if a "Red" event comes in, the state of the monitor is red, and if a "Green" event comes in, the state is green. It also allows you to configure the event id for each of these events.
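
As a rough illustration of the two-state idea outside of Operations Manager, here is a minimal C# sketch. It is not the module implementation: the TwoStateMonitor class, the MonitorTypeState enum, and the Unknown starting value are all hypothetical names made up for this example. It also assumes that an event matching neither configured id leaves the state unchanged.

```csharp
using System;

// Hypothetical illustration only: none of these types exist in the SDK.
enum MonitorTypeState { Unknown, GreenEvent, RedEvent }

class TwoStateMonitor
{
    private readonly int redEventId;
    private readonly int greenEventId;

    public TwoStateMonitor(int redEventId, int greenEventId)
    {
        this.redEventId = redEventId;
        this.greenEventId = greenEventId;
        State = MonitorTypeState.Unknown;
    }

    // Stays Unknown until the first red or green event arrives.
    public MonitorTypeState State { get; private set; }

    // Stands in for the two expression filters: each matches one event id.
    public void OnEvent(int eventId)
    {
        if (eventId == redEventId)
        {
            State = MonitorTypeState.RedEvent;
        }
        else if (eventId == greenEventId)
        {
            State = MonitorTypeState.GreenEvent;
        }
        // Assumption: events matching neither filter leave the state unchanged.
    }
}

static class TwoStateMonitorDemo
{
    static void Main()
    {
        TwoStateMonitor monitor = new TwoStateMonitor(1, 2);

        monitor.OnEvent(1);
        Console.WriteLine(monitor.State);   // RedEvent

        monitor.OnEvent(99);
        Console.WriteLine(monitor.State);   // RedEvent (unmatched id ignored)

        monitor.OnEvent(2);
        Console.WriteLine(monitor.State);   // GreenEvent
    }
}
```

The two constructor arguments play the role of the RedEventId and GreenEventId configuration values.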

Now we move to the part of the management pack where we use all these defined module types.

First, let's look at System.Connectors.Test.AlertOnThreshold and System.Connectors.Test.AlertOnEvent. Both of these rules use the generic performance data and event data sources mentioned in an earlier post. These data sources produce performance data and events for any monitoring object they were inserted against, and as such, you'll notice both rules are targeted to Microsoft.SystemCenter.RootManagementServer; only a single instance of each rule will be running. The nice thing about this is that you can generate alerts for thousands of different instances with a single workflow, assuming your criteria for the alert are the same. Which brings me to the second part of the rule, the expression filter. Each rule has its own expression filter module that matches the incoming data against a particular threshold or event number. Lastly, each rule includes the appropriate write action to actually generate the alert, using parameter replacement to populate the name and description of the alert.

The other two rules, System.Connectors.Test.AlertOnThresholdForComputer and System.Connectors.Test.AlertOnEventForComputer, are similar, only they use the targeted SDK data source modules and as such are targeted at System.Computer. It is important to note that targeting a computer will only work on computers that have database access and where the workflow runs under an account that has database access. I used this as an example because it didn't require me to discover any new objects; plus, I had a single machine install where the only System.Computer was the root management server. The key difference between these two rules and the previous rules is that there will be a new instance of this rule running for every System.Computer object. So you can imagine that if you created a rule like this and targeted it to a custom type for which you had discovered hundreds or thousands of instances, you would run into performance issues. From a pure modeling perspective, this is the "correct" way to do it, since logically you would like to target your workflows to your type. Practically, however, it's better to use the previous types of rules to ensure better performance.
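
The difference in workflow count can be sketched without the SDK. The following is a toy model only, not how the health service is implemented: DataItem, ThresholdFilter, and the threshold of 30 are hypothetical and chosen purely for illustration. It assumes the same criteria apply to every instance and simply counts how many filter instances each approach needs for 1000 pretend monitoring objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model: none of these types come from the SDK.
class DataItem
{
    public Guid InstanceId;
    public double Value;
}

class ThresholdFilter
{
    private readonly double threshold;

    public ThresholdFilter(double threshold)
    {
        this.threshold = threshold;
    }

    public bool Matches(DataItem item)
    {
        return item.Value > threshold;
    }
}

static class RuleTargetingDemo
{
    static void Main()
    {
        // Assumption: an arbitrary threshold shared by every instance.
        const double Threshold = 30;

        // One data item for each of 1000 pretend monitoring objects.
        List<DataItem> items = new List<DataItem>();
        for (int i = 0; i < 1000; i++)
        {
            DataItem item = new DataItem();
            item.InstanceId = Guid.NewGuid();
            item.Value = (i % 2 == 0) ? 40 : 10;
            items.Add(item);
        }

        // Generic style: a single filter instance handles data for all objects.
        ThresholdFilter genericFilter = new ThresholdFilter(Threshold);
        int genericAlerts = 0;
        foreach (DataItem item in items)
        {
            if (genericFilter.Matches(item)) genericAlerts++;
        }

        // Targeted style: one filter instance exists per object.
        Dictionary<Guid, ThresholdFilter> targetedFilters =
            new Dictionary<Guid, ThresholdFilter>();
        int targetedAlerts = 0;
        foreach (DataItem item in items)
        {
            if (!targetedFilters.ContainsKey(item.InstanceId))
            {
                targetedFilters[item.InstanceId] = new ThresholdFilter(Threshold);
            }
            if (targetedFilters[item.InstanceId].Matches(item)) targetedAlerts++;
        }

        Console.WriteLine("Generic:  1 filter, {0} alerts", genericAlerts);
        Console.WriteLine("Targeted: {0} filters, {1} alerts",
            targetedFilters.Count, targetedAlerts);
    }
}
```

Both approaches raise the same alerts in this model; only the number of running filter instances differs, which is the cost the targeted rules pay as the instance count grows.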

The last object in the sample is System.Connectors.Test.Monitor. This monitor is an instance of the monitor type we defined earlier. It maps the GreenEvent state of the monitor type to the Success health state and the RedEvent state to the Error health state. It defines via configuration that events with id 1 will make the monitor go red and events with id 2 will make it go back to green. It also defines that an alert should be generated when the state goes to Error and that the alert should be auto-resolved when the state goes back to Success. Lastly, you'll notice the alert definition here actually uses the AlertMessage paradigm for alert name and description. This allows for fully localized alert names and descriptions.
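
The alert behavior described above can be sketched in plain C# as follows. This is an illustrative model, not SDK code: HealthState and MonitorAlertSketch are hypothetical names, and the sketch assumes that a repeated red event causes no state change and therefore no second alert.

```csharp
using System;

// Hypothetical illustration only: none of these types exist in the SDK.
enum HealthState { Success, Error }

class MonitorAlertSketch
{
    // Event ids as configured in the sample monitor.
    private const int RedEventId = 1;
    private const int GreenEventId = 2;

    private HealthState? state;   // null until the first matching event

    public bool AlertOpen { get; private set; }

    public void OnEvent(int eventId)
    {
        HealthState? newState = null;
        if (eventId == RedEventId) newState = HealthState.Error;
        else if (eventId == GreenEventId) newState = HealthState.Success;

        // Ignore unmatched events and events that do not change the state.
        if (newState == null || newState == state) return;
        state = newState;

        if (state == HealthState.Error)
        {
            AlertOpen = true;
            Console.WriteLine("Alert generated");
        }
        else if (AlertOpen)
        {
            AlertOpen = false;
            Console.WriteLine("Alert auto-resolved");
        }
    }
}

static class MonitorAlertDemo
{
    static void Main()
    {
        MonitorAlertSketch monitor = new MonitorAlertSketch();
        monitor.OnEvent(1);   // prints "Alert generated"
        monitor.OnEvent(1);   // no state change, prints nothing
        monitor.OnEvent(2);   // prints "Alert auto-resolved"
    }
}
```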

This monitor uses the targeted data source and thus will create an instance of this monitor per discovered object. We are working on a solution for monitors similar to the generic alert processing rules. It will be available in RTM; it's just not available yet.

Now, what doesn't work? Well, everything that uses events should work fine. For performance data, the targeted versions of workflows won't work, but the generic non-targeted ones will. Also, any string fields in the performance data item are truncated by 4 bytes, yay marshalling. Like I said earlier, these issues have been resolved in the latest builds.  

Here is some sample code to drive the example management pack:

using System;
using System.Collections.ObjectModel;
using Microsoft.EnterpriseManagement;
using Microsoft.EnterpriseManagement.Configuration;
using Microsoft.EnterpriseManagement.Monitoring;

namespace Jakub_WorkSamples
{
    partial class Program
    {
        static void DriveSystemConnectorLibraryTestManagementPack()
        {
            // Connect to the sdk service on the local machine
            ManagementGroup localManagementGroup = new ManagementGroup("localhost");

            // Get the MonitoringClass representing a Computer
            MonitoringClass computerClass =
                localManagementGroup.GetMonitoringClass(SystemMonitoringClass.Computer);

            // Use the class to retrieve partial monitoring objects
            ReadOnlyCollection<PartialMonitoringObject> computerObjects =
                localManagementGroup.GetPartialMonitoringObjects(computerClass);

            // Loop through each computer
            foreach (PartialMonitoringObject computer in computerObjects)
            {
                // Create the perf item (this will generate alerts from
                // System.Connectors.Test.AlertOnThreshold and
                // System.Connectors.Test.AlertOnThresholdForComputer)
                CustomMonitoringPerformanceData perfData =
                    new CustomMonitoringPerformanceData("MyObject", "MyCounter", 40);
                // Allows you to set the instance name of the item.
                perfData.InstanceName = computer.DisplayName;
                // Allows you to specify a time that data was sampled.
                perfData.TimeSampled = DateTime.UtcNow.AddDays(-1);
                computer.InsertCustomMonitoringPerformanceData(perfData);

                // Create a red event (this will generate alerts from
                // System.Connectors.Test.AlertOnEvent,
                // System.Connectors.Test.AlertOnEventForComputer and
                // System.Connectors.Test.Monitor
                // and make the state of the computer for this monitor go red)
                CustomMonitoringEvent redEvent =
                    new CustomMonitoringEvent("My publisher", 1);
                redEvent.EventData = "<Data>Some data</Data>";
                computer.InsertCustomMonitoringEvent(redEvent);

                // Wait for the event to be processed
                System.Threading.Thread.Sleep(30000);

                // Create a green event (this will resolve the alert
                // from System.Connectors.Test.Monitor and make the state
                // go green)
                CustomMonitoringEvent greenEvent =
                    new CustomMonitoringEvent("My publisher", 2);
                greenEvent.EventData = "<Data>Some data</Data>";
                computer.InsertCustomMonitoringEvent(greenEvent);
            }
        }
    }
}

 

System.Connectors.Library.Test.xml