Using SQL Server 2008 Management Data Warehouse for database monitoring in my application

SQL Server 2008 introduced what we call the Management Data Warehouse. The Management Data Warehouse is a relational database that contains data that is collected from a server using the new SQL Server 2008 Data Collection mechanism. The Warehouse consists primarily of the following components:

· An extensible data collector: https://msdn.microsoft.com/en-us/library/bb677248.aspx

· A database schema which is indirectly extensible by additions in the data collection: https://msdn.microsoft.com/en-us/library/bb677306.aspx

· Stored procedures which allow the DBA to create their own data collection set and own the resultant data collection items: https://msdn.microsoft.com/en-us/library/bb630337.aspx

· Three Data Collections Sets which are delivered with SQL Server 2008 and which can be enabled at any time: https://msdn.microsoft.com/en-us/library/bb964725.aspx

· Standard reports delivered with SQL Server 2008 Management Studio display data collected by the three predefined Data Collection Sets. For DBA created Data Collections, reports need to be generated by the DBA or the data can be queried with normal T-SQL queries.

However, this DBA/management toolset can also be used by ISVs to collect and monitor performance data of SQL Server through their applications. The Management Data Warehouse as delivered with SQL Server can be used as a framework by an application as a basis for database monitoring functionality. Due to the high flexibility of the Management Warehouse this is actually a very easy task. Using the functionality of this new component, as a database monitoring framework, can make support much easier; it can make it simple to find answers to questions like: ‘Was the performance issue users complained about a few hours ago rooted on the database side, the storage back-end, or somewhere in the application?’

In the following sections I'll describe the primary steps to enable SQL Server Management Data Warehouse and Data Collections:

After the customer deploys the application, the customer would need to configure the Management Data Warehouse with SQL Server Management Studio (Object Explorer à Management à Data Collection). Thereby the customer needs to define which database the database schema of the Management Data Warehouse needs to be deployed in. The customer can now be instructed to create a database of a certain name and a certain database file locations or to use a database the application already created during the ISV application installation. The application can reference this database and can query the database after the DBA assigns the application users into the ‘dbo’ role of the Management Warehouse database. SQL Server now will create the following entities in the Management Warehouse database:

· A user named ‘mdw_check_operator_admin’.

· A schema named ‘core’ with a set of tables:
https://msdn.microsoft.com/en-us/library/bb677306.aspx

· A schema ‘snapshot’ which contains the tables needed for the System Data Collector Set which are delivered with SQL Server 2008 already: https://msdn.microsoft.com/en-us/library/bb677306.aspx  

Other functionality required for making the whole warehouse operational is already deployed in SQL Server’s msdb database.

In order to start the collection of the 3 default collectors the customer needs to have SQLAgent started and needs to enable the 3 collection sets manually. However, these 3 collection sets only cover minimal aspects and often are not sufficient to detect whether the issue is on the database side or whether the issue is somewhere else. Additionally, if the performance problem is currently not present, but happened hours or days ago, it can be difficult to figure out what happened using the default collection mechanism.

Another possibility to enable the Management Warehouse and to configure it, can be done by T-SQL stored procedures as they are delivered in msdb.

Extending the Data Collections to fit our Need

The nice thing about the Management Data Warehouse is the fact that the data collector is a framework which allows the creation of user defined data collection sets and data collection items. The extension of the data collection or definition of an additional data collection also will extend the schema of the warehouse automatically. For an application vendor it means collecting what their typical support scenarios are. As a great example on how to extend such a collection set, one could script the 3 default collection sets into a Query Window. This can be done by marking one of the sets, click the right mouse button and select the options ‘Script Data Collection As’ à ‘Create to’ à ‘Query Window’. Unfortunately all the collections are ‘TSQL Query Collector Type’ and ‘Query Activity Collector Type based, whereas the SQL Server Data Collector also can be enhanced reading Windows Performance Counters and SQL Server Trace information. For different Collector types please see: https://msdn.microsoft.com/en-us/library/bb677328.aspx

In order to demonstrate how to extend the data collector we use one of the cases which keep us busy very often supporting ISV applications running on SQL Server and databases in general. ‘How was the I/O performance during the time the users were complaining of the application being so slow?’

Build a Data Collection which collects Query Performance Counters

Being faced with a support situation where hours or days ago users were complaining about performance of the application, it becomes extremely tricky to figure out what really happened. When everybody points to the database as source of the problems, it is hard to find any evidence at all unless the customer has a comprehensive set of Performance Counters recorded steadily. That however, is a rare case all too often. One cannot blame a customer not having a comprehensive Perfmon trace since the administration and archiving of such traces can be rather cumbersome. Therefore we want to show in our first case of extending the SQL Server 2008 Data Collector, a case where we collect Performance Monitor data. Opposite to the usage of the Perfmon on the OS side, the Data Collector and the Management Data Warehouse take responsibility of administering the data. First we’ll show the script to extend the Data Collector and then walk through it step by step.

So the script generating such a collection would look like:

use msdb;

Begin Transaction

Begin Try

Declare @collection_set_id_1 int

Declare @collection_set_uid_2 uniqueidentifier

EXEC [dbo].[sp_syscollector_create_collection_set]

      @name=N'Disk Performance and SQL CPU',

      @collection_mode=1,

      @description=N'Collects logical disk performance counters and SQL Process CPU',

      @target=N'',

      @logging_level=0,

      @days_until_expiration=7,

      @proxy_name=N'',

      @schedule_name=N'CollectorSchedule_Every_5min',

      @collection_set_id=@collection_set_id_1 OUTPUT,

      @collection_set_uid=@collection_set_uid_2 OUTPUT

Select @collection_set_id_1, @collection_set_uid_2

Declare @collector_type_uid_3 uniqueidentifier

Select @collector_type_uid_3 = collector_type_uid From [dbo].[syscollector_collector_types] Where name = N'Performance Counters Collector Type';

Declare @collection_item_id_4 int

EXEC [dbo].[sp_syscollector_create_collection_item]

@name=N'Logical Disk Collection and SQL Server CPU',

@parameters=N'<ns:PerformanceCountersCollector xmlns:ns="DataCollectorType">

      <PerformanceCounters Objects="LogicalDisk"

            Counters="Avg. Disk Bytes/Read"

            Instances="*" />

      <PerformanceCounters Objects="LogicalDisk"

            Counters="Avg. Disk Bytes/Write"

            Instances="*" />

      <PerformanceCounters Objects="LogicalDisk"

            Counters="Avg. Disk sec/Read"

            Instances="*" />

      <PerformanceCounters Objects="LogicalDisk"

            Counters="Avg. Disk sec/Write"

            Instances="*" />

      <PerformanceCounters Objects="LogicalDisk"

            Counters="Disk Read Bytes/sec"

            Instances="*" />

      <PerformanceCounters Objects="LogicalDisk"

            Counters="Disk Write Bytes/sec"

            Instances="*" />

      <PerformanceCounters Objects="Process"

            Counters="% Privileged Time"

            Instances="sqlservr" />

      <PerformanceCounters Objects="Process"

            Counters="% Processor Time"

            Instances="sqlservr" />

</ns:PerformanceCountersCollector>',

@collection_item_id=@collection_item_id_4 OUTPUT,

@frequency=5,

@collection_set_id=@collection_set_id_1,

@collector_type_uid=@collector_type_uid_3

Select @collection_item_id_4

Commit Transaction;

End Try

Begin Catch

Rollback Transaction;

DECLARE @ErrorMessage NVARCHAR(4000);

DECLARE @ErrorSeverity INT;

DECLARE @ErrorState INT;

DECLARE @ErrorNumber INT;

DECLARE @ErrorLine INT;

DECLARE @ErrorProcedure NVARCHAR(200);

SELECT @ErrorLine = ERROR_LINE(),

       @ErrorSeverity = ERROR_SEVERITY(),

       @ErrorState = ERROR_STATE(),

       @ErrorNumber = ERROR_NUMBER(),

       @ErrorMessage = ERROR_MESSAGE(),

       @ErrorProcedure = ISNULL(ERROR_PROCEDURE(), '-');

RAISERROR (14684, @ErrorSeverity, 1 , @ErrorNumber, @ErrorSeverity, @ErrorState, @ErrorProcedure, @ErrorLine, @ErrorMessage);

End Catch;

GO

Now let’s go through it step by step. As mentioned previously; vital functionality already is delivered in the SQL Server msdb database. Therefore we execute the script above in msdb.

The first step is to create an additional Data Collection Set. The name 'Disk Performance and SQL CPU' will show up in SQL Server Management Studio and is the name we will use to query for results.

Also very important information to define is the data retention period with the parameter named @days_until_expiration (in the example above this is set to 7). Since we don’t want to end up blowing the volume of the Management Data Warehouse beyond the size of the application to monitor, this parameter needs to be set appropriately. Data beyond that expiration age gets purged.

Another important piece of information to provide is the schedule which should be used to execute the collection. The creation of the Management Data Warehouse created pre-defined schedules with the following names:

· CollectorSchedule_Every_5min

· CollectorSchedule_Every_10min

· CollectorSchedule_Every_15min

· CollectorSchedule_Every_30min

· CollectorSchedule_Every_60min

· CollectorSchedule_Every_6h

The names also can be found in msdb.dbo.sysschedules. Sure, DBA created schedules can be defined and can be used instead of these defaults as well.

Another parameter is worth mentioning; In the Books Online Documentation about the Data Collector that we pointed out earlier, it is mentioned that one can cache the data for a while before uploading into the Management Data Warehouse. Whether one wants to do so is determined with the parameter called @collection_mode. In our case the value of 1 means a direct upload after the collection without any caching.

The data about the collection set now is entered into a table in msdb. The important thing now is to get the ID and UID of the collection set which are delivered as output of the procedure to create the collection set.

The set of collections exists already and their names can be retrieved out of msdb.dbo.syscollector_collection_sets_internal.

The second step is to get the UID of one of the 4 different Data Collection types we mentioned above already. This is done with this query in the script:

Select @collector_type_uid_3 = collector_type_uid From [dbo].[syscollector_collector_types] Where name = N'Performance Counters Collector Type';

 

The other 3 Data Collection types would need to be defined by these names:

· Generic T-SQL Query Collector Type

· Generic SQL Trace Collector Type

· Query Activity Collector Type

And n

ow, the third and probably least documented step so far. The specific data collection item is going to be defined. The first parameter is the name of the collection. The second parameter which simply has the name ‘parameter’ defines what really has to happen. As one can see it is a XML structure which has kind of a one line header and then the same structure for any of the definition of a counter again. Looks pretty simple as one can see above where we define 6 different counters of Logical Disks over all disk partitions visible to the server. We also added collecting the Privileged CPU and the overall CPU SQL Server is consuming. Everything is set now. One just needs to enable the new Data Collection Set with the stored procedure sp_syscollector_start_collection_set: https://msdn.microsoft.com/en-us/library/bb630366.aspx

Once SQLAgent is started, the data collection starts.

How to read the data

In order to programmatically read the data one needs to look a bit into the different tables of the Management Data Warehouse schema where the data is getting stored in:

snapshots.performance_counter_values: will store the raw and formatted data identified with a performance_counter_instance_id, the snapshot_id and the date of the snapshot. So we already have two important components of data with a query like this:

--IMPORTANT - run the following queries in the database that contains the DMW schemas

select spcv.formatted_value as 'Formatted Value',

spcv.collection_time as 'Collection Time'

from snapshots.performance_counter_values spcv

order by spcv.collection_time desc

In order to get the Path and the name of the counters based on the id, we need to join the table snapshots.performance_counter_instances with the table containing the values. A query could look like this:

select spci.path as 'Counter Path', spci.object_name as 'Object Name',

spci.counter_name as 'counter Name', spci.instance_name,

spcv.formatted_value as 'Formatted Value',

spcv.collection_time as 'Collection Time'

from snapshots.performance_counter_values spcv,

snapshots.performance_counter_instances spci

where spcv.performance_counter_instance_id = spci.performance_counter_id

order by spcv.collection_time desc

So far so good. However the data we are selecting so far might be from different collection sets even from different database instances (if one allows centralizing). Therefore we need to restrict the data now to the data collected by our custom build data collection set. In order to do so we need to look at several different tables to make this connection. At the end the query looks like:

select spci.path as 'Counter Path', spci.object_name as 'Object Name',

spci.counter_name as 'counter Name', spci.instance_name,

spcv.formatted_value as 'Formatted Value',

spcv.collection_time as 'Collection Time',

csii.instance_name as 'SQL Server Instance'

from snapshots.performance_counter_values spcv,

snapshots.performance_counter_instances spci,

msdb.dbo.syscollector_collection_sets_internal scsi,

core.source_info_internal csii,

core.snapshots_internal csi

where spcv.performance_counter_instance_id = spci.performance_counter_id and

scsi.collection_set_uid=csii.collection_set_uid and

csii.source_id = csi.source_id and csi.snapshot_id=spcv.snapshot_id and

scsi.name = 'Disk Performance and SQL CPU'

order by spcv.collection_time desc

Well, that is the first example of how one can extend the SQL Server 2008 Data Collector and use the Management Data Warehouse as a base for monitoring through an application. We’ll continue the series introducing more extensions covering other areas of monitoring. We hope you enjoy this powerful, customizable feature.