SQL Server Performance Baselining Reports Unleashed for Enterprise Monitoring !!!


In my previous blog post, I shared how you can leverage SQL Server Performance Dashboard SSRS Reports for monitoring and point in time troubleshooting for large deployments of SQL Server. In this post, we will talk about why dbas should establish SQL Server Performance baselines for their environment and how you can leverage the SQL Server Performance Baselining reporting solution shared in our Tiger Toolbox Github repository to achieve that. The solution is already being deployed and leveraged by some of our customers and further contributed by SQL community members like SQLAdrian in GitHub.

Why SQL Server Performance Baselining is important?

  • Performance Tuning – DBAs, Consultants and Support team often get called in a situation when the business application running on SQL Server is running slow. Before troubleshooting, one of the first question to ask is how is the performance measured and what was the performance previously ? In majority of the situations, there are changes in the workload, application codes which leads to changes in performance but one can prove it only when the previous state of the SQL Server was captured. In such situations, performance baselining can assist you in learning from the historical data and trends to detect the anomalies or changes on the pattern on the server which may have caused performance issues observed.
  • Capacity Planning – For DBAs managing large scale deployments and mission critical instances of SQL Server, it is important to proactively keep an eye on resource utilization (CPU, Memory, IO and storage) and workload trend over a period, to forecast and plan for more capacity as the workload trend and resource utilization is changing. For capacity planning, performance baselining reports are the key to forecast and predict capacity required for future by analyzing the historical trends.
  • Consolidation Planning – As shadow IT and number of business applications running SQL Server grows in enterprises, companies can save cost by consolidating some of their databases under single instance for efficiently utilizing their hardware and resources. To plan resources and consolidate SQL Server databases, again performance baselining is required.
  • Virtualization\Migration to Azure – Most enterprises today are looking to reduce their IT capital and operational expenses and overheads by migrating to cloud. When migrating to cloud, it is important to identify right VM size or performance tier to run your SQL Server databases which is easy when you have performance baselines captured and established.

In general, Performance Baselining is capturing last known good state metrics which can be used as a starting point for comparison. Performance monitoring is the first step for baselining. In a production environment, the goal of performance monitoring should be to be non-invasive with minimal overhead. Baselining always requires all the metrics to be captured across time dimension to perform historical time analysis. Historical analysis of performance data is useful for anomaly detection or to forecast future performance based on current workload.

In SQL Server, at a very high level, we have 3 types of performance monitoring data available to capture the state of SQL Server

  • Perfmon
  • DMVs (dynamic management views)
  • Xevents

In SQL Server, one needs to capture at least following details at minimum over time to successfully establish comprehensive performance baseline for SQL Server instance.

Processor Information

  • Processor(*)% Processor Time
  • Process(sqlservr)% Processor Time
  • Processor(*)% Privileged Time
  • Process(sqlservr)% Privileged Time

Memory

  • Available Mbytes
  • Memory Pages/sec
  • Process(sqlservr)Private Bytes
  • Process(sqlservr)Working Set
  • SQLServer: Memory Manager Total Server Memory
  • SQLServer: Memory Manager Target Server Memory

Physical Disk

  • PhysicalDisk(*)Avg. Disk sec/Read
  • PhysicalDisk(*)Avg. Disk sec/Write
  • PhysicalDisk Avg. Disk Queue Length
  • Disk Bytes/sec
  • Avg Disk Bytes/Transfer
  • Process(sqlservr)IO Data Operations/sec

Network

  • Network InterfaceBytes Received/sec
  • Network InterfaceBytes Sent/sec
  • Network Interface(*)Output Queue Length

SQL Server Workload

  • SQL Server: SQL Statistics Batch Requests/sec
  • SQL Server: SQL Statistics SQL Compilations/sec
  • SQL Server: SQL Statistics SQL Recompilations/sec
  • sys.dm_exec_requests
  • Sys.dm_exec_sessions
  • sys.dm_exec_connections
  • Max Workers
  • Workers Created
  • Idle Workers
  • Deadlocks

Waits

  • SQL Server: Wait Statistics
  • Sys.dm_os_waiting_tasks
  • Sys.dm_os_wait_stats
  • Latch Waits > 15 sec
  • Locks > 30 sec
  • IO Latch Timeouts
  • Long IO Status

Database Indexes

  • Sys.dm_db_index_physical_stats
  • Sys.dm_db_index_operational_stats
  • Sys.dm_db_index_usage_stats

When monitoring large deployments of SQL Server instances, if you capture all the above data on a central location or central SQL Server database, the central database becomes a bottleneck or the point of contention. To scale the performance baselining solution for large deployments of SQL Server, we use the same architecture as discussed in SQL Server Performance Dashboard SSRS Reports as shown below. In this architecture, the performance data is captured locally in a database called dba_local with performance monitoring SSRS reports hosted on a central monitoring server. The SSRS reports are parameterized and have dynamic connection string which connects to SQL Server instance specified in the report parameter. The report queries the dba_local database from the specified SQL Server instance to report performance data captured over time.

 

If you download SQL Performance Baselining Reporting Solution from GitHub, you will find scripts to create a database and monitoring jobs which needs to be ran on each instance of SQL Server you wish to monitor. Following are steps to deploy SQL Performance Baselining Reporting solution,

Data Collection Steps for each SQL Instance to Monitor

  1. Connect to SQL instance to monitor
  2. Run CREATEDATABASE.sql
  3. Run CREATEOBJECTS.sql
  4. Run CREATECOLLECTIONJOB.sql
  5. Check SQL Agent JOBs History to see if it runs successfully
  6. Repeat for each SQL Instance you want to monitor

Setting up & Deploying Reporting

To deploy Performance baselining SSRS reports, you can leverage the same central monitoring SSRS server hosting SQL Server Performance dashboard report. As mentioned in SQL Server Performance dashboard blog post, you can use the same steps to setup and configure SSRS server by deploying reports using SSDT or Report Manager. Alternatively, you can use PowerShell script created by Aaron (b|t) with some modifications mentioned in the previous blog post as well.

Once the reports are deployed, you can use the following reports to establish your performance baseline

The above report is drillthrough report so when you click on the chart for any specific data, it takes you to the daily performance trend report for that day as shown below

In the previous report, you observed that around 8PM, the sessions in suspended state increases which indicates high blocking causing performance slowness or degradation. Around same time, most sessions are waiting on LCK_M_SCH_M and disk write latency is high around the same. And later we found

Ohh, our reindexing job runs at around 8PM and we don’t use WITH ONLINE=ON while reindexing which explains the blocking.

The Memory Utilization report shows you the trend of SQL Server Memory and Available Memory on the server. From the report, you can easily deduce, Available Memory on the server went down at around 2:30PM which caused external memory pressure causing target server memory to fall below total server memory in SQL Server. SQL Server responded to the memory by increasing the lazy writer activity to free up pages which in turn drops the page life expectancy on the server.

Finally, the database storage report gives you a quick glance of the free space on all the databases hosted on the SQL Server.

Hope you find the reports useful and if you are a SSRS developer, I would encourage you to design some cool reports for performance monitoring using perfmon, DMV or Xevents data and contribute back to the community by sending a pull request to our solution in GitHub repository.

Parikshit Savjani
Senior PM, SQL Server Tiger Team
Twitter | LinkedIn
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

Comments (17)

  1. Jay says:

    Thank you for sharing.

  2. Awesome solution, Thank you very much.

    I am working to integrate your solution with PowerBI and will post my solution.

    B/w if you can add below steps to enable cmdshell, it will resolve job failure issue for user like me who got this error.

    sp_configure ‘xp_cmdshell’, 1
    go
    reconfigure

  3. Ken says:

    Well this is deliciously fantastic, thanks for sharing!

  4. Rafael says:

    Thanks a lot for sharing.
    The first time I saw these reports was in youtube.

  5. Rogelio Dominguez says:

    AMAZING!!

  6. Awesome, thank you. SQL Server needs built-in solution like that.

    1. Marcus Klemp says:

      Ever try Data Collectors? Surprised they did not mention them.

  7. Pete says:

    Thank you. Its a good tool and framework.

    I noticed it only works out of the box for default instances. Had some issues getting the dashboard to work on named instances. Any guidance on how we can use this on named instances without much tweaking would be cool.

    1. Pete, it should work for named instances as well by specifying the [servername\instancename] or using servername, portno.

      1. londondba says:

        Yes, there is an issue for named instances and the spGetPerfCountersFromPowerShell stored procedure. See this line in that stored proc:
        REPLACE(REPLACE(REPLACE(ct.[output],’\\’+@@SERVERNAME+’\’,”),’ :’,”),’sqlserver:’,”)[CounterName]
        It only strips out the sql server name from the leading part of the counter name if the sql server is a default instance. That sql code doesn’t cater for named instance where the leading part is of the form \\myserver\mssql$myinstance:. So you end up with entries like this in the PerformanceCounter table:
        \\myserver\mssql$myinstance:buffer manager\buffer cache hit ratio

        I’ve replaced that one line with this to overcome the issue:

        REPLACE(
        CASE WHEN LOWER(ct.[output]) LIKE ‘%\mssql$%’ OR LOWER(ct.[output]) LIKE ‘%sqlserver:%’
        THEN SUBSTRING(ct.[output],CHARINDEX(‘:’,ct.[output])+1,500)
        ELSE SUBSTRING(ct.[output],LEN(@@SERVERNAME)-CHARINDEX(‘\’,REVERSE(@@SERVERNAME))+4,500)
        END,
        ‘ :’,”
        ) [CounterName]

        Hoping someone can fix this in the github code repository.

      2. londondba says:

        Thanks very much for making this solution available to the SQL Server community. While testing it I like what I see. A couple of enhancements to the SQL Performance Baselining Reports would be good:
        1) I’m UK based so my computer regional settings show dates in dd/MM/YYYY format. I notice that almost all of your PerfCtr data sets present the datesampled on the charts using CONVERT(varchar(20),DateSampled,101) as ‘Date’ ie US format MM/dd/YYYY. That’s no real problem just presenting them a chart. But the Daily Performance Trend has an OnAction for the Available Memory Trend chart to go to the SQL Memory Utilization Report. The Date in US format is passed to this report which immediately errors with an incorrect date format error. Similarly the OnAction to go to the User Sessions Details report from the Daily Performance Workload Trend report.
        Also, the language for all your reports is harcoded to be en-US so with these issues I’ve resorted to setting my workstation region to US.
        Would be really good to eliminate these date format issues from these SSRS reports.
        2) the Database Storage report doesn’t take into account mount points. One of its data sets uses xp_fixeddrives which completely ignored mount points. Would be really good to address this issue. The report would be very useful then.

        Thanks again for the nice addition to the toolbox !

  8. Arpit_Mehra says:

    Hello,

    Report is empty. I am missing something? Please advise.

    1. Arpit, if the SQL Agent jobs are not running and populating data, you will see empty reports. You should check with Data collection is setup accurately.

  9. Lex says:

    Hi, the job DBA – PerfMon Counter Collection failed with the following error –> Arithmetic overflow error converting expression to data type money. Can u help me to fix it?. THX!

    1. Lex says:

      I found the error, i must replace the comma (,) for the point (.)

  10. Lex says:

    Hi, i found that sometimes the numbers of perfmon counters have an “E-05” in the characters this significant that this number is exponent 10 (-5), so have to divide the number 100000 to obtain the real value for this counter.

Skip to main content