Very often when investigating ASP.NET or other .NET performance issues we find threads stuck waiting for SQL Server or some other database to respond.
I am by no means an expert in SQL Server so I thought I’d invite one of the escalation engineers, Graham Kent, in the SQL Server support team to write a little bit about how you can determine if there is a performance issue in SQL Server or if you can discard SQL Server as the root cause of your perf issue.
Graham Kent, SQL Server Escalation Engineer
In this first installment he is talking about how to collect SQL Server performance data with PSSDiag
Part 1: Collecting the right data
SQL Server performance is a huge subject in its own right, but sometimes I chat with people who work with SQL Server less often, and they’re looking for a quick way to help them analyze whether they have a SQL Server performance problem at the root of an application problem that they’re debugging. I was chatting with Tess the other day and we thought this would make a good post. After all there’s no point debugging asp.net if your bottleneck is somewhere completely different.
If you ever talk to any of my colleagues in the SQL Server PSS support team, and you have a suspected SQL Server performance problem, you’ll undoubtedly be asked to run the PSSDIAG tool which usually collects the data that we need. This tool acts as a one stop shop for us to pick up a variety of data sets, including SQL Profiler, Performance Monitor, SQL Server configuration information, SQL Server logs, blocking scripts and many more custom scripts of our choosing. We published a version of the tool for SQL Server 7 and 2000 under http://support.microsoft.com/kb/830232 . Although we still maintain a version for SQL Server 2005, the tool was such a success that it began to be incorporated into the core product itself thereby enabling people to analyze their own performance problems more easily, or collect the data that PSS would need to help them.
Nowadays the public version of the tool is named SQLDIAG and installs by default with all SQL Server 2005 installations. You can read about its options in Books Online http://msdn2.microsoft.com/en-us/library/ms162833.aspx.
It works very simply in default mode, try it now on a test box: open a command prompt, switch to this directory
C:\Program Files\Microsoft SQL Server\90\Tools\binn
The output willl look something like this:
By default it connects to the local machine name, and every SQL 2005 instance it can find. There are many switches and options and a detailed configuration file that can be used to tailor the process (more on this later) but if you’re experiencing a problem and you want to grab some data super quickly without thinking too much, you can just do this. The collector is now writing out data to a sub-directory named sqldiag, and as it says, when you’re ready stop, simply press Ctrl+C. The collector might take a few minutes to shutdown depending upon the speed of your server. When it’s finished, take a look in the sqldiag sub-directory and you’ll see a collection of files. By default you’ll have collected the following:
- Default Trace files (if server option enabled)
- sqldiag output text file (contains sql error logs, configuration information, internal performance counters , the output from several key DMVs and much more)
- sqldumper error log
- log and control files for the sqldiag process.
I’ll talk about interpreting this data at another time, but you’ve already collected some interesting data about your SQL Server instance and what it was up to whilst the collector was running.
What you should also notice is that when you ran the tool in this mode for the first time, it unpackaged several .XML files which it uses for configuration purposes. These are as follows:
By default if you don’t use the /I switch (to specify an input file) the tool uses the SQLDiag.XML config file and you get the results as above. The true power of SQLDiag in my opinion though is in amending these config files and collecting data sets of your choice. The extra config files automatically created have the following collections turned on:
- Windows Event Logs
- SQL Profiler (including several verbose events such as performance statistics, SQL:BatchStarted and SQL:BatchCompleted)
- Windows Event Logs
- SQL Profiler (with less verbose events)
Within both the perfmon and SQL Profiler events, you have the ability to configure every potential sub event on or off, based upon the situation that you wish to monitor. All you do is edit a copy of the XML file and change individual elements like these from “true” to “false” and vice versa.
<PerfmonObject name="\MSSQL$%s:Buffer Manager" enabled="true"> <PerfmonCounter name="\*" enabled="true" /> <PerfmonCounter name="\Buffer cache hit ratio" enabled="true" /> <EventType name="Locks"> <Event id="148" name="Deadlock Graph" enabled="true" description="Occurs when an attempt to acquire a lock is canceled because the attempt was part
of a deadlock and was chosen as the deadlock victim. Provides an XML description of a deadlock." />
In many cases though, one of the default templates with contain enough info for you to make an initial assessment of situation. To run sqldiag with one of the other config files (or one you create yourself) use the following syntax:
and you might also want to add a new output directory:
sqldiag.exe /ISD_General.XML /Oc:\sqldiagoutput
One thing that the default configs don’t enable though is the blocking collector, which I do think is particularly useful. To enable this collector you need to create a copy of the SQLDiag.XML and edit the following section:
<BlockingCollector enabled="false" pollinginterval="5" maxfilesize="350"/>
Changing the BlockingCollector enabled value to “true” will turn on the blocking collector, and provide a blocked process report for any blocking experienced whilst the collector is running. This is a particularly common cause of application performance issues and well worth enabling.
Over time you’ll find it useful to build up a selection of your own config files, and then you can just copy to them to any server you wish to monitor, based upon the situation you want to look at. For example I keep custom ones to hand which focus on the following subjects
- Memory Usage
- Blocking and general Profiler Trace
- Disk Performance
- Profiler Trace
Remember though, if you enable a custom configuration file like the ones above, your collections can be extremely verbose. You should therefore consider the following:
- Disk Space for output: potentially many GBs on heavy loaded machines
- Disk Drive chosen for output: If you want to monitor disk performance don’t write the output to the same drive, and don’t ever write the output to a network drive
- Verbose events such as performance statistics can place a heavy CPU load on a server. I only enable these if I’m certain that I need them
Finally you might wish to configure the collector to only run against specific machines or instances, and following XML elements allow you to do this:
Change the machine name element to the name of the server you wish to connect to (the default of “.” means connect to the local machine)
<Machines> <Machine name=".">
Or use the /M switch
Change the instance name element to the SQL Server instance that you wish to connection to (the default of “*” connects to every instance it can on the machine specified above.
<Instance name="*" windowsauth="true" ssver="9" user="">
In Part II Graham will show an example collection and talk about interpreting some simple results.
Let me know what you thought about this “guest blog” and if there are any other topics you would like to see that perhaps fall a bit outside of my area of expertise.