As the market share of SQL Server grows over time, there has been a increasing demand for SQL Server performance tuning. There has been various approaches taken by different teams and individuals for improving the performance of the SQL Server, and I think it might be beneficial to the community to have the basic steps documented for troubleshooting and improving the SQL Server Performance for various applications.
For SQL Server to perform optimally, monitoring and optimizing the SQL Server Disk sub-system is one of the important aspects. We have very specific disk performance requirements.
Avg. Disk Sec/Read is the average time, in seconds, of a read of data from the disk. The following list shows ranges of possible values and what the ranges mean:
Less than 10 ms - very good
Between 10 - 20 ms - okay
Between 20 - 50 ms - slow, needs attention
Greater than 50 ms – Serious I/O bottleneck
Identifying IO bottlenecks:
You can use the following Performance Monitor counters (perfmon.exe) to correctly identify disk performance issues.
- PhysicalDisk Object: Avg. Disk Queue Length represents the average number of physical read and write requests that were queued on the selected physical disk during the sampling period. If your I/O system is overloaded, more read/write operations will be waiting. If your disk queue length frequently exceeds a value of 2 during peak usage of SQL Server, you might have an I/O bottleneck.
- Avg. Disk Sec/Read is the average time, in seconds, of a read of data from the disk.
- Avg. Disk Sec/Write is the average time, in seconds, of a write of data to the disk. The guidelines for the Avg. Disk Sec/Read values apply here.
- Physical Disk: %Disk Time is the percentage of elapsed time that the selected disk drive was busy servicing read or write requests. A general guideline is that if this value is greater than 50 percent, there is an I/O bottleneck.
- Avg. Disk Reads/Sec is the rate of read operations on the disk. Ensure that this number is less than 85 percent of the disk capacity. The disk access time increases exponentially beyond 85 percent capacity.
- Avg. Disk Writes/Sec is the rate of write operations on the disk. Ensure that this number is less than 85 percent of the disk capacity. The disk access time increases exponentially beyond 85 percent capacity.
For more information on how-to create a Data Collector Set from Performance Monitor check: http://technet.microsoft.com/en-us/library/cc722148.aspx
Disk Drive Placements:
You need to use separate drives for different purposes.
Individual disk latency requirements:
Database >15ms, Logs> 2ms Tempdb> 2ms
Do not use the default allocation unit when formatting the Disk drives for SQL Server Log files & Data files:
Format drives with 64k Cluster Allocation Unit
Antivirus programs can create issues with SQL Server functionality, and it is important to exclude them from their scope, by adding them to the exclusions list.
File types to exclude:
*.mdf, *.ndf, *.ldf, *.bak
Always assign the maximum memory setting on the SQL Server Instance level properties
Note: The maximum memory setting is only for the SQL Server buffer cache, and does not include memory requirements for other SQL Server functions like replication etc.
In order to determining the Non-Buffer Pool Usage, use the following:
Requirements other than SQL Server’s buffer pool.
· 2GB for Windows OS. Recommended <3 GB for 64 Bit systems.
· x times GB for SQL Server worker threads. You configure out how many threads your instance will use. Each thread will use 0.5MB on x86, 2MB on x64 and 4MB on Itanium.
· 1GB for multi-page allocations, linked servers, and other consumers of memory outside the SQL Server.
· 1–3GB for other applications that might be running on the system, such as backup programs.
For example, on a server with eight CPU cores and 16GB of RAM running SQL Server 2012 x64 and a third-party backup utility, you would allow the following:
· 3GB for Windows (2GB for 32 Bit Windows)
· 1GB for worker threads (576 × 2MB rounded down)
· 1GB for MPAs, etc.
· 1-2 GB for the backup program.
You can find more information on ‘max worker threads Option’ here: http://technet.microsoft.com/en-us/library/ms187024(v=sql.105).aspx (For SQL Server 2008).
By default, you only one Datafile and a transaction log file. However for optimal performance, follow the recommended best practices given below:
Storage Planning for TempDB
- Set the recovery model of TempDB to SIMPLE. This model automatically reclaims log space to keep space requirements small.
- Do not allow for TempDB files to automatically grow. This reduces the CPU overhead of managing a dynamic file growth.
- Have multiple data file for TempDB (Total Number of TempDB Primary Data File = Number of Processors available to SQL)
- Each data file should be of equal size.
- Try to keep each data files in separate disk drives for IO Parallelism.
- TempDB Data and Log files should be kept in faster disk drive (Preferably RAID 1 if possible)
- Use RAID-10 or SSD Disks.
- Pre-size TempDB files
- 25% of largest DB size.
- Set Auto Growth to fixed size < 200 MB
- You should have the same number of data files as the number of CPUs up to a maximum of 8.
Enable the Lock Pages in Memory Option (Windows)
This Windows policy determines which accounts can use a process to keep data in physical memory, preventing the system from paging the data to virtual memory on disk.
This can give you improved performance, especially when there is a memory pressure.
Max Degree of Parallelism
Defines how many CPUs can execute parallel queries
Many Microsoft products like SharePoint & MS Dynamics CRM have a recommendation of setting this to ‘1’.
For LOB applications, when you see that there are too many CXPACKETS waits in your SQL Server, you can consider setting this option to ‘1’.
Fill Factor Settings
If your SQL Server have a very high number of transactions, and your index is having high level of fragmentation, set the Fill Factor Setting to ‘80’.
Detecting index fragmentation:
Performance Monitor (Perfmon.exe)
In order to capture SQL Server specific perfmon counters, you can use the following:
Processor: % Processor Time - Should average below 75% (and preferably below 50%).
System: Processor Queue Length Should average below 2 per processor. For example, in a 2-processor machine, it should remain below 4.
Memory—Pages/sec Should average below 20 (and preferably below 15).
Memory—Available Bytes Should remain above 50 MB.
Physical Disk—% Disk Time Should average below 50%.
Physical Disk—Avg. Disk Queue Length Should average below 2 per disk. For example, for an array of 5 disks, this figure should average below 10.
Physical Disk—Avg. Disk Reads/sec Used to size the disk and CPU. Should be below 85% of the capacity of the drive.
Physical Disk—Avg. Disk Writes/ sec Used to size the disk and CPU. Should be below 85% of the capacity of the drive.
Network Interface—Bytes Total/sec Used to size the network bandwidth.
SQL Server: Buffer Manager— Buffer Cache Hit Ratio Should exceed 90% (and ideally approach 99%).
SQL Server: Buffer Manager—Page Life Expectancy Used to size memory. Should remain above 300 seconds.
SQL Server: General Statistics— User Connections Used to size memory.
SQL Server: Databases— Transactions/sec Used to size disks and CPU.
SQL Server: Databases—Data File(s) Size KB Used to size the disk subsystem.
SQL Server: Databases—Percent Log Used to size the disk subsystem.
Hope you find this information useful.
In my next blog post, I will talk more about Performance Monitoring with the help of various tools.