SQL Performance Tuning – Part 1: Overview

SQL performance tuning is a huge topic, and you need detailed knowledge of behavior across different versions of the OS and SQL Server. I will share my general approach to this work in 4–5 posts.

Before Tuning

The first thing to keep in mind is that end users don’t care about SQL; they care about the overall application response time. You have to know your application. Sometimes we need to tune how the application accesses the database. One example: serve your blog posts as static pages instead of dynamic pages that query the database on every request.

The second thing is that database performance depends on application behavior (queries), and application behavior can change over time, so tuning is not a one-time action. You should create a performance baseline and keep monitoring your environment. The baseline tells you how SQL Server runs when the application response time is acceptable, which shortens the time it takes to spot what’s abnormal.

Unlike other products (AD, Exchange), SQL Server and IIS are platforms that run your applications, and each SQL Server instance has its own performance baseline. You can’t simply rely on one general baseline; create one for each critical application. You could tune SQL Server without a baseline, but it will take you longer to find out what’s abnormal.

The baseline should cover:
1. OS-level performance counters such as CPU, memory, disk, and network.
2. SQL instance-level performance counters such as buffer pool and access methods.
3. SQL instance-level DMV info such as top N databases, top N wait stats, and top N queries.
4. (Optional) Application-level key workload counts.
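As a sketch of item 3, this is one way to snapshot top waits for a baseline, assuming SQL Server 2005 or later where the sys.dm_os_wait_stats DMV exists; the filtered wait-type list is illustrative, not exhaustive:

```sql
-- Baseline snapshot: top 10 waits by accumulated wait time since restart.
-- A handful of benign system waits are filtered out; extend the list
-- to fit your environment.
SELECT TOP 10
    wait_type,
    wait_time_ms,
    waiting_tasks_count,
    wait_time_ms - signal_wait_time_ms AS resource_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'WAITFOR',
                        'SQLTRACE_BUFFER_FLUSH', 'CHECKPOINT_QUEUE',
                        'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC;
```

Run the same query on a schedule and keep the results; the baseline is the trend, not a single sample.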

Load testing before application release is very important; you can build the initial baseline there. Although it’s hard to simulate every end-user query pattern, it is useful later on for seeing how real production load differs from the load test.

The third thing is to set your tuning target and prioritize the findings. My two cents: there is no need to change code immediately if the overall application response time is acceptable. You can put the finding into a potential-issue list, but you need to keep monitoring the performance data and plan the engineering time to fix these potential issues before they become real problems.

One example: you find that the full scan count is high, but the database is small and physical memory is sufficient, so most reads are logical reads rather than physical reads. There is little performance impact at this moment, but be aware that logical reads will increase as the tables grow, and logical reads will turn into physical reads once the database becomes larger than the buffer pool.
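As a rough sketch of that example, assuming access to sys.dm_os_performance_counters, you can watch the relevant counters from inside SQL Server; the LIKE patterns are deliberately loose, since the object_name prefix differs between default and named instances:

```sql
-- Full scan and page read counters. The 'per second' counters are
-- cumulative raw values in this DMV, so sample twice and take the delta.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE (object_name LIKE '%Access Methods%' AND counter_name = 'Full Scans/sec')
   OR (object_name LIKE '%Buffer Manager%'
       AND counter_name IN ('Page reads/sec', 'Page life expectancy'));
```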


Tuning Workflow
Typical tuning workflow is like this:
1. Data collection. Several kinds of information can be captured with different tools, such as SCOM SQL MP, msinfo32, eventvwr, the SQL Server error log, perfmon, and Profiler. I will demo how to use SQL MDW data for analysis.
2. Perfmon analysis – OS. Find out whether any hardware resource is a bottleneck.
3. Perfmon analysis – SQL. Find out how SQL Server consumes hardware resources.
4. Top N databases by bottleneck resource. Use Profiler output or DMVs to find out how the databases consume that resource. Move them to different instances or tune their top N queries.
5. Top N queries by bottleneck resource. Use Profiler output or DMVs to find the top N queries, then tune their single-execution cost or reduce their execution count.
6. Wait stats. SQL query duration = CPU time / number of CPUs in use + wait time. Both duration and CPU time can be found in Profiler output and DMVs, so you can track the change in wait time. If wait time becomes higher, you should dig into the wait stats.
7. Check the top N queries’ execution plans and tune them case by case. There are two performance tuning scenarios: the first is a single slow query, the second is concurrency related. This step covers the single-slow-query scenario; the latter is more related to locking/blocking.
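For steps 5 and 6, a common DMV-based sketch (assuming sys.dm_exec_query_stats, so only plans still in cache are visible) lists the top queries by CPU with elapsed time alongside, since a large gap between elapsed time and CPU time points at wait time:

```sql
-- Top 10 cached statements by total CPU. Times are in microseconds;
-- stats are lost when a plan is evicted or the instance restarts.
SELECT TOP 10
    qs.total_worker_time / 1000  AS total_cpu_ms,
    qs.total_elapsed_time / 1000 AS total_elapsed_ms,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset END
          - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```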


--Posted by Shiyang Qiu, July 9, 2016

Comments (2)

  1. Any posts on database performance that bring a new perspective are more than welcome. I have a few comments if you allow me.

    Database performance depends on the existing workload, users’ behavior (reflected directly in the applications’ behavior), hardware resources’ performance, and design assumptions/implementation. Many database performance problems can be traced back to design, and without code redesign the performance will stay poor no matter how many attempts are made to optimize the backend.

    A baseline for the SQL Server configuration, the system configuration, and a change log for both are often needed as well, especially to track the changes that occur in the environment.

    I mentioned the design assumptions above. Sizing assumptions, such as the number of expected users, transactions, data volume, and the number of jobs or background tasks running, may not need to be reviewed regularly, though they do need to be considered when assessing a system’s performance. It may happen that the backend was sized based on different expectations. Of course, this doesn’t help in tuning, though it provides an explanation for the current state of affairs.

    Some performance problems can be traced back to indexes. There’s no point in stressing the importance of having adequate indexes. On the other side, it’s also important to consider the indexes that are not used at all, indexes that create overhead for every insert or update statement.
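    One way to sketch such a check, assuming SQL Server 2005+ and its sys.dm_db_index_usage_stats DMV (which resets on service restart, so judge over a full business cycle):

    ```sql
    -- Nonclustered indexes in the current database that are written to
    -- but never read since the last restart.
    SELECT OBJECT_NAME(s.object_id) AS table_name,
           i.name AS index_name,
           s.user_updates,
           s.user_seeks + s.user_scans + s.user_lookups AS user_reads
    FROM sys.dm_db_index_usage_stats AS s
    JOIN sys.indexes AS i
      ON i.object_id = s.object_id AND i.index_id = s.index_id
    WHERE s.database_id = DB_ID()
      AND i.index_id > 1   -- skip heaps and clustered indexes
      AND s.user_seeks + s.user_scans + s.user_lookups = 0
    ORDER BY s.user_updates DESC;
    ```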

    In theory, one needs to tune the environment first and then go deeper into individual tunings. Some of these tasks are covered by the periodic server administration tasks.

    Before going too deep into troubleshooting and analysis, it is important to check basic things first, like actual resource utilization and availability. More or less random events, like an unscheduled backup, a job running wild, the use of special functionality, a disk running full, or an error that triggered a memory dump, can easily be found with simple routine checks.

    Last but not least, a (regularly performed) server/service restart might prove the “cheapest” solution. It’s not a solution per se, though it might be the shortest path to regaining a running system. Some use it extensively instead of performing any adequate troubleshooting.
    When performing periodic restarts, it may happen that some latent issues (e.g. cache bloating, tempdb-related issues) are hidden by this practice.

    1. Shiyang Qiu says:

      Great reply, Adrian! Thank you for sharing your experience; I agree with your viewpoint. I will provide my checklist for application data access patterns, how to use MDW to build a baseline, and how to check common performance bottlenecks when query performance degrades.

      For the “restart” practice, most of the time I restart the application regularly to address potential resource leaks, but I seldom restart SQL Server, for two reasons: a) the buffer pool gets emptied, and b) it brings downtime. For failover clustered instances, I suggest some customers fail over SQL Server to another node regularly, but the main purpose is to verify that the secondary node can work normally, rather than to reset SQL Server.
