SQL execution plans, part 2: SQL compilation

Today we want to have a closer look how an execution plan for a given SQL statement is created within SQL Server and how this affects an SAP system. The idea of this discussion is not to highlight in detail the phases of the plan creation. The intention is to give an SAP administrator or ABAP/JAVA programmer an idea, which factors have an impact on the execution plan.

The component of SQL Server which creates the execution plan is called (query) optimizer. The procedure of creating an execution plan is called compilation. Once a new plan is re-created for an existing query it is called a re-compilation. The most important decision which has to be made by the optimizer is to decide which database indexes are used and how they are used. SQL Server may perform one or several index seeks. In some cases an index range scan on the index may be even faster. For JOINs the optimizer has to decide the join method (nested-loop, merge-join, hash-join) and the join order.

A compilation can be a very time consuming operation. That’s the reason why SAP uses stored procedures or parameterized statements, which allow SQL Server to re-use an existing execution plan. However, a new execution plan has to be created once no valid execution plan exists:

· After restarting SQL Server.

· After clearing the procedure cache using DBCC FREEPROCCACHE. This command can also be executed in the SAP transaction DBACOCKPIT.

· Once an execution plan is flushed out of the SQL Servers statement cache due to memory pressure. This can happen at any time.

· For SAP applications running 4.6D kernel or older kernel releases, once the name of a global temporary stored procedure is flushed out of SAP’s stored procedure (SP) name cache. This can happen at any time. As a result SAP creates a new temporary SP which a different name. SQL Server will then create a new execution plan for the new SP.

· For SAP applications running 4.6D kernel or older kernel releases, once the connection is closed, which originally created a global temporary SP. When SAP restarts a work process, all its database connections are certainly closed. As a result the dependent SPs will be dropped. SAP has to re-create these SPs and SQL Server has to re-create their execution plans.

An existing execution plan may become invalid:

· After updating the index (or column) statistics, which were relevant for the query optimizer during plan creation. It makes no difference here whether this is an automatic or manual update statistics.

· Once the structure of the table changes. This includes creating or dropping an index.

· Once the SP executing a query or a table accessed by the query is explicitly marked for recompilation. This can be done using the system stored procedure sp_recompile. It is not recommended to mark a table for recompilation, because a “sp_recompile <TABLE NAME>” holds and requests database locks. We have seen blocking situations for hours at SAP systems caused by this.

· For SAP applications running 6.40 kernel or older kernel releases, If the SAP profile parameter “dbs/mss/max_duration” was set. Whenever the average runtime of a query exceeded the specified threshold, SAP executed a “sp_recompile <PROCEDURE NAME>” for the stored procedure taking too long. This parameter caused rather trouble than solving issues. Therefore it was removed from the SAP kernel.

Generally speaking, the decisions made by the query optimizer are based on the estimation of the number of rows selected and cost calculations (CPU/memory/IO needs) for the potential execution plans. There are several factors which have an influence on the chosen execution plan. Let’s have a closer look on them:

· Existing indexes
This is the most obvious factor. Without a proper index you cannot expect a fast query execution. In the other hand, having too many indexes is much more often an issue on an SAP system. This may result in unstable (changing) execution plans. Inserts and deletes become more expensive the more indexes you have. Defining proper indexes is therefore often a compromise. You have to find a balance between speeding up a single query and minimizing the impact on the other queries. Typically you therefore create a few combined indexes to tune your most important queries running in dialog. Queries running in a batch job are often not expected to have optimal performance.

· Clustered index property
SAP (almost) always creates the primary key as a clustered index on SQL Server. You can have at most one clustered index per table since the data is physically sorted by the clustered index. Having a clustered primary key has several advantages and reduces the cost for using the primary key. All other indexes are non-clustered and called secondary indexes. A secondary index stores the clustered index key (which typically does not change) rather than a physical pointer to the data row. This results in better update performance. On the other hand you have to traverse two index trees when using a secondary index: First of all the index tree of the non-clustered index to receive the clustered index key and then the clustered index to access the data row. In a nutshell, having a primary clustered key in SAP reduces the cost of using the primary key and increases the cost of using a secondary index. In extreme cases a secondary index will not be chosen by the optimizer due to higher costs of accessing the secondary index.

· Index statistics
The index statistics of SQL Server basically consists of the density information of the combined n index columns and a histogram of the first index column. A density is the reciprocal value of the number of distinct values. If a combined index consist of three columns A, B and C then the index statistics contains the density of (A), the combination of (A,B) and the combination of (A,B,C). The histogram is only available for column A. It contains a sample of up to 200 values and the number of rows which have this value. If required, SQL Server automatically creates additional column statistics including a histogram for other columns, too. However, the database option “auto create statistics” is to be set for it, as recommended by SAP.
Having not up-to-date index statistics is the most common suspect when seeing bad query performance. However, on an SAP system the index statistics are typically innocent. SAP strongly recommends to turn on the database options auto update/create statistics. This results in automatically updated statistics and works pretty well on an SAP system. There are only a few scenarios where SAP recommends performing a manual update statistics in addition. One example is a client import of a relatively small client (column “MANDT” means “Mandant” in German) into a huge SAP database. The same may be necessary when creating a new company code (column “BUKRS” means “Buchungskreis” in German) on a system which only has a few distinct customer codes.
Updating the statistics only makes sense when the selectivity of the updated columns changes significantly. For example, the document number (column “BELNR” means “Belegnummer” in German) is almost always very selective. Even when updating a one year old index statistics of this column you will not see any significant difference in the selectivity. Therefore updating the statistics of this column is not relevant for optimizer decisions.

· Parameters used for compilation
The SQL Server query optimizer estimates the number of selected rows based on the parameters passed at compilation time. This is very useful in many scenarios. A typical example is a delete flag (column “LOEKZ” means “Loeschkennzeichen” in German). There are many tables in ABAP which have such a column, containing either the value “ “ (for delete flag not set) and “X” (for delete flag set). There is often an index containing “LOEKZ” in order to speed-up selecting the rows, with have the delete flag set. Just having a look on the density 0.5 of this field, this index looks useless. Once you take the parameter value “X” into account and know from the histogram that only 0.01% of the rows have the value “X” (and 99.99% have the value “ “), the index becomes very attractive. This SQL Server feature of creating execution plans dependent on the parameters passed at compilation time is called parameter sniffing.
Although parameter sniffing helps in many cases, it may result in unexpected execution plans in some scenarios. Running a query the first time with untypical parameters results in an execution plan which may not fit for the subsequent executions of the same query. Dependent on the parameters passed by chance at compilation time, totally different execution plans may be used.

· Other factors
It is self-evident that an optimizer hint or a plan guide has a huge impact on the execution plan. But also SQL Server configuration and connection settings are important. SAP recommends to turn off parallel query execution by setting the configuration option “max degree of parallelism” to 1. Setting this to a different value may result in different (parallel) execution plans.

All of the above factors influence the execution plan. For an SAP system you will rather see an issue with parameter sniffing than with outdated index statistics. However, customers often think that they have solved a performance issue by updating the statistics. They are not aware of the side effect of the statistics update. The optimizer will create a new execution plan due to the new statistics. This time the parameters used for compilation are probably different from the previous compilation. It’s just like rolling the dice. Probably a DBBCC FREEPROCCACHE would also have (temporarily) solved the issue.

You may probably ask yourself what to do when running into a performance issue. It is the same as with any other issue. You first have to analyze it before taking action. With SQL Server 2005 you can easily figure both, the execution plan and the parameters used for compilation. Once you detect instable execution plans due to varying parameters you should force an execution plan, for example by adding an optimizer hint. You can add all possible SQL Server optimizer hints into SAP ABAP code. You will even find SAP Standard coding which will force certain indexes for different databases by using ABAP Query hints (please see OSS notes #129385, 133381)