The configuration option “Max degree of parallelism” (MaxDOP) defines how many processors SQL Server can use in parallel to execute a single SQL query. Setting this option to a value other than 1 may decrease the response time of a single, long running SQL command. However, this will also decrease the overall throughput and concurrency of SQL Server as the parallel executions are consuming more system resources as the serial executions. Running with MaxDOP = 1 can decrease the runtime (compared to the parallel execution) but will increase the concurrency (more users can work in parallel) and the throughput. For a long time SAP recommends to set MaxDOP to 1 to disable intra query parallelism completely, with the background that the common ERP query is small and does not return many rows. Another reason to force a MaxDOP of 1 was to provide predictable performance of queries as those get executed thousands of times throughout the day under different load conditions of the underlying server infrastructure.
In order to give a bit more background, let’s briefly touch on how SQL Server’s query parallelism works. Before a potential parallel query is executed, SQL Server checks the available resources. The number of processor resources available, available SQL Server worker threads and memory available to determine the number of processors used to service the query. Based on those checks, a query might not be executed using all CPU threads available, instead using only a few CPU threads (or even just one CPU thread).
When executing a query in parallel, multiple streams of data are sorted in parallel and merged afterwards. This leads to higher buffer usage in cache and in addition, the CPU resource consumption by a single query usually increases with the degree of parallelism.
It is important to note that the MaxDOP set either in the global configuration or with query hints as done by SAP BW, can applied to the different execution branches (parallel operators) of a query. That can result in a way higher number of SQL Server worker threads engaged in the execution of a query than MaxDOP is set to. Easy to imagine, especially in more complex queries joining several tables. We might end up with a plan that decides to scan all tables involved in a parallel manner. In such a case the MaxDOP setting is applied to each scan. Means we can have number of tables multiplied by MaxDOP worker threads allocated for the scan alone. We then can have MaxDOP applied for every single join. And so it can go on. At the end complex queries can allocate way more worker threads than the MaxDOP setting.
Although an isolated query executed in parallel generally will execute much faster, there is a point at which the parallel query execution becomes inefficient and can even extend the execution time. For example, parallel queries performing small joins and aggregations on small data sets can become inefficient when they run on all 64 CPU threads of a server. Due to different degrees of parallelism chosen at execution time, response times for one query can be different depending on resource availability. The most severe effect is the varying response times experienced by end users. Like administrators, end users want predictability in a system. They want predictable performance in the most important areas. Allowing SQL Server to execute queries in parallel can compromise this predictability. Different run times for the same query due to different run time decisions on parallel execution may occur.
Things to consider when changing MaxDOP to a value other than 1
- SQL Server has to create multiple plans for each statement, one for the parallel and one for the serial execution. So the compilation time will be higher for all statements and the system needs more memory for storing SQL plans.
The runtime of a statement can vary, depending on the type of execution (parallel or serial). The type of execution is determined through the SQL Server optimizer at runtime, depending on the load on the system and availability of system resources. A query that can run in parallel in the morning (as system load is low) might run in serial in the afternoon with a different runtime. This is especially true for small virtual or physical machines with limited system resources.
With the advent of increasing numbers of cores and therefore CPUs per physical socket (see this blog for details ) and with the enhanced potential of SQL Server 2012 to be able to handle parallel queries more gracefully, the recommendation to only use a MaxDOP setting of 1 has been relaxed for this and later releases of the product. The new recommendation from SAP in regards of the MaxDOP setting is as follows:
- It applies only for SQL Server 2012 or higher
- For SAP BW System there is no change, means the old recommendation of 1 is still valid. See note 1654613 for details.
- If the server has less than 16 CPUs the old recommendation of MaxDOP = 1 is still valid
- If the system has 16 to 63 CPUs you might set it to 1 or 2
- If there are 64 or more CPUs available, valid settings are 1, 2, 3, or 4
- A value of zero (0) is not allowed at any time (this setting means the SQL Server can decide how many threads it is using for the query (from 1 up to #CPUs, but not necessarily all CPUs))
The SAP alerting functionality will leverage these new MaxDOP thresholds as of the basis support packages listed in SAP Note 2133120.
The SAP Early Watch Alert will change accordingly to this new recommendation. This flow diagram illustrates the new colored warning levels for the MaxDOP setting in the early watch report. Click on the diagram to open it in a bigger resolution. You have to read the diagram from left to right.
To change the MaxDOP setting you can execute the following SQL script via the SQL Server Management Studio:
exec sp_configure ‘Max degree of parallelism’, <new value, e.g. 2>
reconfigure with override
The change will be active instantly without the necessity to restart SAP or SQL Server.
As you can see, we are careful with our recommendations in setting MaxDOP for SAP systems. Different DBMS systems are following different implementations of query parallelism. Some DBMS systems, especially systems that keep data in-memory resident, are claiming that they want to provide all CPU resources to maximize performance for a single query and have all other queries waiting. Argument for such a strategy is that queries are executed so fast with using all CPU resources that even under concurrent situations the wait time of the other queries is at the end lower than trying to split CPU resources to execute different queries. An approach that may work under certain circumstances for very specific OLAP type of workload (BW), but badly breaks in OLTP type of workload (ERP, CRM etc.). In SQL Server the idea is to maximize for concurrency. Something which definitely benefits OLTP or hybrid workloads. With the settings as we suggested it above, we think that we should have a very good compromise, especially for SAP ERP workloads. On the one side we should not overload CPU resources by engaging multiple worker threads for a single query. On the other side, the queries executed in parallel are of little complexity and should be executed in an efficient way.
Eager to hear your feedback after changing the settings.