SQL product team has made significant improvements in columnstore index functionality, supportability and performance during SQL Server 2016 based on the feedback from customers. This blog series focuses on the performance improvements done as part of SQL Server 2016. Customers will get these benefits automatically with no changes to the application when they upgrade the application to SQL Server 2016.
The examples in this blog series are based on AdventureWorksDW2016CTP3 database that you can download from here. For each example, I will run the query in SQL Server 2014 and then contrast that with SQL Server 2016 and to reinforce the point that you get performance improvements without requiring any changes to your query or the workload.
SQL Server 2014
In SQL Server 2014, BATCH mode Batch Mode execution mode was only supported when the query was executed with DOP > 1. The basic premise was that customer running big analytics query will run it on multi-core machines with high degree of parallelism. While this is all good, but imagine you are running your analytics workload on a Server with 16 cores and everything is running smoothly. Let us say the concurrent workload spikes, SQL Server can choose to decrease the DOP automatically. In the extreme case, it is possible that the queries get executed with DOP=1. As you can expect, running query single threaded will increase the query execution time proportionally. However, the impact on queries using columnstore index is a lot more severe because SQL Server 2014 will also switch to Row Mode execution. Its like double jeopardy – your query slowed down both because of single threaded execution as well as Row Mode execution. This is what I refer to as Performance Cliff, a sudden significant drop in the query performance.
In the example below, we run a simple aggregate query on a SQL Server where MAXDOP has been configured to 0 which allows SQL Server to choose available DOP. As you notice, that this query was run in BatchMode with 4 threads. The query execution time was
CPU time = 626 ms, elapsed time = 354 ms.
Now, let me run the same query but specify DOP=1 explicitly to force single threaded execution. As shown the picture below, the query runs single threaded in RowMode. For this case, the query execution time increased by 8 times; from 354ms to 2617ms.
SQL Server Execution Times:
CPU time = 2390 ms, elapsed time = 2617 ms.
SQL Server 2016
Now, let us run the same query ‘as is’ in SQL Server 2016 on the same machine. The picture below show the query run single threaded by explicitly forcing DOP=1. Note, the query was executed in BatchMode as shown so no more Performance Cliff!!
The query execution time on SQL Server 2016 was much much faster with single threaded execution than what we saw on SQL Server 2014 even with multi-threaded execution. The key reason for this was another performance optimization, aggregate pushdown , in SQL Server 2016.
SQL Server Execution Times:
CPU time = 110 ms, elapsed time = 315 ms.
As you can see that with SQL Server 2016, there is no more performance cliff and your queries will continue to leverage BatchMode execution irrespective of degree of parallelism. You will need DBCompat 130 (i.e. SQL Server 2016 default compatibility level) https://msdn.microsoft.com/en-us/library/bb510680.aspx for this optimization.
SQL Server Tiger Team