Columnstore Index Performance: SQL Server 2016 – Aggregate Pushdown


SQL product team has made significant improvements in columnstore index functionality, supportability and performance during SQL Server 2016 based on the feedback from customers. This blog series focuses on the performance improvements done as part of SQL Server 2016. Please refer to previous blog No performance cliff  in this series for details.

Aggregates are a very common construct in analytics queries. For example, you may want to aggregate sales per quarter for each products you sell. With columnstore index, SQL Server processes aggregate in BatchMode thereby delivering order of magnitude better performance when compared to rowstore.  SQL Server 2016 takes the aggregate performance to the next level by pushing aggregate computations to the SCAN node. Yes, this improvement is on top of BatchMode execution.

The picture below shows and aggregate query that processes 10 million rows and computes a single aggregate SUM of Quantity sold from the SALES table 
SELECT SUM (Quantity)
FROM  SALES

SQL Server 2014 scans these 10 million rows in batches (e.g. 900 rows) and sends these batches to Aggregate operator to compute the aggregate.  The picture below shows 10 million rows moving from SCAN node to the Aggregate node.

aggregate-1

In SQL Server 2016, the aggregate operator itself is pushed to the SCAN node (i.e. closer to the source of the data) and the aggregate is computed there for compressed rowgroups. The picture below shows that 0 rows moved from SCAN node to the AGGREGATE node. This is because the aggregate was computed at the SCAN node. The dotted line shows the computed aggregate, 1 row in this case, was sent internally to the output of AGGREGATE node. Couple of  important points to note

  • The structure of Query plan structure is identical between SQL Server 2014 and SQL Server 2016 and the main difference is how the rows are processed. For this reason, Aggregate pushdown optimization is available across all database compatibility levels.
  • Aggregate push down is only done for the rows in compressed rowgroup. The rows in delta store will flow from SCAN node to the AGGREGATE node like before

aggregate-2

Let us now look at a concrete example that contrasts the aggregate processing between SQL Server 2014 and SQL Server 2016.

SQL Server 2014

The picture below shows a aggregate query and its execution plan. Note that all 11 million row flow from SCAN node to the AGGREGATE node.

The query execution time for this query was as follows

 SQL Server Execution Times:   CPU time = 547 ms, elapsed time = 389 ms.

aggregate-3

SQL Server 2016

The same query was run ‘as is’ on SQL Server 2016  and as the picture below shows the aggregate computation was indeed done at the SCAN node. The query execution time for this case was 2x lower than what we saw with SQL Server 2016. A more interesting number is the CPU time which was 3x lower than SQL Server 2014. In actual production workloads, we have seen much more dramatic performance gains.

SQL Server Execution Eimes   CPU time = 171 ms, elapsed time = 167 Ms.

aggregate-4

Other interesting thing to notice is that the SCAN node has a property to show the number of rows that were aggregated locally. As expected for the example above, as shown in the picture below, all the rows were aggregated locally

aggregate-41

To show that Aggregate pushdown optimization is not a available for rows in delta rowgroup, I copied 50k rows into new table temp_cci with clustered columnstore index and then I ran the same aggregate query as shown in the picture below. Note, that all 50k rows are flowing from SCAN node to AGGREGATE node 

aggregate-5

In summary, Aggregate Pushdown will give you the performance boost automatically when you upgrade your application to SQL Server 2016 requiring no changes to your query. there are some restrictions as described below

  • Supported aggregate operators are MIN, MAX, SUM, COUNT, AVG
  • Any datatype <= 64 bits is supported. For example, bigint is supported as its size is 8 bytes but decimal (38,6) is not because its size is 17 bytes. Also, no string types are supported
  • Aggregate operator must be on top of SCAN node or SCAN node with group by

 

Thanks,

Sunil Agarwal
SQL Server Tiger Team
Twitter | LinkedIn
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam

Comments (4)

  1. Hi Sunil

    Nice Article

    The Actual Number of rows (0) versus the Estimated Number of Rows (11669600) during the SCAN mode might lead to a wrong interpretation about the inadequate table and index statistics. Hopefully you’ve foreseen the Actual Number of Locally Aggregated Rows (116699600). But I would have preferred to see this information just above the Estimated Number of Rows

    Best Regards

    1. Good point. At first, it indeed looks confusing. we have documented this and hopefully expert DBAs like yourself, who may need to look at this level of details, will know how to interpret

  2. lev says:

    I’m running a similar query on SQL server 2016 (13.0.4411) with db compatibility level 130. I do not see the aggregate push down the behavior is same as 2014. Am I missing something? Are there any known issues?

    1. Please send me an actual execution plan. Aggregate push down as some conditions that need to be met. For example, the data type needs to fit in 8 bytes etc

Skip to main content