Columnstore Index: Scan and parallelism

When you query a table with clustered columnstore index, it is run in parallel unless you force it to run serially using MAXDOP or if SQL Server determines at run-time that there are not enough resources to run this query in parallel.

This is all well and good but there is one interesting aspect. The delta rowgroup is scanned using a single thread. However, if you have multiple delta rowgroups, they are scanned in parallel with one thread per delta rowgroup.

Let us look at an example ccc-scan-1

Note, I only inserted 1 million rows and not (1024*1024 = 1048576), so the inserted rows will stay in the delta rowgroup. Now, let me run the following query

select avg (convert (bigint, unitsold)) from t_colstor

If you look at the actual execution plan below, it shows that it used 4 threads (note, the machine I am running it on has 4 cores) cci-scan2

But if you look at the XML plan, it shows that only one thread processed the 1000000 rows

cci-scan3

Now, let me compress the rows in delta rowgroup using the following command
-- run this to compressed OPEN RGs alter index t_colstor_NCCI on t_colstor reorganize with (COMPRESS_ALL_ROW_GROUPS = ON) cci-scan4

Now, I run the same query as before (use TF 8649 to force parallel execution if needed), and here is what I see for compressed rowgroup. Note, a single compressed rowgroup can be scanned by multiple concurrent threads.
cci-scan5

Do you need to be concerned about that a delta rowgroup is scanned single threaded? The answer is NO for two reasons (a) most columnstore indexes have very few delta rowgroups (b) if you have multiple delta rowgroups, they can be scanned in parallel with one thread per delta rowgroup.

You should see the same behavior with CCI (clustered columnstore index) as well.

Thanks

Sunil Agarwal