Columnstore Index: Why do I need to create clustered columnstore Index on In-Memory OLTP table for Analytics?

The columnstore index on In-Memory table(s) targets real-time operational analytics (HTAP) allowing customers to eliminate the need for ETL (Extract, Transform and Load) into a separate Data Warehouse.  For details on the scenario, please refer to https://blogs.msdn.microsoft.com/sqlserverstorageengine/2016/03/18/real-time-operational-analytics-a-short-video-on-the-value-proposition-in-sql-server-2016/

 You can enable HTAP on disk-based tables by creating NCCI on one or more tables as shown in the command below CREATE NONCLUSTERED COLUMNSTORE INDEX <index-name> ON <table-name> (<column-list>)

However, if you want to do the same on in-memory OLTP tables, the syntax requires you to create clustered columnstore index as shown below ALTER table <in-memory table> add index <index-name> clustered columnstore

This has been very confusing to customers for two reasons (a) clustered index implies that it is the primary copy of the table but in reality the primary copy of the table is memory-optimized rowstore table (b) The rows in memory-optimized tables are not organized in any clustering key order. Only nonclustered indexes are supported.

Well, the reason for calling it a clustered columnstore index is legacy of some design we were looking into to manage hot/cold data. For now, think of this clustered columnstore index as an NCCI that requires all the columns in the table. This simplifies the narrative that for HTAP scenarios.  We understand that this is limitation because for many analytics workload, you may not require all the columns from the rowstore table. In future, we will look into supporting a NCCI on In-Memory tables.

Thanks,

Sunil Agarwal

SQL Server Tiger Team Twitter | LinkedIn

Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam