Clustered Columnstore Indexes (CCI) is a game changer for data warehouse workload, which is available from SQL Server 2014 Enterprise Edition. I worked for a PoC project last year to adopt Clustered Columnstore Indexes for data warehouse. The PoC method is simple but the improvement is very significant. Below is some detailed.
The Pain and the Result
Customer run a 353GB data warehouse on SQL 2008 R2, max memory is 20GB. The ELT tool is SAS, SSAS cube is built on top of data warehouse. The SSAS model is changed frequently and full process is ran every day due to their dynamical business nature. As the physical memory size is small, there is heavy IO loading to storage during data process and customer plan to increase physical memory to 200GB.
By changing the top 30 tables to CCI tables, the database size reduces from 353GB to 75GB, the largest table size reduces from 46GB to 4.5GB! The duration of daily ELT reduces from 35 seconds to 121 seconds when the SQL max memory is 20GB, reduces from 292 seconds to 49 seconds when the SQL max memory is 200GB. The duration of cube process reduces from 35 min to 15 min when the SQL max memory is 20GB, reduces from 21 min to 14 min when the SQL max memory is 200GB. So we are happy to see the huge improvement after adopting CCI without new hardware expense.
Step 1: Find out top tables
In this PoC there is no unsupported data type, no unique key, primary key or foreign key in every fact table as it's a well design data warehouse, I use below SQL find out top 30 tables in terms of size:
use [db_name] go select top 30 i.object_id, t.name as table_name, s.name as [schema_name], p.rows as rowc_ounts, sum(a.total_pages) / 128.0 as total_space_mb, sum(a.used_pages) / 128.0 as used_space_mb, (sum(a.total_pages) - sum(a.used_pages)) / 128.0 as unused_space_mb into #t_top_tables from sys.tables t inner join sys.indexes i on t.object_id = i.object_id inner join sys.partitions p on i.object_id = p.object_id and i.index_id = p.index_id inner join sys.allocation_units a on p.partition_id = a.container_id left outer join sys.schemas s on t.schema_id = s.schema_id where t.name not like 'dt%' and t.is_ms_shipped = 0 and i.object_id > 255 group by i.object_id,t.name, s.name, p.rows order by total_space_mb desc select * from #t_top_tables
Step 2: Generate T-SQL to drop all existing indexes and create Clustered Columnstore Indexes(CCI)
Below T-SQL is ran in same session in generate drop all existing indexes and create CCI statement:
declare @schema sysname, @table sysname, @index sysname, @sql nvarchar(4000) declare cur cursor for select o.schema_name, o.table_name, i.name as index_name from sys.indexes i inner join #t_top_tables o on i.object_id=o.object_id where (i.index_id>0 and i.index_id<255) and i.name not like '_WA_Sys_%' open cur fetch next from cur into @schema,@table,@index while @@fetch_status=0 begin set @sql='drop index ['+@schema+'].['+@table+'].['+@index +']' select @sql fetch next from cur into @schema,@table,@index end close cur deallocate cur select 'CREATE CLUSTERED COLUMNSTORE INDEX cl_'+table_name+' ON '+schema_name+'.'+table_name from #t_top_tables drop table #t_top_tables
Step 3: Run the generated script
Before you run the generated script in production, I recommend you to test in testing environment to measure the duration. Another suggestion is to turn database recovery model to simple to avoid growth of t-log.
SAS ODBC Error
Customer hit below error when they testing the SAS code:
ERROR: CLI describe error: [Microsoft][ODBC SQL Server Driver][SQL Server]Cursors are not supported on a table which has a clustered columnstore index. : [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared.
The statement is a simple select, but SAS translates it to a cursor. With SAS tech support, we finally find out this is related to a SAS optimization configuration: Block read buffer size.
Below is how to set the READBUFF=0 in SAS Management Console:
1. Right click the Library in SMC -> Properties -> Options -> Advanced Options – > Optimization -> Block read buffer size
2. set this property to 0.
Posted by Shiyang, Aug 12 2016