Clustered Columnstore Index: Enforcing uniqueness constraint in table

SQL Server introduced ‘updateable’ clustered columnstore index (CCI) starting with SQL Server 2014 to speed up analytics by orders of magnitude up to 100x  while significantly reducing storage, typically 10x.  Your mileage will vary. Microsoft SQL Server team strongly recommends using CCI for your Data Warehouse for larger tables (> 1 million rows). A typical recommended deployment for DW is to use CCI for the Fact table and traditional rowstore for Dimension tables. However, you should consider using CCI for dimension tables especially now that SQL Server 2016 supports creating one or more traditional btree indexes for efficient equality and short-range searches.

One interesting thing to know about Clustered Columnstore Index (CCI) is that it has no ‘key’ columns and the rows are NOT ordered. From this perspective, the word ‘clustered’ may be a bit confusing but the intent is to designate CCI as the ‘master’ copy of the data. If it helps, you can think of CCI as a ‘heap’ that stores data in ‘columnar storage format.

One of challenges with CCI in SQL Server 2014 is that there is no direct way to enforce uniqueness constraint. You can enforce uniqueness in a round-about way using materialized view as shown in the example here

create table t_account (
       accountkey                 int not null,
       accountdescription         nvarchar (50),
       accounttype                nvarchar(50),
       AccountCodeAlternatekey    int)
-- create CCI on it
CREATE CLUSTERED COLUMNSTORE index ACCOUNT_CI on t_account
drop view dbo.myview_t_account
go


create view dbo.myview_t_account with schemabinding
as select accountkey,accounttype, accountcodealternatekey
from dbo.t_account
go


-- create a materialized view
create unique clustered index clix_myview_t_account on myview_t_account (accountkey)
insert into t_account values (1, 'hello', 'hello', 1)
-- now insert a row with duplicate key which will fail due to uniqueness violation
insert into t_account values (1, 'hello', 'hello', 2)
-- Msg 2601, Level 14, State 1, Line 36
-- Cannot insert duplicate key row in object 'dbo.myview_t_account'
-- with unique index 'clix_my_t_account'. The duplicate key value is (1).
-- The statement has been terminated.

Starting with SQL Server 2016, you can enforce uniqueness on CCI directly using a traditional btree index as shown in the example here

-- create the table. Unique constraint defaults to NCI


create table t_account (
       accountkey                 int not null,
       accountdescription   nvarchar (50),
       accounttype                nvarchar(50),
       unitsold                 int,
   CONSTRAINT uniq_account UNIQUE (AccountKey)
)

— when you create CCI, it inherits existing NCIs. In our case, the NCI used for uniqueness

CREATE CLUSTERED COLUMNSTORE INDEX t_account_cci ON t_account

--Test the unique key:
insert into t_account values (1,1,1,1)
insert into t_account values (1,2,2,2)


Msg 2627, Level 14, State 1, Line 22
Violation of UNIQUE KEY constraint 'uniq_account'. Cannot insert duplicate key in object 'dbo.t_account'. The duplicate key value is (1).

The statement has been terminated.

 You can see this is much simpler and intuitive. In my next blog, I will describe how to enforce PK/FK constraint on CCI

Thanks

Sunil