Importance of choosing correct bucket count of hash indexes on a memory optimized table

 

I was working with a customer to troubleshoot memory optimized table issues.  In this scenario, our customer uses a memory optimized table variable.  He put 1 million rows of data into the table variable and then process it.  Based on what he said, I tried to come up with a repro to see if I can duplicate the problem.   While troubleshooting that issue, I ran into another issue where I can’t even insert 1 million row into a memory optimized table variable.

Here is what I did

I grabbed the example from https://msdn.microsoft.com/en-us/library/dn535766.aspx to create the type

CREATE TYPE [Sales].[SalesOrderDetailType_inmem] AS TABLE(
  [OrderQty] [smallint] NOT NULL,
  [ProductID] [int] NOT NULL,
  [SpecialOfferID] [int] NOT NULL,
  [LocalID] [int] NOT NULL,

  INDEX [IX_ProductID] HASH ([ProductID]) WITH ( BUCKET_COUNT = 8),
  INDEX [IX_SpecialOfferID] NONCLUSTERED (LocalID)
)
WITH ( MEMORY_OPTIMIZED = ON )

 

Then I created another disk based table with exact same columns and populate it with 1 million rows

CREATE table t2(
  [OrderQty] [smallint] NOT NULL,
  [ProductID] [int] NOT NULL,
  [SpecialOfferID] [int] NOT NULL,
  [LocalID] [int] NOT NULL
  )

 

To my surprise, the following query spent 17 minutes without even finishing with t2 having just 1 million rows.

DECLARE @t2 AS [SalesOrderDetailType_inmem]
insert into @t2 select * from t2

Then I started to look at the type definition and discovered that is BUCKET_COUNT was just 8 while productid was fairly unique in t2.  After recreating the type with bucket_count being 1 million, the above insert query finished in 5 seconds.

CREATE TYPE [SalesOrderDetailType_inmem] AS TABLE(
  [OrderQty] [smallint] NOT NULL,
  [ProductID] [int] NOT NULL,
  [SpecialOfferID] [int] NOT NULL,
  [LocalID] [int] NOT NULL,

-- INDEX [IX_ProductID] HASH ([ProductID]) WITH ( BUCKET_COUNT = 8),
  INDEX [IX_ProductID] HASH ([ProductID]) WITH ( BUCKET_COUNT = 1000000),
  INDEX [IX_SpecialOfferID] NONCLUSTERED (LocalID)
)
WITH ( MEMORY_OPTIMIZED = ON )

 

So it’s very critical you follow https://msdn.microsoft.com/en-us/library/dn494956(v=sql.120).aspx to determine bucket count of hash indexes.

If you are using SQL Server 2016, you can issue alter table (https://msdn.microsoft.com/en-us/library/dn269114.aspx) to change bucket_count if you experience performance issues related to bucket count.

You may also ask why the example given for memory optimized table variables use bucket_count of 8.   In general table variables are not meant to contain large number of rows.  In fact, if you large number of rows and start to join with other tables, you may experience performance issues due to the fact that table variable doesn’t provide good cardinality estimate at compile time.

 

Jack Li |Senior Escalation Engineer | Microsoft SQL Server

twitter| pssdiag |Sql Nexus