The Joys of Filtered Indexes

A filtered index, introduced in SQL Server 2008, is a non-clustered index that contains only a subset of the number of rows contained in a table.  Because there is a 1:1 ratio between the number of rows in a table and the number of rows in a regular non-clustered index, a filtered index can have performance and storage advantages.  These indexes are typically much smaller than the full-table non clustered indexes.  The statistics on these filtered indexes are filtered as well, which typically results in them being more accurate. 

Filtered indexes are well-suited for scenarios where you frequently query a well-defined subset of data.  This often includes querying ranges of specific values such as dates or monetary ranges, along with scenarios where you frequently have to filter out NULL values – such as Sparse columns or minimally updated columns.  Another scenario where filtered indexes work really nicely is when you're dealing with heterogeneous data, such as frequent queries on highly correlated columns in a table for certain subsets of data.   In this case, because of the frequency that you deal with the relationship between columns for a given data set, it may make sense to create a specific filtered index to meet your query needs rather than using a non-clustered index that contains the same number of rows as the base table.

Because the filtered index only needs to be maintained when there are rows inserted/updated/deleted that meet the index and filter definition, the overhead of maintaining the filtered index can be significantly less.  There are also many cases where the Query Optimizer is more likely to choose a lower-cost plan based on the presence of the filtered index.

Let's take a look at a filtered index in action.  First, I'll create a new table named Sales.NewSales, which is a copy of the AdventureWorks table Sales.SalesOrderDetail.

USE AdventureWorks

GO

IF OBJECT_ID('Sales.NewSales') IS NOT NULL
DROP TABLE Sales.NewSales
GO

SELECT *
INTO Sales.NewSales
FROM Sales.SalesOrderDetail
GO

Create a new column in the Sales.NewsSales table named NewUnitPrice.

ALTER TABLE Sales.NewSales
ADD NewUnitPrice MONEY NULL
GO

Update every 10th record in Sales.NewSales to have a non-NULL value.

UPDATE Sales.NewSales
SET NewUnitPrice =
CASE
WHEN SalesOrderDetailID % 10 = 1
THEN UnitPrice
ELSE NULL
END

Create a filtered index on the NewUnitPrice column to include any value that is NOT NULL.

CREATE NONCLUSTERED INDEX NewSalesIDX
ON Sales.NewSales(NewUnitPrice)
WHERE(NewUnitPrice IS NOT NULL)

Create an index on the UnitPrice field for comparison purposes.

CREATE NONCLUSTERED INDEX UnitPriceIDX
ON Sales.NewSales(UnitPrice)

Now, let's compare the number of pages between the two indexes.

SELECT
i.name, d.index_type_desc,
d.index_depth, d.avg_fragmentation_in_percent,
d.page_count,
IndexSizeMB = (page_count * 8192)/1024.0/1024.0
FROM sys.dm_db_index_physical_stats
( DB_ID(),
OBJECT_ID('Sales.NewSales'),
NULL,NULL, NULL
) d
JOIN sys.indexes i ON
d.index_id = i.index_id AND
d.object_id = i.object_id
WHERE
i.index_id > 1

From the screenshot below, you can see that the filtered index has many fewer pages – which also means that it takes less disk space.

So, how does performance compare between the two indexes?  Running the following query you can see that the filtered index out performs the regular non-clustered index by a significant margin.

SELECT NewUnitPrice
FROM Sales.NewSales
WHERE NewUnitPrice = 49.99

SELECT UnitPrice
FROM Sales.NewSales
WHERE UnitPrice = 49.99

For more information on the downside of filtered indexes, check out my post here:  https://blogs.msdn.com/b/timchapman/archive/2012/08/27/the-drawback-of-using-filtered-indexes.aspx

 

HTH,

Tim