Applying Functions on WHERE Clause Columns might Cause Serious Performance Problems

Any function such as collate, convert, substring etc. applied on a column might make impossible for SQL Server to use any index on this column. This because SQL Server query optimizer considers the column after the function as a new column, because the column values are not stored in the index pages as returned by the function. For these reasons, it is not possible to use indexes on these columns.

This is of course is not only for WHERE clause columns. Join, order, group by or having clause may have same problem. However functions in the select list does not affect index selection.

Some common problematic functions are :

  • COLLATE
  • CONVERT
  • SUBSTRING
  • LEFT
  • LTRIM
  • RTRIM
  • User defined functions

Ok now lets make some demos to see this performance problem.

/*

Turgay Sahtiyan - Microsoft SQL Server Premier Field Engineer

Blog : www.turgaysahtiyan.com

Twitter : @turgaysahtiyan

*/

--Applying Functions on Where Clause Columns

--Create a work database with Latin1_General_CI_AS Collation

Create Database DBCollate COLLATE Latin1_General_CI_AS

GO

Use DBCollate

GO

--Create a work table

create table tbl1(a varchar(37), b char(400))

GO

--Populate it with 10.000 records

declare @i int=0

declare @a char(37)

while @i<10000 begin

  select @a=CAST(NEWID() as CHAR(37))

  insert tbl1

    select @a,REPLICATE('b',400)

  set @i=@i+1

end

--Create a clustered index on column a

Create Clustered Index CX_1 on tbl1(a)

use DBCollate

GO

/*****************************************

--Sample 1 : Collate

*****************************************/

--Compare the query plans of the below queries

--CTRL+M

--Query 1 - index seek

select * from tbl1

where a = 'F0166605-4683-44AB-A859-0A98FFD337B9'

--Query 2 - index scan

select * from tbl1

where a = 'F0166605-4683-44AB-A859-0A98FFD337B9' COLLATE TURKISH_CI_AS

/*****************************************

Sample 2 : LTRIM - RTRIM

*****************************************/

--This query normally does index seek

select * from tbl1

where a ='F0166605-4683-44AB-A859-0A98FFD337B9'

--LTRIM : Index Scan

select * from tbl1

where LTRIM(a) ='F0166605-4683-44AB-A859-0A98FFD337B9'

--RTRIM : Index Scan

select * from tbl1

where RTRIM(a) ='F0166605-4683-44AB-A859-0A98FFD337B9'

/*****************************************

Sample 3 : Substring - Left

*****************************************/

--Substring : Index Scan

select * from tbl1

where substring(a,1,1) ='F'

--Substring : Index Scan

select * from tbl1

where left(a,1) ='F'

--Below query is exactly same with the substring one.

--But this guy does index seek

select * from tbl1

where a like 'F%'

--drop work database

use master

go

drop database DBCollate