Performance of a Query on Computed Column can degrade in Sql 2005 with the increase in complexity of function in computed column

 

With Sql server 2000 and 2005 computed columns can be use to improve the performance of queries since without the computed columns, indexes cannot be used as explained in the following blog by one of our Escalation Engineer Jack Li.

 

https://blogs.msdn.com/psssql/archive/2009/03/09/how-to-use-computed-columns-to-improve-query-performance.aspx

 

However in some cases, you may observe that a query on computed column running on SQL 2005 instance is performing much slower as compared to the same query running on SQL 2000 instance. In this blog I will be explaining one of the possible cause of the slowness of query on computed column in SQL 2005 instance with the help of following example.

 

The following script is used to create a table with a computed column which has an index created on it.

 

 

use tempdb

GO

set ansi_nulls

, ansi_padding

, ansi_warnings

, concat_null_yields_null

, quoted_identifier on

set numeric_roundabort off

GO

if object_id ( 'dbo.test') is not null drop table dbo.test

GO

create table dbo.test

(

id int identity ( 1, 1) not null

, col char (7000) not null

, computed_col as (checksum(isnull(col, '')))

, constraint test_pk primary key clustered( id)

)

GO

 

 

declare @i int

set @i=1000

while (@i<20000)

begin

insert into test(col) values(@i);

set @i=@i+100

end

 

create index ix_table

on dbo.test (computed_col)

 

 

use tempdb

go

set ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, quoted_identifier, ARITHABORT ON

set NUMERIC_ROUNDABORT OFF

go

set statistics io on

set statistics profile on

set statistics time on

go

select id,col,computed_col

from dbo.test

where computed_col = 580804

set statistics io off

set statistics profile off

set statistics time off

 

 

When we run the above query on the Sql 2000 instance and SQL 2005 instance we observe that query performs slow on SQL 2005 instance as compared to SQL 2000 instance. This is because of the difference in the execution plan in SQL 2000 and SQL 2005

 

SQL 2000

========

 

Table 'test'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0.

 

StmtText

-------------------------------------------------------------------------------------------------------------------------

SELECT [id]=[id],[col]=[col],[computed_col]=[computed_col] FROM [dbo].[test] WHERE [computed_col]=@1

|--Compute Scalar(DEFINE:([test].[computed_col]=checksum(isnull([test].[col], Convert('')))))

|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([tempdb].[dbo].[test]))

|--Index Seek(OBJECT:([tempdb].[dbo].[test].[ix_table]), SEEK:([test].[computed_col]=580804) ORDERED FORWARD)

 

 

 

SQL 2005

========

 

Table 'test'. Scan count 1, logical reads 192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

StmtText

---------------------------------------------------------------------------------------------------------------------------------

SELECT [id],[col],[computed_col] FROM [dbo].[test] WHERE [computed_col]=@1

|--Compute Scalar(DEFINE:([tempdb].[dbo].[test].[computed_col]=checksum([tempdb].[dbo].[test].[col])))

|--Clustered Index Scan(OBJECT:([tempdb].[dbo].[test].[test_pk]), WHERE:(checksum([tempdb].[dbo].[test].[col])=(580804)))

 

 

 

As seen in the above execution plans, in sql 2000 instance the query performs an Index seek on the index on computed column and thereby causes only 3 logical reads whereas SQL 2005 performs a Clustered Index Scan neglecting the index on computed column.

 

The difference in the execution plan is attributed to the difference in the behavior of the Algebrizer and Optimizer in SQL 2000 and SQL 2005 instance when dealing with computed column. As seen in the above execution plan, in sql 2005 the Algebrizer expands the computed column to  checksum(isnull([tempdb].[dbo].[test].[col]) which is further simplified to checksum([tempdb].[dbo].[test].[col])=(580804) in the optimization phase.

 

In the above example, in SQL 2005 the Algebrizer substitutes the computed_col with checksum(isnull(col, '')) and it is further simplified to checksum([tempdb].[dbo].[test].[col])=(580804) during the optimization phase. When the optimizer generates the execution plan it checks whether there is any index on checksum([tempdb].[dbo].[test].[col]) however the index exists on the checksum(isnull([tempdb].[dbo].[test].[col]))and hence the optimizer is unable to detect the index on computed columns and chooses to perform the clustered index scan.

 

However when we remove the isnull function from the computed column we observe that sql 2005 instance chooses the same plan as that of sql 2000 instance which is desirable.

 

use tempdb

GO

set ansi_nulls

, ansi_padding

, ansi_warnings

, concat_null_yields_null

, quoted_identifier on

set numeric_roundabort off

GO

if object_id ( 'dbo.test') is not null drop table dbo.test

GO

create table dbo.test

(

id int identity ( 1, 1) not null

, col char (7000) not null

, computed_col as (checksum(col))

, constraint test_pk primary key clustered( id)

)

GO

 

 

SQL 2005

=========

 

Table 'test1'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

StmtText

----------------------------------------------------------------------------------------------------------------------------------------------------------------------

SELECT [id],[col],[computed_col] FROM [dbo].[test1] WHERE [computed_col]=@1

|--Compute Scalar(DEFINE:([tempdb].[dbo].[test1].[computed_col]=[tempdb].[dbo].[test1].[computed_col]))

|--Nested Loops(Inner Join, OUTER REFERENCES:([tempdb].[dbo].[test1].[id]))

|--Index Seek(OBJECT:([tempdb].[dbo].[test1].[ix_table1]), SEEK:([tempdb].[dbo].[test1].[computed_col]=(580804)) ORDERED FORWARD)

|--Clustered Index Seek(OBJECT:([tempdb].[dbo].[test1].[test_pk1]), SEEK:([tempdb].[dbo].[test1].[id]=[tempdb].[dbo].[test1].[id]) LOOKUP ORDERED FORWARD)

 

 

Alternatively if we remove the "not null" for the column

So the workaround for such cases is to simplify the functions used for computed columns

 

 

Parikshit Savjani

SE, Microsoft Sql Server

 

&

 

Levi Justus
Technical Lead, Microsoft Sql Server

 

Reviewed By

Amit Banerjee
Technical Lead, Microsoft Sql Server