Migrating from SQL 2000 to SQL 2005 - Look for common catch

Recently I was involved in a migration exercice from SQL Server 2000 to SQL Server 2005 and encountered an additional catch related to impact of datatype mismatch on performance while writing TSQL statements (generally ignored by developers. Probably in past I would have also done the same mistake.. unknowingly!!!)

Generally you will find development team using char and nchar interchangebly while writing TSQL statement without bothering about the impact on performance (behind the scenes for SQL Server it boils down to query plan).

We use to notice the performance impact in SQL Server 2000 also for datatype mismatch between right hand operand and left hand operand in filter clauses.

Watch out !! SQL 2005 is here with all new query optimizer.

As SQL Server 2005 has changed the default weights associated with datatype mismatches and implicit datatype conversion one need to be little bit more careful as you might see different response times altogether in SQL Server 2005.

For example refer to below sample.

--This sample script shows the impact of datatype mismatch in TSQL statements

-- If you will run this sample in SQL 2000 and SQL 2005 you would be able to check differences in query plans and with large table you will see different performance !

--Increase the size of sample table to notice performance impacts or run the SLQ statements in some loop.

 

USE [AdventureWorks]

GO

--drop table TestIndexUsage

set nocount on

--create sample table with NONCLUSTERED index on char field (which would be used in join)

CREATE TABLE TestIndexUsage (

            [City] [char](30) COLLATE Latin1_General_CI_AS NOT NULL,

            [StateProvinceID] [int] NOT NULL,

            [ProvinceID] [int] NOT NULL,

            [ModifiedDate] [datetime] NOT NULL DEFAULT (getdate()),

) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [Test] ON [dbo].[TestIndexUsage]

(

            [City] ASC,

            [ProvinceID] ASC

)WITH (STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF) ON [PRIMARY]

GO

--add some sample data

delete TestIndexUsage

declare @i as integer

declare @j as integer

declare @str as char(30)

select @j=1

WHILE @j < 50

begin

            select @i=10

            WHILE @i < 150

            BEGIN

                        select @str = 'TEST' + cast(@i as char(4))

                        Insert into TestIndexUsage values (@str,1111,1111,getdate())

                        select @i = @i + 1

            end

            select @j = @j + 1

end

DBCC freeproccache

DBCC DROPCLEANBUFFERS

--NOTE :

-- This SQL will use simple plan with Index Seek / Scan

select [City], [ProvinceID] from TestIndexUsage

where [City] = 'TEST15'

--NOTE :

-- This SQL will end up using complex plan in SQL 2005 with NESTED LOOP and might shock you with performance

 

-- If you will run the below query in SQL 2000; it still will use simple plan with Index Seek / Scan only

select [City], [ProvinceID] from TestIndexUsage

where [City] = N'TEST15'

You got it right!! - We need to ensure same datatype for parameter passed and column in base table.

Guys - It's time to take care of such datatypes mismatch as they might shock with performance (not to mention some time this might impact your application functioanlity if Unicode is of importance).

Ensure proper datatypes in your TSQL statements and enjoy the rocking performance gains from SQL Server 2005.

Go and try what happens if the base table in above sample is changed to nchar or nvarchar and you the below query.

select [City], [ProvinceID] from TestIndexUsage

where [City] = 'TEST15'

            [City] [char](30) COLLATE Latin1_General_CI_AS NOT NULL