High CPU after upgrading to SQL Server 2005 from 2000 due to batch sort

We have had some customers who reported to us that their queries consumed more CPU in SQL Server 2005 after they upgraded from 2000.   When you examine the plans.  Both are very similar.   But SQL Server 2005 has something extra.  It has "OPTIMIZED" keyword.  Here is an example plan segment:

 |--Nested Loops(Left Outer Join, OUTER REFERENCES:([B].[id_Table02Binding], [B].[si_HashBucket], [Expr1113]) OPTIMIZED WITH UNORDERED PREFETCH)                   
      |--Filter(WHERE:([testDB01].[dbo].[Table02].[dt_DeleteDatetime] as [R].[dt_DeleteDatetime] IS NULL))                                                      
      |    |--Nested Loops(Inner Join, OUTER REFERENCES:([Uniq1039], [R].[vc_Table02Name], [Expr1112]) WITH UNORDERED PREFETCH)                                    
      |         |--Nested Loops(Inner Join, OUTER REFERENCES:([C].[bi_Table02Definition], [Expr1111]) WITH UNORDERED PREFETCH)                                     
      |         |    |--Nested Loops(Inner Join, OUTER REFERENCES:([B].[bi_Table02Definition], [Expr1110]) WITH UNORDERED PREFETCH)                                
      |         |    |    |--Clustered Index Seek(OBJECT:([testDB01].[dbo].[Table02Binding].[UIX_Table02BindingScope] AS [B]), SEEK:([B].[ti_Inversion]=[@ti_Inversi
      |         |    |    |--Clustered Index Seek(OBJECT:([testDB01].[dbo].[Table03].[IX_Table03] AS [C]), SEEK:([C].[bi_Table02Definition]=[AuthZ_FSS0
      |         |    |--Index Seek(OBJECT:([testDB01].[dbo].[Table02].[PK_Table02] AS [R]), SEEK:([R].[bi_Table02Definition]=[testDB01].[dbo].[Table03].[bi_Rol
      |         |--Clustered Index Seek(OBJECT:([testDB01].[dbo].[Table02].[IX_Table02Name] AS [R]), SEEK:([R].[vc_Table02Name]=[testDB01].[dbo].[Table02].[vc_Table02Name
      |--Clustered Index Seek(OBJECT:([testDB01].[dbo].[Table01].[PK_Table01] AS [E]), SEEK:([E].[id_Table02Binding]=[testDB01].[dbo].[Table02Binding

This is a SQL 2005 optimization feature called batch sort.  What it does is try to order rows retrieved from the outer side of the nested loop to minimize random seek on the inner side of the join.  It can reduce IO cost of random seek.  If you use profiler trace and turn on SQLTransaction event, you will see many small transactions with transaction name as 'sort_init before the query is finished with this kind of plan.   More details can be found in blog:  https://blogs.msdn.com/craigfr/archive/2009/03/18/optimized-nested-loops-joins.aspx.

This feature generally improves performance if large number of rows are touched.  But in situations where only smaller number of rows are touched, overhead of this optimization can overweigh benefit.   If you have very queries that touch small number of rows, you can CPU consumptions going up in SQL 2005 following upgrade from 2000.  SQL Server 2008 behaves the same.

So what's the solution?

The solution is that you turn off this feature by using trace flag 2340.  This will reduce CPU consumption. Having said that, you need to do testing.  As mentioned earlier, if large number of rows are touched by the plan, the feature can be very beneficial.    

 

Jack Li | Senior Escalation Engineer | Microsoft SQL Server Support