What you should know if you are using Logical IOs to measure query plan quality

One of the presenters at recent SQL Server Open World conference in Denmark (https://www.miracleas.dk/index.asp?page=168&page2=323) suggested tuning SQL queries using logical IOs to measure quality of query plan. The IO count is returned for each object accessed in your query after you issue “set statistics IO on”. Here is an example of returned message:

Table 't1'. Scan count 10, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

I have seen this approach many times in the past and I agree that in general, using less logical IOs is better and faster than using more. Therefore tuning query towards lower number of logical IOs in most cases produces query that is faster and uses less resources. But I would like to point out one potential pitfall when using this counter. In a nutshell reading the same data may end up showing very different number of logical IOs depending on the “data access method”. This is caused by counting each new “seek” into any page (even repeatedly the same one) as new logical IO (in fact several logical IOs because of the index navigation!) while reading all rows on the same page using “scan” counts as a single logical IO.

If you run the following statements in SSMS after you create table t1 as shown in the appendix

set statistics io on
select * from t1 with (index=2) where a in (1,2,3,4,5,6,7,8,9,10) -- Query 1
select * from t1 with (index=0) where a in (1,2,3,4,5,6,7,8,9,10) -- Query 2

you will find following 2 messages in the “Messages” tab of the SSMS result window:

Table 't1'. Scan count 10, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The first query is performing 10 index seeks showing “20 logical reads” using following query plan:

|--Index Seek(OBJECT:([test].[dbo].[t1].[i1]), SEEK:([test].[dbo].[t1].[a]=(1) OR [test].[dbo].[t1].[a]=(2) OR [test].[dbo].[t1].[a]=(3) OR [test].[dbo].[t1].[a]=(4) OR [test].[dbo].[t1].[a]=(5) OR [test].[dbo].[t1].[a]=(6) OR [test].[dbo].[t1].[a]=(7) OR [test].[dbo].[t1].[a]=(8) OR [test].[dbo].[t1].[a]=(9) OR [test].[dbo].[t1].[a]=(10)) ORDERED FORWARD)

The reason why SQL Server counts 2 logical IOs for each seek is that it has to interrogate the “root” page of the index first to locate the correct “data” page. If an index has more levels, there will be as many logical IOs used to locate the data page as many non-leaf nodes must be accessed. Additionally, more than one “leaf” page has to be visited in some cases.

The second query returns different plan:

|--Table Scan(OBJECT:([test].[dbo].[t1]), WHERE:([test].[dbo].[t1].[a]=(1) OR [test].[dbo].[t1].[a]=(2) OR [test].[dbo].[t1].[a]=(3) OR [test].[dbo].[t1].[a]=(4) OR [test].[dbo].[t1].[a]=(5) OR [test].[dbo].[t1].[a]=(6) OR [test].[dbo].[t1].[a]=(7) OR [test].[dbo].[t1].[a]=(8) OR [test].[dbo].[t1].[a]=(9) OR [test].[dbo].[t1].[a]=(10)))

The difference is in the “index seek” versus “table scan” operators. Accessing all rows by table scan counts each visited page as a single logical IO (one page visited in our case) while accessing rows in an index using seeks counts each “seek” and as pointed out above twice or even more times depending how many pages I need to visit to complete the seek.

Interestingly, if we remove index hint in the query, SQL Server optimizer will choose the query plan using more logical IO! To make situation even more complicated, we should be careful what counts as a single “seek” and what constitutes “multiple seeks”. If you slightly change our query to

select * from t1 where a between 1 and 10 -- Query 3

you will get following plan

  |--Index Seek(OBJECT:([test].[dbo].[t1].[i1]), SEEK:([test].[dbo].[t1].[a] >= CONVERT_IMPLICIT(int,[@1],0) AND [test].[dbo].[t1].[a] <= CONVERT_IMPLICIT(int,[@2],0)) ORDERED FORWARD)

And the IO count for Query 3 is

Table 't1'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Query 3 is using single seek (therefore shows “scan count 1”) when returning all rows qualifying by “between 1 and 10” predicate. Similar Query 1 above used 10 seeks (“scan count 10”) because it’s seek predicate contains OR and each new “branch” of the OR predicate is using separate “seek”.

Conclusion: When you are using “logical IO” count to evaluate your query efficiency or compare various query modifications be aware that “logical reads” may have very different meaning depending on the data access methods chosen by the query optimizer.

Notes:

  1. I have used CTP6 build of SQL Server 2008 to run the examples but the conclusion is valid also for SQL Server 2000 and 2005.
  2. You can use SQL Profiler to capture Logical Reads in "Batch completed" event and the same numbers you can get using sys.dm_exec_query_stats. However, these are cummulative numbers for all accessed object as opposed to those returned by SET STATISTICS IO ON as described in the blog

 

Appendix:

create table t1 (a int, b char(10))
declare @i int
set @i=0
set nocount on

while (@i<10)
begin set @i=@i+1; insert into t1 values (@i,'x') end

create index i1 on t1 (a,b)

-- Following query shows how many pages are used by the table and its index

select OBJECT_NAME(id), indid, dpages from sys.sysindexes where id=OBJECT_ID('t1')

 

Lubor