Paging SQL Server result sets

Paging through result sets is an approach for reducing the network and client resources used to display large result sets. Essentially, the approach is to load only a page (e.g. 100 rows) of data at a time. It is likely that a user will only want to see a page of data. If they want to see more data then the next page is retrieved etc. Because the entire result set is probably not retrieved and does not need to be stored on the client the network and client resource demands will be lower. Ideally, latency for showing the first rows of data should also be lower since only a subset of data is needed.

The most commonly advocated SQL Server paging approach is based on the row_number ranking function.

with NumberedResult as
(
select row_number() over (order by M.[Id]) as [Row], M.[Id], M.[Name]
from [System_Runtime].[Methods] as M
)
select NR.[Id], NR.[Name]
from NumberedResult as NR
where NR.[Row] between @startRow and @stopRow;

This works by associating a row number with each row and fetching blocks of row numbers. For this to work correctly there must be a fixed row order defined by the over expression and the result set must be unchanging. Otherwise each call to the query may return results in a different order or even different results destroying the illusion of paging through a point in time result set.

A generally better approach is to use top to retrieve a page of data.

select top(@pageSize) M.[Id], M.[Name]
from [System_Runtime].[Methods] as M
where M.[Id] > @lastId
order by M.[Id]

As before, for this to work correctly there must be a fixed row order defined by the order by clause and the result set must be unchanging. The reason this will usually work better is that if the order by clause is covered by an index then the data can be found without sorting the entire data set. In theory, row_number could work this way also but in practice does not. The difference can be seen by looking at query plans—typically a row_number based query will form the entire result set while a top based approach will seek to the first index entry and then return the next page of rows in the index.

If there is no fixed row order or the result set can change then data can be read into a temporary table and processed by either of the two previous approaches.

ADO.Net has good support for paging data https://msdn.microsoft.com/en-us/library/tx1c9c2f.aspx. The Fill method has a specific overload to support paging in of data.

Below I compare the approaches for two interesting performance questions for a table of .Net Framework with 207 966 rows:

· What is the cost of retrieving the first page?

· What is the cost of retrieving all the pages?

Below are the results in seconds to perform the tests 100 times. As always it is wise to not make assumptions in performance or concurrency critical code and to test various approaches yourself.

Approach

First page (sec)

All pages (sec)

Row number

2.78

7 310.47

Top

0.26

51.17

Temporary table

111.13

139.94

 

When applicable, the top approach has clear performance advantages. Also it will typically use less server resources than the other approaches.

Here are the test scripts (run with the query option to discard the result set after execution but to collect client statistics).

-- row_number approach: First page
dbcc freesystemcache ('All')

declare @repetitions int = 100;

declare @i int = 0;
while @i < @repetitions
begin
declare @pageSize bigint = 100;
declare @lastrow bigint = (select count(*) from [System_Runtime].[Methods]);

declare @startRow bigint = 1;

declare @stopRow bigint = @startRow + @pageSize - 1;

with NumberedResult as
(
select row_number() over (order by M.[Id]) as [Row], M.[Id], M.[Name]
from [System_Runtime].[Methods] as M
)
select NR.[Id], NR.[Name]
from NumberedResult as NR
where NR.[Row] between @startRow and @stopRow;

set @i += 1;
end;

-- row_number approach: All pages
dbcc freesystemcache ('All')

declare @repetitions int = 100;

declare @i int = 0;
while @i < @repetitions
begin
declare @pageSize bigint = 100;
declare @lastrow bigint = (select count(*) from [System_Runtime].[Methods]);

declare @startRow bigint = 1;

while @startRow < @lastRow
begin
declare @stopRow bigint = @startRow + @pageSize - 1;

with NumberedResult as
(
select row_number() over (order by M.[Id]) as [Row], M.[Id], M.[Name]
from [System_Runtime].[Methods] as M
)
select NR.[Id], NR.[Name]
from NumberedResult as NR
where NR.[Row] between @startRow and @stopRow;

set @startRow += @pageSize;
end;

set @i += 1;
end;

-- top approach: First page
dbcc freesystemcache ('All')

declare @repetitions int = 100;

declare @i int = 0;
while @i < @repetitions
begin
declare @pageSize bigint = 100;
declare @maxId bigint = (select max(M.Id) from [System_Runtime].[Methods] as M);

declare @lastId bigint = 0;

select top(@pageSize) M.[Id], M.[Name]
from [System_Runtime].[Methods] as M
where M.[Id] > @lastId
order by M.[Id]

set @i += 1;
end;

-- top approach: All pages
dbcc freesystemcache ('All')

declare @repetitions int = 100;

declare @i int = 0;
while @i < @repetitions
begin
declare @pageSize bigint = 100;
declare @maxId bigint = (select max(M.Id) from [System_Runtime].[Methods] as M);

declare @lastId bigint = 0;

while @lastId < @maxId
begin
select top(@pageSize) M.[Id], M.[Name]
from [System_Runtime].[Methods] as M
where M.[Id] > @lastId
order by M.[Id]

set @lastId += @pageSize;
end;

set @i += 1;
end;

-- temporary table approach: First page
dbcc freesystemcache ('All')

declare @repetitions int = 100;

declare @i int = 0;
while @i < @repetitions
begin
declare @pageSize bigint = 100;

create table #results
(
[Row] bigint not null primary key,
[Id] bigint not null,
[Name] nvarchar(max) not null
);

insert into #results ([Row], [Id], [Name])
select row_number() over (order by M.[Id]) as [Row], M.[Id], M.[Name]
from [System_Runtime].[Methods] as M

declare @startRow bigint = 1;

declare @stopRow bigint = @startRow + @pageSize - 1;

select R.[Id], R.[Name]
from #results as R
where R.[Row] between @startRow and @stopRow;

drop table #results;

set @i += 1;
end;

-- temporary table approach: All pages
dbcc freesystemcache ('All')

declare @repetitions int = 100;

declare @i int = 0;
while @i < @repetitions
begin
declare @pageSize bigint = 100;

create table #results
(
[Row] bigint not null primary key,
[Id] bigint not null,
[Name] nvarchar(max) not null
);

insert into #results ([Row], [Id], [Name])
select row_number() over (order by M.[Id]) as [Row], M.[Id], M.[Name]
from [System_Runtime].[Methods] as M

declare @lastrow bigint = (select count(*) from #results);
declare @startRow bigint = 1;

while @startRow < @lastRow
begin
declare @stopRow bigint = @startRow + @pageSize - 1;

select R.[Id], R.[Name]
from #results as R
where R.[Row] between @startRow and @stopRow;

set @startRow += @pageSize;
end;

drop table #results;

set @i += 1;
end;