Ranking Functions: ROW_NUMBER

SQL Server 2005 introduced four new functions, ROW_NUMBER, RANK, DENSE_RANK, and NTILE that are collectively referred to as ranking functions.  These functions differ from ordinary scalar functions in that the result that they produce for a given row depends on the other rows in the result set.  They also differ from aggregate functions in that they produce exactly one output row for each input row.  Unlike aggregates they do not collapse a set of rows into a single row.  In this post, I'll take a look at ROW_NUMBER - the simplest of all ranking functions.

I'll use the following simple data set for the examples in this post:

CREATE TABLE T (PK INT IDENTITY, A INT, B INT, C INT)
CREATE UNIQUE CLUSTERED INDEX TPK ON T(PK)

INSERT T VALUES (0, 1, 8)
INSERT T VALUES (0, 3, 6)
INSERT T VALUES (0, 5, 4)
INSERT T VALUES (0, 7, 2)
INSERT T VALUES (0, 9, 0)
INSERT T VALUES (1, 0, 9)
INSERT T VALUES (1, 2, 7)
INSERT T VALUES (1, 4, 5)
INSERT T VALUES (1, 6, 3)
INSERT T VALUES (1, 8, 1)

Let's begin with the following simple example:

SELECT *, ROW_NUMBER() OVER (ORDER BY B) AS RowNumber FROM T

This query orders the rows in the table by column B and assigns sequential row numbers (much like an identity column) to these rows.  The result looks like so:

 PK    A     B     C     RowNumber
----- ----- ----- ----- ----------
6     1     0     9     1
1     0     1     8     2
7     1     2     7     3
2     0     3     6     4
8     1     4     5     5
3     0     5     4     6
9     1     6     3     7
4     0     7     2     8
10    1     8     1     9
5     0     9     0     10

The plan for this query is fairly simple:

  |--Sequence Project(DEFINE:([Expr1003]=row_number))
       |--Compute Scalar(DEFINE:([Expr1005]=(1)))
            |--Segment [GROUP BY:()]
                 |--Sort(ORDER BY:([T].[B] ASC))
                      |--Clustered Index Scan(OBJECT:([T].[TPK]))

This plan scans the table, sorts it on column B, and then the sequence project operator assigns sequential numbers to each row.

The segment operator normally breaks adjacent rows into segments or groups of related rows.  It is unnecessary in this example as all rows are in a single segment.  Note that the text plan normally does not show the GROUP BY columns for a segment operator; this information is only available from SHOWPLAN_ALL, SHOWPLAN_XML, or the graphical plan.  I've annotated the text plan to show explicitly that the segment operator has no GROUP BY columns.  We will see an example where the segment operator is necessary in just a moment.

The compute scalar operator is also unnecessary and, in fact, has been removed in the SQL Server 2008 CTPs.

ROW_NUMBER (and the other ranking functions) also support dividing rows into groups or partitions and computing the function separately on each group:

SELECT *, ROW_NUMBER() OVER (PARTITION BY A ORDER BY B) AS RowNumber FROM T

Since column A has two values (0 and 1), this query breaks the rows in the table into two groups and assigns row numbers separately to each group.  The result is:

 PK    A     B     C     RowNumber
----- ----- ----- ----- ----------
1     0     1     8     1
2     0     3     6     2
3     0     5     4     3
4     0     7     2     4
5     0     9     0     5
6     1     0     9     1
7     1     2     7     2
8     1     4     5     3
9     1     6     3     4
10    1     8     1     5

The plan for this query is nearly identical to the previous plan:

  |--Sequence Project(DEFINE:([Expr1003]=row_number))
       |--Compute Scalar(DEFINE:([Expr1005]=(1)))
            |--Segment [GROUP BY:([T].[A])]
                 |--Sort(ORDER BY:( [T].[A] ASC, [T].[B] ASC))
                      |--Clustered Index Scan(OBJECT:([T].[TPK]))

Notice that the new plan sorts on columns A and B and then the segment operator groups the rows by column A.  The sequence project assigns sequential numbers to each row just as in the first example but resets the row count at the beginning of each segment.  You may notice some similarities between how the grouping works in this plan and how the grouping works in a plan with a stream aggregate operator.

It is possible to combine multiple ROW_NUMBER functions (or multiple of any of the other ranking functions) with different ORDER BY clauses in a single query:

SELECT *,
    ROW_NUMBER() OVER (ORDER BY B) AS RowNumber_B,
    ROW_NUMBER() OVER (ORDER BY C) AS RowNumber_C
FROM T

 PK    A     B     C     RowNumber_B  RowNumber_C
----- ----- ----- ----- ------------ ------------
5     0     9     0     10           1
10    1     8     1     9            2
4     0     7     2     8            3
9     1     6     3     7            4
3     0     5     4     6            5
8     1     4     5     5            6
2     0     3     6     4            7
7     1     2     7     3            8
1     0     1     8     2            9
6     1     0     9     1            10

The plan for this query is just like the plan for the first query only it is repeated once for each ranking function with a different sort order for each ORDER BY clause:

  |--Sequence Project(DEFINE:([Expr1004]=row_number))
       |--Compute Scalar(DEFINE:([Expr1008]=(1)))
            |--Segment [GROUP BY:()]
                 |--Sort(ORDER BY:( [T].[C] ASC))
                      |--Sequence Project(DEFINE:([Expr1003]=row_number))
                           |--Compute Scalar(DEFINE:([Expr1006]=(1)))
                                |--Segment [GROUP BY:()]
                                     |--Sort(ORDER BY:( [T].[B] ASC))
                                          |--Clustered Index Scan(OBJECT:([T].[TPK]))

Note that the ORDER BY clause associated with each ranking function determines the behavior of that ranking function only.  These ORDER BY clauses do not determine the order that rows will be returned from the query.  To guarantee absolutely a certain output order from the query, you must add an explicit query-level ORDER BY clause:

SELECT *, ROW_NUMBER() OVER (ORDER BY B) AS RowNumber FROM T ORDER BY B

In this case, the extra ORDER BY clause does not affect the plan as the optimizer recognizes that the data is already sorted by column B following the computation of the ROW_NUMBER function.

Unfortunately, if we have multiple ROW_NUMBER functions with different ORDER BY clauses and a query-level ORDER BY clause, the optimizer does not consider whether computing the ROW_NUMBER functions in a different order might help.  For example, the following query results in an extra sort:

SELECT *,
    ROW_NUMBER() OVER (ORDER BY B) AS RowNumber_B,
    ROW_NUMBER() OVER (ORDER BY C) AS RowNumber_C
FROM T
ORDER BY B

  |--Sort(ORDER BY:([T].[B] ASC))
       |--Sequence Project(DEFINE:([Expr1004]=row_number))
            |--Compute Scalar(DEFINE:([Expr1008]=(1)))
                 |--Segment
                      |--Sort(ORDER BY:([T].[C] ASC))
                           |--Sequence Project(DEFINE:([Expr1003]=row_number))
                                |--Compute Scalar(DEFINE:([Expr1006]=(1)))
                                     |--Segment
                                          |--Sort(ORDER BY:([T].[B] ASC))
                                               |--Clustered Index Scan(OBJECT:([T].[TPK]))

On the other hand, if we simply reverse the order of the two ROW_NUMBER functions in the SELECT clause, the extra sort vanishes:

SELECT *,
 ROW_NUMBER() OVER (ORDER BY C) AS RowNumber_C,
ROW_NUMBER() OVER (ORDER BY B) AS RowNumber_B

FROM T
ORDER BY B

  |--Sequence Project(DEFINE:([Expr1004]=row_number))
       |--Compute Scalar(DEFINE:([Expr1008]=(1)))
            |--Segment
                 |--Sort(ORDER BY:([T].[B] ASC))
                      |--Sequence Project(DEFINE:([Expr1003]=row_number))
                           |--Compute Scalar(DEFINE:([Expr1006]=(1)))
                                |--Segment
                                     |--Sort(ORDER BY:([T].[C] ASC))
                                          |--Clustered Index Scan(OBJECT:([T].[TPK]))

Like a stream aggregate, SQL Server also can eliminate the sort for a sequence project operator if there is an available index:

CREATE INDEX TB ON T(B)

SELECT PK, ROW_NUMBER() OVER (ORDER BY B) AS RowNumber_B FROM T

  |--Sequence Project(DEFINE:([Expr1003]=row_number))
       |--Compute Scalar(DEFINE:([Expr1005]=(1)))
            |--Segment
                 |--Index Scan(OBJECT:([T].[TB]), ORDERED FORWARD)

However, because the query plan for a query with multiple ROW_NUMBER functions with different ORDER BY clauses "stacks" several sequence project operators on top of a single scan, SQL Server can use only a single index in such a plan.  Thus, the plan for such a query must contain at least one sort.  Moreover, even if the plan does use an index in place of one of the sorts, if that index does not cover all columns required in the query, the optimizer must choose between using the index along with a bookmark lookup or using the clustered index and performing an extra sort.  I'll leave experimenting with these issues as an exercise.

Before I conclude this post, I'd also like to point out that SQL Server does not support using the ROW_NUMBER function without an ORDER BY clause.  Generally, using a ranking function without an ORDER BY clause does not make much sense as the results are non-deterministic.  However, there are some scenarios where you may want to omit the ORDER BY clause.  For example, you may simply want to use the ROW_NUMBER function to assign identity values.  This post by Itzik Ben-Gan and this article by Alex Kozak both discuss this issue and how to work around it.  Finally, this post by Itzik Ben-Gan shows how you can calculate row numbers on earlier versions of SQL Server that did not support the ROW_NUMBER function.

In my next post, I'll look at the other ranking functions - namely RANK, DENSE_RANK, and NTILE.