TOP 100 Percent ORDER BY Considered Harmful.

(Updated 2006-27-03 9:00am Pacfiic Time – at the bottom)

SQL is a declarative language.  That means that the language declares the form of the output but not the method used to generate those results.  There are cases, however, where the language is not quite rich enough to describe what customers want.  There are also cases where it is possible to over-infer guarantees that are not really in the language at all.

ORDER BY is an example that I’d like to discuss in this post.  In ANSI SQL, the language has the ability to specify the output order of a query (the “presentation order”).   This applies to query results and to cursor results, if a cursor is being used.  This is exposed via ORDER BY, and this is really only legal in the ANSI spec on the outer-most query block of a query.  Microsoft SQL Server allows this in more places than the spec indicates (which is something we can do and still be in-line with the specification).  Specifically, we allow the use of ORDER BY in sub-selects or in view definitions to help define the set of rows that qualify with a TOP operation (TOP is not in ANSI SQL, by the way).

The default plan implementation for this code happens to sort the rows as part of performing the TOP operation.  Often this meant that the results happened to be returned in sorted order, and this led customers to believe that there was a guarantee that rows were sorted.  This is actually not the case.  If you want rows to be returned to the user in sorted order, you need to use an ORDER BY on the outermost query block (per ANSI) to guarantee the output presentation order. 

In SQL Server 2005, you can see how the output order is *not* guaranteed through the following example:

use tempdb

create table t1 (col1 int, col2 int)
declare @i int
set @i = 0
while @i < 20
insert into t1(col1, col2) values (@i, rand()*1000)
set @i = @i + 1

create view v as (select top 100 percent * from t1 order by col1 desc)

set showplan_text on
select * from v

The output from this example is:

  |–Table Scan(OBJECT:([tempdb].[dbo].[t1]))

col1        col2       
———– ———–
0           443
1           418
2           291
3           726
4           948
5           315
6           835
7           247
8           755
9           78
10          88
11          906
12          640
13          876
14          422
15          746
16          528
17          909
18          186
19          868

You’ll notice that the original table was created as a heap (no clustered index) and that no secondary indexes are defined.  So, if there were an ordering guarantee for the query based on the order by in the view, the rows should be sorted in descending order and the query plan would need to have a sort in it to make that happen.  However, you’ll notice that the query plan contains only a Table Scan, which will return the rows back in the order they happen to be read from disk.  In this particular case, the optimizer recognizes that TOP 100 PERCENT qualifies all rows and does not need to be computed at all.  It gets removed from the query plan, and there is no other reason to do an intermediate sorting operation.  As such, the output isn’t returned in any particular order.

So, please do not assume that nested sub-selects will guarantee order.  The optimizer will consider rewrites that invalidates this assumption.  If you need rows returned in a particular order, please add that to the outermost block of your SELECT statement.


Conor Cunningham

PS: Update… One comment I received seeks a bit more detail on why the ORDER BY is “ignored” in this case.  I’ll try to expand a bit more to see if that helps. 

From the semantics of the query, the optimizer only really honors the ORDER BY as part of the evaluation of the TOP in that same scope.  The syntax is a bit unfortunate because it causes people to believe that things “will be ordered”.  However, it really only says “I want this set of rows”.  Presentation orders only apply to the output of the query, not intermediate nodes.  Since we can reorder operations, you can’t actually view this as a procedural guarantee “first I sort, then I do whatever is ‘above’ the sort”.  You don’t need an optimizer if that were the case, as you aren’t asking declarative questions anymore. 

The bottom line is that even if we do the sort as part of the TOP operation in a sub-select, it does NOT guarantee anything about the output order of the query. 


is not the same as:


The top query guarantees the output order of the query.  The bottom query does not (even if the rows happen to come back in sorted order)



Comments (18)

  1. JVILLAGA says:

    Man, that explains a lot!  

    I’ve never seen this issue explained so well.

    …Off to fix dubiously constructed views!

  2. Excellent Post …..

    I get the part where TOP 100 was ignored. I am curious as to why the ORDER BY clause in the view definition was ignored.

  3. Marco Russo says:

    Mi &#232; capitato di frequente di vedere l’uso della clausola TOP 100 PERCENT usato nella definizione di…

  4. mcdrewski says:

    Trying to create a view which orders (CREATE VIEW blah AS SELECT A,B,C FROM X ORDER BY A) gives : "The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified."

    But above you say that if TOP and ORDER BY are specified then ORDER BY is (euphemism alert) "optimised" out.  Therefore, I don’t understand how this is consistent.  The error may as well say "The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions", since if it can never be honoured it may as well be illegal.

    I guess the SQLServer way to implement data hiding (yes, we use views for this purpose where order is important) is not to use views but to implement all views as stored procedures which do the same thing.

  5. Since we’re starting a new blog for the whole query processing team, we think it’s appropriate to start…

  6. When working with SQL Server 2000, I used to have this little trick I’d pull out after exhausting all…

  7. When working with SQL Server 2000, I used to have this little trick I’d pull out after exhausting all…

  8. When working with SQL Server 2000, I used to have this little trick I’d pull out after exhausting all

  9. TOP 100 Percent ORDER BY Considered Harmful Straight to the subject about how it works in SQL 2000 &amp;

  10. Last week I wrote about a special case of the TOP operator known as ROWCOUNT TOP . This week I’ll take

  11. One of the greatest lessons I’ve learned in building software is that every technical implementation

Skip to main content