Conor vs. Anti-Semi-Join Reordering

I was asked to comment on a post about the order of WHERE NOT EXISTS (<subquery>) in a query and its impact on a query plan.

Specifically this thread:

https://social.msdn.microsoft.com/Forums/en/transactsql/thread/278c272b-5ebb-4fda-8985-49927bbd3799

 

Broadly, SQL Server does attempt to make it so that the order in which you write a query does not matter in terms of plan choice.  For many of the common cases, we have done this work.  I was reading through the source code last night and I don’t believe we do relative reordering of WHERE NOT EXISTS subqueries (which in the academic literature is called an anti-semi-join) in the same scope. 

In the specific customer case, the customer actually did WHERE NOT EXISTS (SELECT TOP 1 … <rest of subquery>).

I will specifically recommend _against_ doing this as a general practice.  The Optimizer can do all sorts of magic to transform semi-joins to joins and joins to semi-joins, and there are reorderings that are blocked if you do the TOP 1 (since it is not a relational operator).  The SQL Optimizer is smart enough to know not to retrieve more than one row for any WHERE EXISTS or WHERE NOT EXISTS subquery, so you should not consider it necessary or wise to add TOP 1 for this case.  (Perhaps there are cases when it is appropriate, but I don’t know of any general cases that could not be solved by using other hints).

As I have posted before, there are some cases where SQL does not do every possible rule transformation because some of them are just not common enough. 

Also, remember that we do add rules each release, so don’t assume that we never do these rewrites.  We like adding transformation rules in the engine :).

Happy Querying!

 

Conor