Query performance, scalar UDFs, and predicate pushdown

Recently I had to troubleshoot a query that performed much slower than expected. The solution to that seemed sufficiently interesting to warrant a write-up, and the root cause of the problem reinforced a well-known best practice of avoiding scalar user-defined functions in set-based queries.

The original query looked like this:

SELECT 'Literal1', u.Col2 FROM dbo.Table1 AS u INNER JOIN dbo.Table2 AS l ON u.Col2 = l.Col2 WHERE l.Col3 = 'Literal2' AND NOT EXISTS ( SELECT 1 FROM dbo.Table2 AS l2 WHERE u.Col2 = l2.Col2 AND Col3 = 'Literal1' ) AND dbo.UDF1(u.Col1) IS NOT NULL;

Table1 had about 3 million rows, and Table2 had about 7 million rows. In its original form, the query was running for more than one hour, while the requirement was that it would complete within minutes.

The scalar user-defined function (UDF) in the WHERE clause immediately stood out as a warning sign (highlighted). Using scalar UDFs in set-based queries is generally not recommended for two reasons:

The function is invoked for each row in the set, and each invocation has a fixed and not insignificant cost, which is in addition to the cost of the statements in the function body.
The optimizer cannot accurately estimate the cost of the function (which could vary greatly), therefore it may come up with less than optimal query plans.

For the query above, removing the UDF from the WHERE clause brought down the execution time to 12 seconds. Looking at the query plan, it was clear why it took so long to execute the original statement – a full clustered index scan of Table1 (all 3 million rows of it) was followed by a filter operator, where the UDF was applied to the Col1 column of Table1. In other words, rather than first reduce the set via joins with Table2 and then apply the UDF to that much smaller set, the optimizer chose to first filter millions of rows with the UDF. This is a fairly typical example of a sub-optimal plan caused by an inaccurate estimate (reason #2 above), which was in turn caused by the usage of an “opaque” scalar UDF.

The obvious workaround for this problem would be to remove the UDF-based filter from the WHERE clause, insert the resulting intermediate result set into a temporary table, and then select from that temporary table, filtering the much smaller intermediate result set with the UDF. However, I wanted to avoid a procedural approach, which would be more complex and less robust than a single set-based statement.

The first attempt to come up with a single statement that avoids the original problem looked like this:

SELECT 'Literal1', up.Col2 FROM ( SELECT u.Col1, u.Col2 FROM dbo.Table1 AS u INNER JOIN dbo.Table2 AS l ON u.Col2 = l.Col2 WHERE l.Col3 = 'Literal2' AND NOT EXISTS ( SELECT 1 FROM dbo.Table2 AS l2 WHERE u.Col2 = l2.Col2 AND l2.Col3 = 'Literal1' ) ) AS up WHERE dbo.UDF1(up.Col1) IS NOT NULL;

Here we have a subquery in the FROM clause, and apply the filter with the UDF “on top” of the subquery, in the WHERE clause of the outer query. The intent is to evaluate the subquery first, before applying the filter. Unfortunately, this does not work – this query is logically equivalent to the original query, and SQL Server is at liberty to evaluate the predicates in the order that it deems optimal, which in this case still means filtering millions of rows in Table1 with the UDF-based filter, before joining with Table2.

Here is the query that did work (changes to previous attempt are highlighted):

The reason this succeeds in forcing the optimizer to evaluate the UDF filter after reducing the row set via joins is because we pass an expression, rather than a base column, to the function. The passed expression in this case is logically equivalent to the base column, but because it contains a reference to a column from a seemingly pointless outer query, the optimizer cannot push down the UDF-based filter to the inner subquery. In the plan generated for this query, the UDF executes only against a few hundred rows produced by the inner subquery, and the statement completes in 30 seconds – well within required time. Note that it wouldn’t be sufficient to use a literal empty string as the second part of the expression – to avoid predicate pushdown, it has to be a column from an outer query.

As a conclusion, this could be a useful query tuning technique for the specific case when the optimizer pushes down a filtering predicate (not necessarily UDF-based), and the resulting plan is less optimal than the one where the filter is evaluated later in the plan.

Leave a Reply Cancel reply