Soften the RBAR impact with Native Compiled UDFs in SQL Server 2016

Article
02/17/2016

Reviewers: Joe Sack, Denzil Ribeiro, Jos de Bruijn

Many of us are very familiar with the negative performance implications of using scalar UDFs on columns in queries: my colleagues have posted about issues here and here. Using UDFs in this manner is an anti-pattern most of us frown upon, because of the row-by-agonizing-row (RBAR) processing that this implies. In addition, scalar UDF usage also limits the optimizer to use serial plans. Overall, evil personified!

Native Compiled UDFs introduced

Though the problem with scalar UDFs is well-known, we still come across workloads where this problem is a serious detriment to the performance of the query. In some cases, it may be easy to refactor the UDF as an inline Table Valued Function, but in other cases, it may simply not be possible to refactor the UDF.

SQL Server 2016 offers natively compiled UDFs, which can be of interest where refactoring the UDF to a TVF is not possible, or where the number of referring T-SQL objects are simply too many. Natively compiled UDFs will NOT eliminate the RBAR agony, but they can make each iteration incrementally faster, thereby reducing the overall query execution time. The big question is how much?

Real-life results

We recently worked with an actual customer workload in the lab. In this workload, we had a query which invoked a scalar UDF in the output list. That means that the UDF was actually executing once per row – in this case a total of 75 million rows! The UDF has a simple CASE expression inside it. However, we wanted to improve query performance so we decided to try rewriting the UDF.

We found the following results with the trivial UDF being refactored as a TVF versus the same UDF being natively compiled (all timings are in milliseconds):

	Interpreted (classic)	Native compiled (new in SQL Server 2016)	TVF
CPU Time	12734	8906	3735
Elapsed time	13986	8906	3742

As can be expected, the TVF approach is the fastest, but it is encouraging that the native compiled UDF reduced execution time by solid 36% even though the logic in the UDF was very trivial!

Test Scenario

In order to take this further, I decided to do some testing with a neutral workload. For this neutral test I used the DB1BTicket table which is 230+ million rows and in my test database had a Clustered Columnstore Index created on it.

In this test, I used two different types of UDFs: a trivial one and another one which has more conditional statements in it. The scripts for these are at the end of the post. The results with these different iterations are summarized in the table below:

	Interpreted (classic)	Native compiled (new in SQL Server 2016)	TVF
Simple UDF	1672.239 seconds	796.427 seconds	10.473 seconds
Multi-condition UDF	3763.584 seconds	848.106 seconds	Not attempted

Side Note: Parallelism

It is well known that UDFs (even those which do not access data and just do computation) cause a serial plan to be used. Here is the plan with interpreted UDF – as you can see it is serial:

Here is the plan with native compiled UDF – it is still serial:

Lastly, here is the execution plan with TVF – as you can see it is a parallel plan:

Here’s the million-dollar question to you: how badly do you want SQL Server to support parallel plans when UDFs are used anywhere in the plan? Do send us your feedback as comments.

Conclusion

While refactoring the scalar UDF as a TVF ultimately provided the best results, in cases where it is not possible to do this, using native compiled UDFs provides a very useful reduction in query execution time. Therefore, native compiled UDFs can be used as a mitigation or even considered a solution to those thorny cases where RBAR is absolutely necessary.

Appendix: UDF Definitions

Here is the simple UDF, in the classic T-SQL interpreted form:

CREATE FUNCTION dbo.FarePerMile ( @Fare MONEY, @Miles INT )

RETURNS MONEY

WITH SCHEMABINDING

BEGIN

DECLARE @retVal MONEY = ( @Fare / @Miles );

RETURN @retVal;

END;

Here is the simple UDF written as a native compiled version:

CREATE FUNCTION dbo.FarePerMile_native (@Fare money, @Miles int)

RETURNS MONEY

WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER

BEGIN ATOMIC

WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')

DECLARE @retVal money = ( @Fare / @Miles)

RETURN @retVal

END

Here is the simple UDF refactored as a TVF:

CREATE FUNCTION dbo.FarePerMile_TVF ( @Fare MONEY, @Miles INT )

RETURNS TABLE

RETURN

SELECT ( @Fare / @Miles ) AS RetVal;

Now, the multiple condition UDF, in the classic T-SQL interpreted form:

CREATE FUNCTION dbo.FictionalPricingLogic

(

@RPCarrier VARCHAR(2) ,

@Origin VARCHAR(3) ,

@Fare MONEY ,

@Miles INT

)

RETURNS MONEY

WITH SCHEMABINDING

BEGIN

DECLARE @retVal MONEY;

DECLARE @discount MONEY = 0; -- discount percentage

IF ( @RPCarrier = 'DL' )

SELECT @discount += 0.05;

IF ( @RPCarrier = 'AA' )

SELECT @discount += 0.05;

IF ( @Origin = 'DFW' )

SELECT @discount += 0.01;

IF ( @Origin = 'SEA' )

SELECT @discount += 0.009;

IF ( @Miles > 500 )

SELECT @discount += 0.01;

SELECT @retVal = @Fare * ( 1.0 - @discount );

RETURN @retVal;

END;

Here is the multiple condition UDF written as a native compiled version:

CREATE FUNCTION dbo.FictionalPricingLogic_Native

(

@RPCarrier VARCHAR(2) ,

@Origin VARCHAR(3) ,

@Fare MONEY ,

@Miles INT

)

RETURNS MONEY
WITH native_compilation
,schemabinding
,EXECUTE AS OWNER
AS
BEGIN
atomic
WITH (
TRANSACTION ISOLATION LEVEL = snapshot
,LANGUAGE = N'us_english'
)

DECLARE @retVal MONEY;

DECLARE @discount MONEY = 0; -- discount percentage

IF ( @RPCarrier = 'DL' )

SELECT @discount += 0.05;

IF ( @RPCarrier = 'AA' )

SELECT @discount += 0.05;

IF ( @Origin = 'DFW' )

SELECT @discount += 0.01;

IF ( @Origin = 'SEA' )

SELECT @discount += 0.009;

IF ( @Miles > 500 )

SELECT @discount += 0.01;

SELECT @retVal = @Fare * ( 1.0 - @discount );

RETURN @retVal;

END;

In this test, assume that it was not worth refactoring the multiple condition UDF as a TVF.

Appendix: Test Queries

Here are the sample queries used to test the performance of each of the above UDF variations:

SET STATISTICS TIME ON;

SET STATISTICS IO ON;

SELECT AVG(dbo.FarePerMile(ItinFare, MilesFlown))

FROM DB1BTicket;

SELECT AVG(dbo.FarePerMile_Native(ItinFare, MilesFlown))

FROM DB1BTicket;

The below query is the version with the TVF. Note the usage of CROSS APPLY:

SELECT AVG(myTVF.RetVal)

FROM DB1BTicket

CROSS APPLY dbo.FarePerMile_TVF(ItinFare, MilesFlown) AS myTVF;

SELECT AVG(dbo.FictionalPricingLogic(RPCarrier, Origin, ItinFare, MilesFlown))

FROM DB1BTicket;

SELECT AVG(dbo.FictionalPricingLogic_Native(RPCarrier, Origin, ItinFare,

MilesFlown))

FROM DB1BTicket;

Soften the RBAR impact with Native Compiled UDFs in SQL Server 2016

Native Compiled UDFs introduced

Real-life results

Test Scenario

Side Note: Parallelism

Conclusion

Appendix: UDF Definitions

Appendix: Test Queries

Additional resources