Multi-fold increase in throughput for big gateway query compiles in SQL Server


There are some exciting new changes in the database engine that dramatically improves query compilation scalability in SQL Server 2014 and SQL Server 2016. These changes are yet another reason to upgrade your SQL Server versions.

When a query first starts and has to be compiled in SQL Server, it needs to acquire memory for various phases of compilation (parsing, algebrization, optimization etc.). This memory does not include the memory that is required to run (execute) the query. When multiple queries start compiling, there is no limit to number of queries that can concurrently compile. However, as the memory requirement grows and reaches a threshold, the query needs to acquire a semaphore to pass through the gateway. As you can see in the picture below, the number of concurrent compiles allowed decreases as queries pass from one gateway to another. Think of this as an inverted funnel with a very small opening. The throughput (flow rate ) is very low as you move towards the opening (big gateway). If the number of concurrent compiles has reached the limit allowed in a specific gateway, any new queries wanting to enter the gateway will wait on ‘RESOURCE_SEMAPHORE_QUERY_COMPILE’  and this will be seen in various diagnostics such as sys.dm_os_waiting_tasks, DBCC MEMORYSTATUS etc.

image

The number of queries allowed to compile concurrently in each gateway is shown on the left side of the picture above. This can also be seen in the output of DBCC MEMORYSTATUS. This output is taken from a system that has 48 logical cores.

Small Gateway (default)                  Value
—————————————- ———–
Configured Units                         192  // = 4 times number of cores
Available Units                          192
Acquires                                 0
Waiters                                  0
Threshold Factor                         380000
Threshold                                380000

(6 row(s) affected)

Medium Gateway (default)                 Value
—————————————- ———–
Configured Units                         48 // =  Number of cores
Available Units                          48
Acquires                                 0
Waiters                                  0
Threshold Factor                         12
Threshold                                -1

(6 row(s) affected)

Big Gateway (default)                    Value
—————————————- ———–
Configured Units                         1 // = only 1 query
Available Units                          1
Acquires                                 0
Waiters                                  0
Threshold Factor                         8
Threshold                                -1

This model has been in place for several versions of SQL Server and works very well and allows SQL Server to scale very well for large number of concurrent compiles for most workloads. However, with modern hardware (with several cores and large amounts of RAM) and new workloads pushing SQL Server to new boundaries, we often see several queries waiting for ‘RESOURCE_SEMAPHORE_QUERY_COMPILE’  in the Big Gateway. This is because of the restriction of only one large query compilation in the Big Gateway even when there is large amounts of memory available.

Now for the exciting announcement. We realized the need to improve the scalability of SQL Server for such workloads and have released an update which will dynamically adjust the number of concurrent compiles allowed in the Big Gateway.

What this means is SQL Server will automatically determine the number of concurrent queries that will be allowed in the Big Gateway depending on the Target Memory. Meaning, SQL Server will allow one query per 25GB of memory up to 80% of target. The upper limit is only enforced by the medium gateway limit since all queries in big gateway would have first passed through medium gateway. For example, on a system that has 24 physical/48 logical cores and 64GB RAM, the new behavior allows for two large 25GB query compilations up to 48 concurrent compilations (as allowed by the medium gateway) if queries do not need 25GB for compilation. This is a dramatic improvement in number of large compiles compared to earlier versions.

Note that this dynamic behavior is evaluated at the resource pool level. Hence each resource pool will benefit from allowing multiple compilations in the big gateway.

As you can see from the revised picture below (depicting the new improvement), several more queries can compile concurrently in the big gateway. The funnel now has a much wider opening and hence the throughput (flow rate) is dramatically increased.

image

To take advantage of this new policy change, install the following update for SQL Server 2014 and turn on trace flag 6498.

Large query compilation waits on RESOURCE_SEMAPHORE_QUERY_COMPILE in SQL Server 2014

In SQL Server 2016, (starting with CTP 2.4), you will not require the trace flag as the dynamic policy for Big Gateway is on by default. In addition, we have introduced a new DMV sys.dm_exec_query_optimizer_memory_gateways that will give both configuration and run time information on all the compile gateways for that instance.

In the sample output shown below taken from a machine that has 24 cores (48 logical with hyper threading) and 128GB RAM, you can see that up to 5 concurrent compiles will be allowed in the Big gateway:

pool_id     name              max_count   active_count waiter_count threshold_factor     threshold  is_active
———– —————– ———– ———— ———— ——————– ———- ———
1           Small Gateway     192         0            0            380000               380000     1
1           Medium Gateway    48          0            0            12                   -1         0
1           Big Gateway       5           0            0            8                    -1         0
2           Small Gateway     192         0            0            380000               380000     1
2           Medium Gateway    48          0            0            12                   -1         0
2           Big Gateway       5           0            0            8                    -1         0

 

An extended event query_optimizer_memory_gateway has also been added and if enabled will fire whenever the query enters medium or big gateway with the following information in the event data.

name                            description
——————————- ——————————————–
medium_gateway_active_acquires  count of active acquires for medium gateway
medium_gateway_count_waiters    count of waiters  for medium gateway
medium_gateway_threshold        Threshold  for medium gateway
big_gateway_active_acquires     count of active acquires for big gateway
big_gateway_count_waiters       count of waiters  for big gateway
big_gateway_threshold           Threshold  for big gateway
gateway_number                  Gateway number. used in predicates.
pool_name                       Name of the resource pool
gateway_name                    Name of the gateway trying to enter

 

To summarize the query compilation heuristics, using an example system that has 24 physical/48 logical cores, 128 GB RAM:

 

Gateway Formula Number of concurrent compiles
HT OFF HT ON
None N/A Unlimited Unlimited
Small 4 * number of cores 96 192
Medium Number of cores 24 48
Big (80% of target memory/pool)/25GB per query 1-24 1-48

Ajay Jagannathan (@ajayMSFT)

Principal Program Manager

 

Comments (5)

  1. Excellent post. Much needed for large scale systems.

    Karthik (KKB)

    Database consultant at ANZ

  2. SomewhereSomehow says:

    Very interesting, thanks!

  3. Chris Wood says:

    Could this be pushed to SQL2012 as well?

    Chris

  4. Thiago says:

    Excellent post and explanation. Thanks for sharing

  5. Neeraj Mittal says:

    Excellent explanation on complex topic.

Skip to main content