SQL Server Parallel Query Placement Decision Logic

Recently I had a conversion with (Jonathan Kehayias jonathan@sqlskills.com) about the placement of workers belonging to a parallel query.   As I asked around and reviewed the code I quickly found that the assumption is still ‘Least Loaded Node’ but this changed in SQL Server 2012 and broad awareness for both our support engineers and customers is a bit light.   This post highlights the various decision options available to SQL Server 2012, 2014 and 2016 releases.

There are many different DOP placement options SQL Server has to pick from.  The table below outlines the selection types and documents there behaviors.

SMP or Connection Bound

Place threads on one nodeConnect requires: -T2479

Either the system only has a single node or it is treated as if the connection node is the only node on the system.

DOP is allowed as long as free threads on the node are >= (dop * branches in query)

Full

Place threads on all nodes

 

Node zero will always be the starting node. Starting at node id zero SQL Server loops across schedulers and nodes until all workers are placed. 

DOP is allowed as long as all workers can be placed on the full system.

Least Loaded Node

Place threads on the least loaded node.Requires: -T2467

Loop over the online nodes determining if there are enough free threads on any single node to support the current DOP request. Making sure there are enough schedulers online within the node to support the request without stacking up requests from the query on the same scheduler(s.)

Use Next Node

Place threads within nodeRequires: -T2468

Find the next node that can service the DOP request.

Unlike full mode, the global, resource manager keeps track of the last node used. Starting from the last position, and moving to the next node, SQL Server checks for query placement opportunities. If a node can’t support the request SQL Server continues advancing nodes and searching.

Spread

Place threads on multiple nodes

This is the most common decision made by SQL Server. The decision spreads the workers across multiple nodes as required. The design is similar to full except the starting position is based on the saved, next node, global enumerator.

 

  • SQL Server does check for online nodes and skips the offline nodes.
  • SQL Server considers the number of online schedulers within each node. 
  • When assigning workers SQL Server avoids assignment of the query’s workers on the same scheduler.

The decision logic is protected and synchronized across queries using the QRY_PARALLEL_THREAD_MUTEX (see sys.dm_os_wait_status.) 

The following are a few of the outputs surrounding the DOP calculation decisions.

  • XEvent: XeSqlPkg::calculate_dop
  • SQL Server 2016 DMV: sys.dm_exec_query_parallel_workers
  • Statistics XML: runtime information

Bob Dorr - Principal SQL Server Escalation Engineer