NUMA Connection Affinity and Parallel Queries

SQL Server Books Online states that using NUMA port affinity helps co-locate the resources used by a query (CPU, memory, and so on). This is true for serial plan execution but not entirely the case for parallel query execution. You might assume that with NUMA port affinity all the parallel work is completed on the node(s) the connection is affinitized to, but this is not the case. NUMA port affinity is not taken into account when the parallel placement decision is made. That decision is based on which node can supply the most resources to support the parallel request. If you want to keep a node-centric model, set MAXDOP to 1 to force serial plans.
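MAXDOP can be constrained per query or server-wide. A minimal T-SQL sketch of both approaches (the table and column names in the query are illustrative, not from the original post):

```sql
-- Per-query: force a serial plan for one statement with a query hint
SELECT c.CustomerID, SUM(o.TotalDue) AS TotalDue
FROM Sales.Orders AS o                          -- illustrative table names
JOIN Sales.Customers AS c
  ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID
OPTION (MAXDOP 1);                              -- serial plan: no parallel worker placement decision

-- Server-wide: cap the degree of parallelism for all queries (use with care)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;
```

The query hint is usually preferable, since the server-wide setting disables parallelism for every query on the instance.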


The parallel query execution model takes several factors into account when determining the location of the parallel workers. These factors include, but are not limited to, the number of available schedulers, the number of available workers, memory, and the DOP target for the query. These values are examined for all online nodes.

- If the DOP for the query can be satisfied by a single node, the ideal node (the one with the most workers, schedulers, and memory available at the time the query is started) is selected. The controlling worker can be on a different node than the ideal one. For example, you could have NUMA port affinity established for NODE 1 while the parallel query executes on NODE 4, because 4 was the ideal node. This is not a problem, because the majority of the work is done by the parallel workers, which are located together on a node.

- If the DOP for the query can NOT be satisfied by a single node, the workload is spread among all available nodes.
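The per-node scheduler and worker availability that feeds this decision can be observed through the `sys.dm_os_nodes` DMV. A sketch (assuming SQL Server 2008 or later, where these columns are exposed):

```sql
-- Per-NUMA-node availability as seen by SQLOS
SELECT node_id,
       node_state_desc,          -- e.g. ONLINE / OFFLINE
       online_scheduler_count,   -- schedulers available on the node
       active_worker_count,      -- workers currently active on the node
       avg_load_balance          -- average number of tasks per scheduler
FROM sys.dm_os_nodes
WHERE node_state_desc = N'ONLINE';
```

Comparing these counts across nodes gives a rough picture of which node would look "ideal" to the engine at that moment.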

The state of the nodes is internally updated for the query execution engine approximately every 2 seconds. Under rare circumstances you may find that a single node was selected as the destination for multiple parallel queries. When possible, stagger large queries by more than 2 seconds so that alternate node choices become visible to the execution engine.
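To see which node(s) the workers of a running parallel query actually landed on, the task and scheduler DMVs can be joined. A sketch (the session id 55 is a placeholder for the session under test):

```sql
-- Node placement of the tasks belonging to one session
SELECT t.session_id,
       s.parent_node_id AS node_id,
       COUNT(*)         AS task_count
FROM sys.dm_os_tasks AS t
JOIN sys.dm_os_schedulers AS s
  ON s.scheduler_id = t.scheduler_id
WHERE t.session_id = 55            -- illustrative session id
GROUP BY t.session_id, s.parent_node_id;
```

For a parallel query that fit on a single node, the rows should show the parallel workers concentrated on one `node_id`, possibly with the controlling worker on another.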


- Bob Dorr

Comments (1)

  1. varund says:

    Great explanation, Bob. I've been looking for this explanation for many days. In an internal demo on MAXDOP/NUMA affinity, I could not explain why one worker thread of my demo "bad" query was going to a different NUMA node when all the parallel threads were on another node.

    This leaves me with a small doubt: with the controlling worker on a different node, does that bring in performance deficiencies due to costly remote-node data lookups (ref. KB#2023536)? Your thoughts?

    Varun Dhawan | Database Architect