Forcing NUMA Node affinity for Analysis Services Tabular databases

In the whitepaper, Hardware Sizing a Tabular Solution, we made a reference to Analysis Services Tabular and Non Uniform Memory Access or NUMA. Specifically, Analysis Services Tabular mode is not NUMA Aware. SSAS Tabular mode runs in memory and will load any tabular models into memory on startup. Having the model loaded into memory is what gives SSAS tabular models such impressive query performance. With that said, there are cases where hardware can limit the speed of SSAS Tabular. More specifically, if your Tabular model is running on a system with NUMA hardware you could see a performance impact if Analysis Services needs to access memory located on remote NUMA node. A introduction and discussion on NUMA is outside the scope of this blog post however if you would like to know more, the following article provides some background on the topic  https://technet.microsoft.com/en-us/library/ms178144(v=sql.105).aspx 

During a recent lab engagement we tested query workloads for some large tabular databases on NUMA and non-NUMA hardware, then on NUMA affinitized instances of Analysis Services.  I found that the majority of the queries initially ran two or three times slower on NUMA hardware with a cold cache.

Since publishing the Hardware Sizing guide for Analysis Services many customers have approached me with questions specific to Analysis Services tabular models running on NUMA hardware.  In this blog post I will explain how to determine if you are running on NUMA hardware and provide some options for forcing Analysis Services Tabular instances to run on a single NUMA node.

To understand the NUMA architecture for a system there are a couple of easy ways to find out.  The first and easiest way to tell is by using task manager

  1. Open Task Manager and click on the Performance Tab
  2. Next click on CPU
  3. Right Click the Graph area and select Change graph to \ NUMA nodes
  4. On a system with no NUMA hardware you will notice NUMA nodes is greyed out.   In the screenshot below, the system on the left is from a server with 4 NUMA nodes and the system on the right is from a machine with no NUMA hardware

A second option is to use a tool from Sysinternals called CoreInfo CoreInfo will tell you if your hardware is NUMA and it also provides an estimate of what kind of performance impact accessing remote NUMA memory will have.

  1. Download and extract CoreInfo from the following link CoreInfo
  2. Open an administrative Command Prompt and navigate to the directory where you extracted CoreInfo
  3. Run the following command CoreInfo.exe
  4. In the output you will see a section called “Logical Processors to NUMA Node Map”.  This will tell you if you are using NUMA and if so how many NUMA nodes.  If the System is NUMA then you will also see the approximate cross-NUMA node access cost which gives you an idea of what type of performance hit you may see when referencing memory on remote NUMA nodes

 

If you are using a system that is using NUMA and you would like to affinitize Analysis Services to a NUMA node I will explain two options that I have tested and work well.

  • Windows System Resource Manager
  • NUMA Node Affinitization Using a Hyper-V Virtual Machine

One way to force Analysis Services to use only 1 NUMA node is to use Windows System Resource Manager or WSRM. To use WSRM you need to first add the feature from the “Add Roles and Features Wizard”  Once you have installed WSRM you will then need to create a Resource Allocation Policy and a ProcessMatching criteria to force SSAS to one NUMA node.  WSRM is listed as deprecated in MSDN however I haven’t found any compelling reason not to use it and it appears to be the only way to force processor affinitization with SSAS outside of virtualization https://technet.microsoft.com/en-us/library/cc753939.aspx

  1. Before using WSRM you need to identify which processors are assigned to which NUMA node. Use CoreInfo to find out.

  2. Add Windows System Resource Manager using the Add Roles and Features Wizard as seen below

  3. Once installed open WSRM

  4. Create a new resource allocation Policy for the processors assigned to the NUMA node you SSAS to run under.  In the screenshot below I specified the first instance of SSAS to run one the first NUMA node, NUMA Node 0: processors 0 – 15 (16 processors per NUMA node on the hardware I was using below…configuration is 0 based so make sure you don’t choose 1 – 16).  I specified the second instance to run under processors 16 – 31 (NUMA NODE 1)

     

  5. Once you have your SSAS instances set up and running in WSRM you will need to modify your Analysis Services settings in the server properties or using the MSMDSRV.ini file.  Many SSAS settings are dynamic based on the number of CPUs, Memory on the machine.   For example on a server with 1.5 Terabytes of memory and 4 NUMA nodes with 16 cores a piece we would limit each instance of SSAS to the amount of memory available to that NUMA node (1536GB/4 = 384).  If SSAS is limited to 384GB and 16 cores our default values for a setting such as LowMemoryLimit will be skewed; the default of 65% would be calculated based on the overall memory of the machine which SSAS would see as 1536 GB then set the Low Memory Limit to roughly 998GB which would be way more than SSAS actually has available. To address this you need to modify your .ini file settings to account for the amount of Memory and number of CPUs you are limited.

A second way to force NUMA affinitization to SSAS instances is by using Hyper-V.  The benefit of this method as mentioned earlier in the write-up is that you do not need to modify the SSAS Properties in the .ini file to account for the fact that the SSAS will be limited to a specific group of processors and a fraction of the memory.

  1. In Hyper-V manager create a virtual machine per instance of SSAS you want to use

  2. Go to the settings of the Virtual Machine
     

  3. Access the NUMA section under Processors

  4. Set NUMA configuration to map each VM to a NUMA node.  Calculate how much memory to set as maximum depending on the total memory available per NUMA Node.  Set Maximum amount of memory and Maximum NUMA nodes accordingly.

 

In addition to setting the NUMA affinity per VM you can also set the following at the Host level.  Unchecking the option “Allow virtual machines to span physical NUMA nodes” in theory

Hopefully this post gives you a basic understanding of NUMA and why you should be aware of it when it comes to SSAS tabular mode.  If you are experiencing slow cold cache response times from your SSAS Tabular database queries, take a look at your hardware and see if it could be NUMA related.  If so, try one of the methods I mention to force affinitization to a NUMA node or test on non-NUMA hardware.