Configuring SQL Server 2005 for Soft NUMA


Sometimes I am surprised how issues come up the moment you mention them J. Recently, I have discussed Soft NUMA configurations in the article http://blogs.msdn.com/slavao/articles/441058.aspx. This week one of our customers had an interesting problem. The customer wanted to partition single SQL server instance based on the load. Customer’s application is heterogynous. It consists of TPCH type queries and data loading applications. The customer has a system, which is NUMA, with 2 nodes and 4 CPU per Node. The customer wanted to give the loading application two CPUs and the rest of CPUs to the queries. Is it possible to achieve it?


 


As you might guess the answer is SQL2005’s Soft NUMA support. We advised them to configure SQL Server and clients to treat system as three node NUMA system. (Surprised? Yes it is possible with SQL 2005 Soft NUMA support). The configuration looks like following: zero node has 4 CPUs, first node has 2 CPUs and last node has 2 CPUs. Keep in mind, when you configure SQL Server for Soft NUMA, soft nodes should fully be contained in the real nodes, i.e. a soft node can not span several real NUMA nodes. Customer’s TPCH queries were configured to utilize zero and first nodes and the load application was configured to utilize last node. Once configured and started this configuration worked as expected – load was fully partitioned across CPUs. It is important to notice that the customer was very delighted with the experience. Below is the example of node & network configuration we provided customer with:


 


A. Node configuration – provides node and their affinity information.


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node0]


“CPUMask”=dword:0000000F


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node1]


“CPUMask”=dword:00000030


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node2]


“CPUMask”=dword:000000C0


 


– Keep in mind affinity mask for a soft node can’t span multiple hardware nodes. For example 000000CF is incorrect mask.


 


B. Network Configuration – provides port & node assignment.


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IPAll]


“TcpPort”=”1433[0x3],2000[0x4]”


“TcpDynamicPorts”=””


“DisplayName”=”Any IP Address”


 


– This configuration makes SQL Server to listen on two ports. One port serves two nodes. It means that port node relations ship could be one two many.


– Keep in mind that in the square brackets you specify node affinity not the CPU affinity, i.e. 3 in this case means that port 1433 will processes requests on node 0 and 1


– When port is assigned to multiple nodes it will assign connections to the nodes in a round robin fashion


 


 Once registry is configured server have to be restarted for changes to take affect. In addition clients should be configured to connect to specific ports. In the case of our customer TPCH clients were directed to port 1433 and the loading application was assigned to port 2000.  


 


SQL Server 2005 supports more complex configuration for Soft NUMA. Below is information along with an example provided to me by our performance team; describing how to configure SQL Server 2005 for Soft NUMA in general. Please notice that you can actually bind a specific NIC to the node.


 


This is default key where you need to change the value of TcpPort.


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IPAll]


“TcpPort”=””


“TcpDynamicPorts”=””


“DisplayName”=”Any IP Address”


 


Thus if you need 3 ports (2000,2001,1433) on a four node machine such that 2000 controls connections such that they go nodes 0 and 1 ,2001  causes them to go to nodes 2 and 3 and 1433 to go to all nodes You use:


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IPAll]


“TcpPort”=”2000[0x3],2001[0xc],1433”


 


Thus the values in the square brackets are node masks and NOT processor masks. The above key makes all NICs on the system listen to all the above ports. Now instead,  if we want to control things on a NIC by NIC basis,  we need to flip this switch


From


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp]


“ListenOnAllIPs”=dword:00000001


To


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp]


“ListenOnAllIPs”=dword:00000000


 


And then if we had Two NICs on the system besides the loopback adaptor. Note the Last part of the key “IP3” depend on the NIC number


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP2]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2000[0x3],1433”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.01”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP3]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2001[0xc],1433”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.02”


 


This means we shall listen to 2000 only 10.192.168.01 and pass connections received from 2000 only to nodes 0 and 1.  And similarly for 2001.


 


Below configuration which makes an 8 proc 2 node system behave as an 8 node system. It assumes that there are 8 NICs in the system


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp]


“Enabled”=dword:00000001


“ListenOnAllIPs”=dword:00000000


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IPAll]


“TcpPort”=””


“TcpDynamicPorts”=”2001[0x1],2002[0x2],2003[0x4],2004[0x8],2005[0x10],2006[0x20],2007[0x40],2008[0x80]”


“DisplayName”=”Any IP Address”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP3]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2001[0x1]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.01”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP4]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2002[0x2]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.02”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP5]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2003[0x4]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.03”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP6]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2004[0x8]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.04”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP7]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2005[0x10]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.05”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP8]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2006[0x20]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.06”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP9]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2007[0x40]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.07”


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLServer\SuperSocketNetLib\Tcp\IP10]


“Enabled”=dword:00000001


“Active”=dword:00000001


“TcpPort”=”2008[0x80]”


“TcpDynamicPorts”=””


“DisplayName”=”Specific IP Address”


“IpAddress”=”10.192.168.08”


 


 


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration]


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node0]


“CPUMask”=dword:00000001


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node1]


“CPUMask”=dword:00000002


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node2]


“CPUMask”=dword:00000004


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node3]


“CPUMask”=dword:00000008


 


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node4]


“CPUMask”=dword:00000010


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node5]


“CPUMask”=dword:00000020


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node6]


“CPUMask”=dword:00000040


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\NodeConfiguration\Node7]


“CPUMask”=dword:00000080

Comments (6)

  1. Slava,

    Thanks for very interesting articles!

    Questions:

    1. This method works only for NUMA?

    2. Can I create program nodes on SMP systems?

    Sincerely yours,

    Alexander Gladchenko

    SQL Server MVP

  2. slavao says:

    1. This method works not only for NUMA

    2. Yes you can configure SMP box to have multiple Soft NUMA, or as you called them program nodes.

  3. La stratégie :

    SQL Server 2005 Update

    from Paul Flessner (6/04/2006)

    Les versions et combien ca…

  4. ybhalerao says:

    Slava,

    Thanks for this article… never knew we could do that with CPUs till now… I have a couple of customers that can benefit from this, however I think they might be curious and me too to know if we can make this system dynamic? In the above example assuming that the application is intelligent enough to redirect the queries to the appropriate NIC, when the bulk load is not running and that there is a heavy load of TPCH queries, in that case the CPU resources would be poorly utilized? I would like the SoftNUMA to increase and decrease it’s CPUs within the node, is that possible without restarting?

    Secondly most of my customer servers are on a cluster in that case how to make the application simple?  Application is assumed to be intelligent enough to redirect queries to appropriate nodes, now in case of failover onto a mirror/passive server (where the nodes are not configured or configured differently) the application will have to revert to the non-SoftNUMA config or new-SoftNUMA config?

    Thanks

    Yogesh Bhalerao

    Scalability Experts, Inc.

  5. slavao says:

    Currently SoftNUMA configuration is static. However you can overcome the issue by pushing actual decisions which node to use to middle tier/client side. When bulk loads not running you could round robin connection across all the nodes, when bulk load is about to run you can remove TPCH load from the bulk load node.

    For the second question the answer is yes if you decide to manage connections on the client/application side you will have to know about nodes availability hence change the configuration after failover.