Windows Server 2016 introduces site-aware clusters. Nodes in stretched clusters can now be grouped based on their physical location (site). Cluster site-awareness enhances key operations during the cluster lifecycle, such as failover behavior, placement policies, heartbeating between nodes, and quorum behavior. In the remainder of this blog I will explain how to configure sites for your cluster, introduce the notion of a “preferred site”, and show how site awareness manifests itself in your cluster operations.
A node’s site membership is configured by creating a site fault domain and assigning that site as the node’s parent.
For example, in a four-node cluster with nodes Node1, Node2, Node3 and Node4, to assign the nodes to two sites, Seattle and Denver, do the following:
- Launch PowerShell as an Administrator and type:
#Create Site Fault Domains
New-ClusterFaultDomain -Name Seattle -Type Site -Description "Primary" -Location "Seattle DC"
New-ClusterFaultDomain -Name Denver -Type Site -Description "Secondary" -Location "Denver DC"

#Set Fault Domain membership
Set-ClusterFaultDomain -Name Node1 -Parent Seattle
Set-ClusterFaultDomain -Name Node2 -Parent Seattle
Set-ClusterFaultDomain -Name Node3 -Parent Denver
Set-ClusterFaultDomain -Name Node4 -Parent Denver
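To confirm the result, you can list the fault domains afterwards. This is a minimal sketch, assuming the Seattle and Denver sites from the example above; the columns shown in the output may differ between builds.

```powershell
# List every fault domain known to the cluster (sites and nodes)
Get-ClusterFaultDomain

# Show only the site-level fault domains
Get-ClusterFaultDomain -Type Site
```

Each node should appear under the site it was assigned to.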
Configuring sites enhances the operation of your cluster in the following ways:
- Groups fail over to a node within the same site before failing over to a node in a different site
- During Node Drain, VMs are moved first to a node within the same site before being moved cross-site
- The CSV load balancer will distribute within the same site
Virtual Machines (VMs) follow storage and are placed in the same site where their associated storage resides. VMs begin live migrating to the same site as their associated CSV one minute after the storage is moved.
You can now configure the thresholds for heartbeating between sites. These thresholds are controlled by the following new cluster properties:
- CrossSiteDelay: amount of time, in milliseconds, between heartbeats sent to nodes in a different site
- CrossSiteThreshold: number of missed heartbeats before the interface to nodes in a different site is considered down
To configure the above properties, launch PowerShell as an Administrator and type:
(Get-Cluster).CrossSiteDelay = <value>
(Get-Cluster).CrossSiteThreshold = <value>
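Together, the two properties determine roughly how long cross-site heartbeats can be missed before the interface is considered down. The sketch below reads the current values and multiplies them. The variable names are only illustrative, and the result is an approximation rather than an exact detection time.

```powershell
# Read the current cross-site heartbeat settings from the local cluster
$cluster   = Get-Cluster
$delayMs   = $cluster.CrossSiteDelay      # milliseconds between heartbeats
$threshold = $cluster.CrossSiteThreshold  # missed heartbeats tolerated

# Approximate time without heartbeats before the interface is considered down
$toleranceSeconds = ($delayMs * $threshold) / 1000
"Cross-site heartbeat tolerance: about $toleranceSeconds seconds"
```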
You can find more information on other properties controlling failover clustering heartbeating here.
The following rules define the applicability of the thresholds controlling heartbeating between two cluster nodes:
- If the two cluster nodes are in two different sites and two different subnets, then the Cross-Site thresholds will override the Cross-Subnet thresholds.
- If the two cluster nodes are in two different sites and the same subnet, then the Cross-Site thresholds will override the Same-Subnet thresholds.
- If the two cluster nodes are in the same site and two different subnets, then the Cross-Subnet thresholds will be effective.
- If the two cluster nodes are in the same site and the same subnet, then the Same-Subnet thresholds will be effective.
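The four rules above reduce to a simple decision: site membership is checked first, and the subnet relationship only matters when both nodes are in the same site. The function below is a hypothetical helper written for illustration. It is not a cluster cmdlet and does not query the cluster.

```powershell
# Hypothetical helper (not a cluster cmdlet): returns which family of
# heartbeat thresholds applies between two nodes, per the rules above.
function Get-HeartbeatThresholdFamily {
    param(
        [Parameter(Mandatory)] [bool] $SameSite,
        [Parameter(Mandatory)] [bool] $SameSubnet
    )

    # Different sites: Cross-Site thresholds always take precedence
    if (-not $SameSite) { return 'CrossSite' }

    # Same site: the subnet relationship decides
    if ($SameSubnet) { return 'SameSubnet' }
    return 'CrossSubnet'
}

Get-HeartbeatThresholdFamily -SameSite $false -SameSubnet $true   # CrossSite
Get-HeartbeatThresholdFamily -SameSite $true  -SameSubnet $false  # CrossSubnet
```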
Configuring Preferred Site
In addition to configuring the site a cluster node belongs to, a “Preferred Site” can be configured for the cluster. The Preferred Site is a preference for placement. The Preferred Site will be your Primary datacenter site.
Before the Preferred Site can be configured, the site being chosen as the Preferred Site needs to be assigned to a set of cluster nodes. To configure the Preferred Site for a cluster, launch PowerShell as an Administrator and type:
(Get-Cluster).PreferredSite = <Site assigned to a set of cluster nodes>
Configuring a Preferred Site for your cluster enhances operation in the following ways:
- During a cold start, VMs are placed in the Preferred Site
- Dynamic Quorum drops weights from the Disaster Recovery site (DR site, i.e. the site that is not designated as the Preferred Site) first, to ensure that the Preferred Site survives, all other things being equal. In addition, nodes are pruned from the DR site first during regroup after events such as asymmetric network connectivity failures.
- During a Quorum Split, i.e. an even split between two datacenters with no witness, the Preferred Site is automatically elected to win
  - The nodes in the DR site drop out of cluster membership
  - This allows the cluster to survive a simultaneous 50% loss of votes
  - Note that the LowerQuorumPriorityNodeID property, which previously controlled this behavior, is deprecated in Windows Server 2016
Preferred Site and Multi-master Datacenters
The Preferred Site can also be configured at the granularity of a cluster group, i.e. a different Preferred Site can be configured for each group. This enables a datacenter to be active and preferred for specific groups/VMs.
To configure the Preferred Site for a cluster group, launch PowerShell as an Administrator and type:
(Get-ClusterGroup -Name <GroupName>).PreferredSite = <Site assigned to a set of cluster nodes>
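As an illustration of a multi-master layout, the sketch below gives two groups different preferred sites. The group names FinanceVM and ReportingVM are hypothetical, and Seattle and Denver are assumed to be the sites created earlier in this post. Substitute the names from your own cluster.

```powershell
# Hypothetical group names; replace with groups that exist in your cluster.
# Seattle is active and preferred for one workload, Denver for the other.
(Get-ClusterGroup -Name 'FinanceVM').PreferredSite   = 'Seattle'
(Get-ClusterGroup -Name 'ReportingVM').PreferredSite = 'Denver'

# Read the setting back for all groups
Get-ClusterGroup | Select-Object Name, PreferredSite
```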
Groups in a cluster are placed based on the following site priority:
- Storage affinity site
- Group preferred site
- Cluster preferred site
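The priority order above can be modeled as "the first site that is set wins". The function below is a hypothetical helper for illustration only. It is not a cluster cmdlet, and the cluster's actual placement logic may take additional factors into account.

```powershell
# Hypothetical helper (not a cluster cmdlet): returns the first site that is
# set, following the priority order listed above.
function Resolve-PlacementSite {
    param(
        [string] $StorageAffinitySite,
        [string] $GroupPreferredSite,
        [string] $ClusterPreferredSite
    )

    foreach ($site in @($StorageAffinitySite, $GroupPreferredSite, $ClusterPreferredSite)) {
        if (-not [string]::IsNullOrEmpty($site)) { return $site }
    }
    return $null   # no site preference applies
}

# No storage affinity, so the group preference outranks the cluster preference
Resolve-PlacementSite -GroupPreferredSite 'Denver' -ClusterPreferredSite 'Seattle'   # Denver
```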
Fault Domains are being introduced for clustering in Windows Server 2016, providing Node, Chassis, Rack, and Site awareness. See this blog as well as the videos below to learn more about this new feature: https://technet.microsoft.com/en-us/windows-server-docs/storage/storage-spaces/fault-domains-windows-server-2016
Fault Domain Awareness in WS2016 – Part 1: Overview
Fault Domain Awareness in WS2016 – Part 2: Using PowerShell
Fault Domain Awareness in WS2016 – Part 3: Using XML