Testing Storage Spaces Direct using Windows Server 2016 virtual machines

Windows Server 2016 introduces Storage Spaces Direct (S2D), which enables building highly available storage systems that pool the servers’ local disks into virtual shared storage. This is a significant step forward in Microsoft’s Windows Server Software-defined Storage (SDS) story, as it simplifies the deployment and management of SDS systems and also unlocks the use of new classes of disk devices, such as SATA disk devices, that were previously unavailable to clustered Storage Spaces with shared disks. The following document has more details about the technology, its functionality, and how to deploy on physical hardware.

Storage Spaces Direct Experience and Installation Guide

That experience and install guide notes that to be reliable and perform well in production, you need specific hardware (see the document for details).  However, we recognize that you may want to experiment and kick the tires a bit in a test environment, before you go and purchase hardware. Therefore, as long as you understand it’s for basic testing and getting to know the feature, we are OK with you configuring it inside of Virtual Machines.

If you want to verify specific capabilities, performance, and reliability, you will need to work with your hardware vendor to acquire approved servers and configuration requirements.

Assumptions for this Blog

  • You have a working knowledge of how to configure and manage Virtual Machines (VMs).
  • You have a basic knowledge of Windows Server Failover Clustering (cluster).

Pre-requisites

  • Windows Server 2012 R2 or Windows Server 2016 with the Hyper-V Role installed and configured to host VMs.
  • Enough capacity to host four VMs with the configuration requirements noted below.
  • Hyper-V servers can be part of a host failover cluster, or stand-alone.
  • VMs can be located on the same server, or distributed across servers (as long as the network connectivity allows traffic to be routed to all VMs with as much throughput and as little latency as possible).

Note:  These instructions and guidance focus on using our latest Windows Servers as the hypervisor.  Windows Server 2012 R2 and Windows Server 2016 Technical Preview are what I use.  There is nothing that will restrict you from trying this with other private or public clouds.  However, this blog post does not cover those scenarios, and whether or not they work will depend on the environment providing the necessary storage, network, and other resources.  We will update our documentation as we verify other private or public clouds.

Overview of Storage Spaces Direct

S2D uses disks that are exclusively connected to one node of a Windows Server 2016 failover cluster and allows Storage Spaces to create pools using those disks. Virtual Disks (Spaces) configured on the pool will have their redundant data (mirrors or parity) spread across the disks in different nodes of the cluster.  Since copies of the data are distributed, data remains accessible even when a node fails or is shut down for maintenance.

You can implement S2D in VMs, with each VM configured with two or more virtual disks connected to the VM’s SCSI Controller.  Each node of the cluster running inside a VM will be able to connect to its own disks, but S2D will allow all the disks to be used in Storage Pools that span the cluster nodes.

S2D leverages SMB3 as the transport protocol to send the redundant data for mirror or parity spaces across the nodes.

Effectively, this emulates the configuration in the following diagram:

General Suggestions:

  • Network.  Since the network between the VMs transports the redundant data for mirror and parity spaces, the bandwidth and latency of the network will be a significant factor in the performance of the system.  Keep this in mind as you experience the system in the test configurations.
  • VHDx location optimization.  If you have a Storage Space that is configured for a three-way mirror, then the writes will be going to three separate disks (implemented as VHDx files on the hosts), each on different nodes of the cluster. Distributing the VHDx files across disks on the Hyper-V hosts will provide better response to the I/Os.  For instance, if you have four disks or CSV volumes available on the Hyper-V hosts, and four VMs, then put the VHDx files for each VM on a separate disk (VM1 using CSV Volume 1, VM2 using CSV Volume 2, etc.).

Enabling Storage Spaces Direct in Virtual Machines:

Windows Server 2016 includes enhancements that automatically configure the storage pool and storage tiers in “Enable-ClusterStorageSpacesDirect”.  It uses a combination of bus type and media type to determine which devices to use for caching and how to automatically configure the storage pool and storage tiers.

Below is an example of the steps to do this:

#Create cluster 
New-Cluster -Name <ClusterName> -Node <node1>,<node2>,<node3> -NoStorage

#Enable Storage Spaces Direct
Enable-ClusterS2D 

#Create a volume
New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName <friendlyname> -FileSystem CSVFS_ReFS -StorageTierFriendlyNames Performance, Capacity -StorageTierSizes <2GB>, <10GB>

#Note: The values for the -StorageTierSizes parameter above are examples; you can specify the sizes you prefer.  The -StorageTierFriendlyNames values of Performance and Capacity are the names of the default tiers created by the Enable-ClusterS2D cmdlet.  In some cases there may be only one of them, or someone could have added more tier definitions to the system.  Use Get-StorageTier to confirm what storage tiers exist on your system.
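To confirm which tier definitions exist before running New-Volume, you can list them with Get-StorageTier. A minimal sketch; the tier names shown on your system may differ from the defaults that Enable-ClusterS2D creates:

```powershell
#List the storage tier definitions on the system, with their media type and size
Get-StorageTier | Select-Object FriendlyName, MediaType, Size
```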

Configuration 1: Single Hyper-V Server (or Client)

The simplest configuration is one machine hosting all of the VMs used for the S2D system.  In my case, a Windows Server 2016 Technical Preview 2 (TP2) system running on a desktop class machine with 16GB of RAM and a modern 4-core processor.

The VMs are configured identically. I have a virtual switch connected to the host’s network that goes out to the world for clients to connect, and I created a second virtual switch that is set for Internal network, to provide another network path for S2D to utilize between the VMs.

The configuration looks like the following diagram:

Hyper-V Host Configuration

  • Configure the virtual switches: Configure a virtual switch connected to the machine’s physical NIC, and another virtual switch configured for internal only.

Example: Two virtual switches. One configured to allow network traffic out to the world, which I labeled “Public”.  The other is configured to only allow network traffic between VMs configured on the same host, which I labeled “InternalOnly”.
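The two switches can also be created from PowerShell. This is a sketch assuming the host’s physical adapter is named “Ethernet”; adjust the adapter name to match your system:

```powershell
#Externally connected switch for client traffic ("Public")
New-VMSwitch -Name "Public" -NetAdapterName "Ethernet" -AllowManagementOS $true

#Internal-only switch for S2D traffic between VMs on this host
New-VMSwitch -Name "InternalOnly" -SwitchType Internal
```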

 

VM Configuration

  • Create four or more Virtual Machines

  • Memory: If using Dynamic Memory, the default of 1024 Startup RAM will be sufficient.  If using Fixed Memory you should configure 4GB or more.
  • Network:  Configure each VM with two network adapters.  One connected to the virtual switch with the external connection, the other network adapter connected to the virtual switch that is configured for internal only.
    • It’s always recommended to have more than one network, each connected to separate virtual switches so that if one stops flowing network traffic, the other(s) can be used and allow the cluster and Storage Spaces Direct system to remain running.
  • Virtual Disks: Each VM needs a virtual disk that is used as a boot/system disk, and two or more virtual disks to be used for Storage Spaces Direct.
    • Disks used for Storage Spaces Direct must be connected to the VMs virtual SCSI Controller.
    • Like all other systems, the boot/system disks need to have unique SIDs, meaning they need to be installed from ISO or other install methods; if using a duplicated VHDx, it needs to have been generalized (for example using Sysprep.exe) before the copy was made.
    • VHDx type and size:  You need at least eight VHDx files (four VMs with two data VHDx each).  The data disks can be either “dynamically expanding” or “fixed size”.  If you use fixed size, set the size to 8GB or more, and calculate the combined size of the VHDx files so that you don’t exceed the storage available on your system.
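As a sketch of the VM provisioning described above (the VM name, paths, and sizes are examples; adjust them for your environment):

```powershell
#Create a Gen2 VM with 4GB of fixed memory, booting from an existing generalized VHDx
New-VM -Name "S2D-VM1" -MemoryStartupBytes 4GB -Generation 2 `
    -VHDPath "C:\VMs\S2D-VM1\Boot.vhdx" -SwitchName "Public"

#Add a second network adapter, connected to the internal-only switch
Add-VMNetworkAdapter -VMName "S2D-VM1" -SwitchName "InternalOnly"

#Create and attach two dynamically expanding data disks on the SCSI controller
1..2 | ForEach-Object {
    $path = "C:\VMs\S2D-VM1\Data$_.vhdx"
    New-VHD -Path $path -Dynamic -SizeBytes 20GB | Out-Null
    Add-VMHardDiskDrive -VMName "S2D-VM1" -ControllerType SCSI -Path $path
}
```

Repeat for the remaining three VMs (S2D-VM2 through S2D-VM4).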

Example:  The following is the Settings dialog for a VM that is configured to be part of an S2D system on one of my Hyper-V hosts.  It’s booting from the Windows Server TP2 VHD that I downloaded from Microsoft’s external download site, which is connected to IDE Controller 0 (this had to be a Gen1 VM, since the TP2 file that I downloaded is a VHD and not a VHDx). I created two VHDx files to be used by S2D, and they are connected to the SCSI Controller.  Also note the VM is connected to the Public and InternalOnly virtual switches.

Note: Do not enable the virtual machine’s Processor Compatibility setting.  This setting disables certain processor capabilities that S2D requires inside the VM. This option is unchecked by default, and needs to stay that way.  You can see this setting here:
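You can check this setting from PowerShell as well; it should report False. A sketch, assuming a VM named “S2D-VM1”:

```powershell
#Processor compatibility mode must remain disabled for S2D in the guest
Get-VMProcessor -VMName "S2D-VM1" | Select-Object CompatibilityForMigrationEnabled

#If it was enabled, turn it off while the VM is stopped
Set-VMProcessor -VMName "S2D-VM1" -CompatibilityForMigrationEnabled $false
```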

Guest Cluster Configuration

Once the VMs are configured, creating and managing the S2D system inside the VMs is almost identical to the steps for supported physical hardware:

  1. Start the VMs
  2. Configure the Storage Spaces Direct system, using the “Installation and Configuration” section of the guide linked here: Storage Spaces Direct Experience and Installation Guide
    1. Since this is in VMs using only VHDx files as storage, there is no SSD or other faster media to allow tiers.  Therefore, skip the steps that enable or configure tiers.
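Inside the VMs, the sequence mirrors the physical steps from earlier in this post. A hedged sketch (the node and cluster names are examples; on some builds Enable-ClusterS2D accepts a -CacheState parameter to skip cache configuration, which is useful here since the VHDx-backed disks present no faster media — check Get-Help Enable-ClusterS2D on your build):

```powershell
#Validate and create the cluster across the guest VMs
Test-Cluster -Node VM1, VM2, VM3, VM4
New-Cluster -Name S2DGuest -Node VM1, VM2, VM3, VM4 -NoStorage

#Enable S2D without a cache, since the virtual disks expose no
#faster media for caching (verify this parameter exists on your build)
Enable-ClusterS2D -CacheState Disabled
```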

Configuration 2: Two or more Hyper-V Servers

You may not have a single machine with enough resources to host all four VMs, or you may already have a Hyper-V host cluster to deploy on, or more than one Hyper-V server that you want to spread the VMs across.  Here is a diagram showing a configuration spread across two nodes, as an example:

This configuration is very similar to the single host configuration.  The differences are:

 

Hyper-V Host Configuration

  • Virtual Switches:  Each host is recommended to have a minimum of two virtual switches for the VMs to use.  They need to be connected externally to different NICs on the systems.  One can be on a network that is routed to the world for client access, and the other can be on a network that is not externally routed.  Or, they both can be on externally routed networks.  You can choose to use a single network, but then all the client traffic and S2D traffic will share the same bandwidth, and there is no redundancy if that single network goes down, so the S2D VMs cannot stay connected. However, since this is for testing and verification of S2D, you don’t have the resiliency-to-network-loss requirements that we strongly suggest for production deployments.

Example:  On this system I have an internal 10/100 Intel NIC and a dual port Pro/1000 1Gb card. All three NICs have virtual switches. I labeled one Public and connected it to the 10/100 NIC, since my connection to the rest of the world is through a 100Mb infrastructure.  I then have the 1Gb NICs connected to two different 1Gb desktop switches, which provides my hosts two network paths between each other for S2D to use. As noted, three networks are not a requirement, but I have this available on my hosts so I use them all.

VM Configuration

  • Network:  If you choose to have a single network, then each VM will only have one network adapter in its configuration.

Example: Below is a snip of a VM configuration on my two host configuration. You will note the following:

  • Memory:  I have this configured with 4GB of RAM instead of dynamic memory.  It was a choice since I have enough memory resources on my nodes to dedicate memory.
  • Boot Disk:  The boot disk is a VHDx, so I was able to use a Gen2 VM.
  • Data Disks: I chose to configure four data disks per VM.  The minimum is two; I wanted to try four. All VHDx files are configured on the SCSI Controller (the only choice in Gen2 VMs).
  • Network Adapters:  I have three adapters, each connected to one of the three virtual switches on the host to utilize the available network bandwidth that my hosts provide.

FAQ:

How does this differ from what I can do in VMs with Shared VHDx?

Shared VHDx remains a valid and recommended solution to provide shared storage to a guest cluster (a cluster running inside of VMs).  It allows a VHDx to be accessed by multiple VMs at the same time in order to provide clustered shared storage.  If any nodes (VMs) fail, the others still have access to the VHDx, and the clustered roles using the storage in the VMs can continue to access their data.

S2D allows clustered roles access to clustered storage spaces inside of the VMs without provisioning shared VHDx on the host.  With S2D, you can provision VMs with a boot/system disk and then two or more extra VHDx files configured for each VM.  You then create a cluster inside of the VMs, configure S2D and have resilient clustered Storage Spaces to use for your clustered roles inside the VMs.
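For contrast, a shared VHDx is a single file attached to every VM in the guest cluster with persistent reservation support enabled, rather than per-VM private data disks. A sketch of the Windows Server 2012 R2-era approach (VM names and path are examples):

```powershell
#Attach the same VHDx to two guest-cluster VMs, enabling persistent
#reservations so the guest cluster can arbitrate access to it
Add-VMHardDiskDrive -VMName "GuestNode1" -ControllerType SCSI `
    -Path "C:\ClusterStorage\Volume1\Shared.vhdx" -SupportPersistentReservations
Add-VMHardDiskDrive -VMName "GuestNode2" -ControllerType SCSI `
    -Path "C:\ClusterStorage\Volume1\Shared.vhdx" -SupportPersistentReservations
```

With S2D, by contrast, each VM keeps its own private VHDx files and the redundancy is handled by Storage Spaces inside the guest cluster.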

References

Storage Spaces Direct Experience and Installation Guide