Container Storage Support with Cluster Shared Volumes (CSV), Storage Spaces Direct (S2D), SMB Global Mapping

By Amitabh Tamhane

Goals: This topic provides an overview of providing persistent storage for containers with data volumes backed by Cluster Shared Volumes (CSV), Storage Spaces Direct (S2D) and SMB Global Mapping.

Applicable OS releases: Windows Server 2016, Windows Server version 1709

With Windows Server 2016, many new infrastructure and application workload features were added that deliver significant value to our customers today. Among this long list, two very distinct features were added: Windows Containers & Storage Spaces Direct!

1.   Quick Introductions

Let’s review a few technologies that have evolved independently. Together, these technologies provide a platform for a persistent data store for applications running inside containers.

1.1         Containers

In the cloud-first world, our industry is going through a fundamental change in how applications are being developed & deployed. New applications are optimized for cloud scale, portability & deployment agility. Existing applications are also transitioning to containers to achieve deployment agility.

Containers provide a virtualized operating system environment where an application can safely & independently run without being aware of other applications running on the same host. With applications running inside containers, customers benefit from ease of deployment, the ability to scale up or down, and lower costs through better resource utilization.

More about Windows Containers.

1.2         Cluster Shared Volumes

Cluster Shared Volumes (CSV) provides multi-host read/write file system access to a shared disk. Applications can read and write the same shared data from any node of the Failover Cluster. The shared block volume can be provided by various storage technologies, such as Storage Spaces Direct (more about it below), traditional SANs, or an iSCSI Target.

More about Cluster Shared Volumes (CSV).

1.3         Storage Spaces Direct

Storage Spaces Direct (S2D) enables highly available & scalable replicated storage amongst nodes by providing an easy way to pool locally attached storage across multiple nodes.

When you create a virtual disk on top of this single storage pool, any node in the cluster can access this virtual disk. CSV (discussed above) seamlessly integrates with this virtual disk to provide read/write shared storage access for any application deployed on the cluster nodes.

S2D works seamlessly when configured on physical servers or any set of virtual machines. Simply attach data disks to your VMs and configure S2D to get shared storage for your applications. In Azure, S2D can also be configured on Azure VMs that have premium data disks attached for faster performance.

More about Storage Spaces Direct (S2D). S2D Overview Video.

1.4         Container Data Volumes

With containers, any persistent data needed by the application running inside must be stored outside of the container or its image. This persistent data can be shared read-only config state, read-only cached web pages, individual instance data (for example, a replica of a database), or shared read-write state. A single containerized application instance can access this data from any container host in the fabric, or multiple application containers can access this shared state from multiple container hosts.

With Data Volumes, a folder inside the container is mapped to another folder on the container host using local or remote storage. Using data volumes, an application running inside a container accesses its persistent data without being aware of the infrastructure storage topology. The application developer can simply assume that a well-known directory/path holds the persistent data needed by the application. This enables the same container application to run on various deployment infrastructures.

2.   Better Together: Persistent Store for Container Fabric

This data volume functionality is great, but what if a container orchestrator decides to place the application container on a different node? The persistent data needs to be available on all nodes where the container may run. Together, these technologies provide a seamless persistent store for the container fabric.

2.1         Data Volumes with CSV + S2D

Using S2D, you can leverage locally attached storage disks to form a single pool of storage across nodes. After the single pool of storage is created, simply create a new virtual disk, and it automatically gets added as a new Cluster Shared Volume (CSV). Once configured, this CSV volume gives you read/write access to the container persistent data shared across all nodes in your cluster.

With Windows Server 2016 (plus latest updates), we have now enabled support for mapping container data volumes on top of Cluster Shared Volumes (CSV) backed by S2D shared volumes. This gives the application container access to its persistent data no matter on which node the container orchestrator places the container instance.

Configuration Steps

Consider this example (assumes you have Docker & a container orchestrator of your choice already installed):

  1. Create a cluster (in this example, a 4-node cluster):

New-Cluster -Name <name> -Node <list of nodes>

(Note: The generic warning that this command reports refers to the quorum witness configuration, which you can add later.)
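As a hedged sketch of the cluster-creation step above, the commands below validate the nodes, create the cluster, and later add a witness. The node names, the cluster name ctr-cluster, and the witness share path are hypothetical placeholders, not values from this post.

```powershell
# Hypothetical node names; replace with the servers that will form your cluster.
$nodes = "node1", "node2", "node3", "node4"

# Optional: run cluster validation, including the Storage Spaces Direct tests.
Test-Cluster -Node $nodes -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"

# Create the cluster without automatically adding any eligible shared storage.
New-Cluster -Name "ctr-cluster" -Node $nodes -NoStorage

# Later: configure a quorum witness (here a file share witness on a hypothetical share).
Set-ClusterQuorum -FileShareWitness "\\witness-server\witness"
```

The cluster name is kept short on purpose, since cluster names are subject to the 15-character NetBIOS limit.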

  2. Enable Cluster S2D Functionality

Enable-ClusterStorageSpacesDirect

(Enable-ClusterS2D is an alias for the same cmdlet.)

(Note: To get optimal performance from your shared storage, SSD cache disks are recommended. They are not required for creating a shared volume from locally attached storage.)

Verify that a single storage pool is now configured:

Get-StoragePool S2D*
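For a slightly more detailed check, the sketch below shows the pool's health and the disks that were claimed into it. It assumes the pool's friendly name starts with "S2D", as in the command above.

```powershell
# Show health and capacity of the S2D pool.
Get-StoragePool -FriendlyName "S2D*" |
    Select-Object FriendlyName, HealthStatus, OperationalStatus, Size, AllocatedSize

# Count the physical disks in the pool, grouped by media type (for example SSD or HDD).
Get-StoragePool -FriendlyName "S2D*" |
    Get-PhysicalDisk |
    Group-Object MediaType |
    Select-Object Name, Count
```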

  3. Create a new virtual disk + CSV on top of S2D:

New-Volume -StoragePoolFriendlyName *S2D* -FriendlyName <name> -FileSystem CSVFS_REFS -Size 50GB

Verify that the new CSV volume was created:

Get-ClusterSharedVolume

This shared path (under C:\ClusterStorage, for example C:\ClusterStorage\Volume1) is now accessible on all nodes in your cluster.
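One way to find the local path of the new volume is to read it from the CSV object itself. This is a minimal sketch; the path it prints depends on your cluster.

```powershell
# Show each CSV with its state and current owner node.
Get-ClusterSharedVolume | Select-Object Name, State, OwnerNode

# Print the local mount path of each CSV (for example C:\ClusterStorage\Volume1).
Get-ClusterSharedVolume |
    ForEach-Object { $_.SharedVolumeInfo.FriendlyVolumeName }
```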

  4. Create a folder on this volume & write some data.
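A minimal sketch of creating the folder and writing data follows. The ContainerData path is taken from the docker example in this post; the file name hello.txt and its content are hypothetical.

```powershell
# Path used by the docker example in this post; adjust to your CSV mount path.
$dataPath = "C:\ClusterStorage\Volume1\ContainerData"

# Create the folder that will back the container data volume.
New-Item -ItemType Directory -Path $dataPath -Force | Out-Null

# Write a small test file (hypothetical file name and content).
Set-Content -Path (Join-Path $dataPath "hello.txt") -Value "Hello from the CSV volume"

# Read it back. Running this line on another cluster node should return the same text.
Get-Content -Path (Join-Path $dataPath "hello.txt")
```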

  5. Start a container with a data volume linked to the shared path above:

This assumes you have installed Docker & are able to run containers. Start a container with a data volume:

docker run -it --name demo -v C:\ClusterStorage\Volume1\ContainerData:G:\AppData nanoserver cmd.exe

Once started, the application inside this container will have access to “G:\AppData”, which will be shared across multiple nodes. Multiple containers started with this syntax can get read/write access to this shared data.

Inside the container, G:\AppData will then be mapped to the CSV volume’s “ContainerData” folder. Any data stored on “C:\ClusterStorage\Volume1\ContainerData” will then be accessible to the application running inside the container.
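For a quick non-interactive check, a variation of the docker command above can list the mapped folder and then exit. This is a sketch that reuses the image name and host path from the example; the --rm flag removes the container once it exits.

```powershell
# List the contents of the data volume from inside a short-lived container.
docker run --rm -v C:\ClusterStorage\Volume1\ContainerData:G:\AppData nanoserver cmd.exe /c dir G:\AppData
```

Running the same command on another cluster node should list the same files, since both containers see the same CSV folder.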

2.2         Data Volumes with SMB Global Mapping (Available in Windows Server version 1709 Only)

Now what if the container fabric needs to scale independently of the storage cluster? Typically, this is possible through SMB share remote access. With containers, wouldn’t it be great to support container data volumes mapped to a remote SMB share?

In Windows Server version 1709, there is new support for SMB Global Mapping, which allows a remote SMB share to be mapped to a drive letter. This mapped drive is then accessible to all users on the local host. This is required to enable container I/O on the data volume to traverse the remote mount point.

With a Scale-Out File Server created on top of the S2D cluster, the same CSV data folder can be made accessible via an SMB share. This remote SMB share can then be mapped locally on a container host using the new SMB Global Mapping PowerShell cmdlets.
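As a rough sketch of how such a share might be set up on the storage cluster, the commands below add a Scale-Out File Server role and share a folder from the CSV. The role name sofs01, the share name share1, and the CONTOSO\ContainerAdmins group are hypothetical placeholders. The sketch also assumes the File Server role is installed on every cluster node and that the folder already exists.

```powershell
# Run on a node of the storage cluster. All names below are hypothetical placeholders.

# Add a Scale-Out File Server role to the cluster.
Add-ClusterScaleOutFileServerRole -Name "sofs01"

# Share an existing folder on the CSV, scoped to the Scale-Out File Server name.
New-SmbShare -Name "share1" `
    -Path "C:\ClusterStorage\Volume1\ContainerData" `
    -ScopeName "sofs01" `
    -FullAccess "CONTOSO\ContainerAdmins"

# Confirm the share is listed under the Scale-Out File Server scope.
Get-SmbShare -ScopeName "sofs01"
```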

Caution: When using SMB global mapping for containers, all users on the container host can access the remote share. Any application running on the container host will also have access to the mapped remote share.

Configuration Steps

Consider this example (assumes you have Docker & a container orchestrator of your choice already installed):

  1. On the container host, globally map the remote SMB share:

$creds = Get-Credential

New-SmbGlobalMapping -RemotePath \\contosofileserver\share1 -Credential $creds -LocalPath G:

This command uses the credentials to authenticate with the remote SMB server and then maps the remote share path to the G: drive letter (this can be any other available drive letter). Containers created on this container host can now have their data volumes mapped to a path on the G: drive.
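To confirm the mapping, or to remove it later, something like the following sketch can be used. It assumes the G: drive letter and the remote path from the command above.

```powershell
# List the SMB global mappings defined on this container host.
Get-SmbGlobalMapping

# Confirm the mapped drive is reachable by listing its top-level contents.
Get-ChildItem -Path "G:\"

# When the mapping is no longer needed, remove it (stop containers that use it first).
Remove-SmbGlobalMapping -RemotePath "\\contosofileserver\share1" -Force
```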

  2. Create containers with data volumes mapped to the local path where the SMB share is globally mapped.

Inside the container, G:\AppData1 will then be mapped to the remote share’s “ContainerData” folder. Any data stored on the globally mapped remote share will then be accessible to the application running inside the container. Multiple containers started with this syntax can get read/write access to this shared data.
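The container-creation step above could look like the following sketch, modeled on the docker command from section 2.1. It assumes that G: is the globally mapped drive, that a ContainerData folder already exists at the root of the share, and that the nanoserver image is available locally.

```powershell
# Map the ContainerData folder on the globally mapped G: drive into the container as G:\AppData1.
docker run -it --name demo -v G:\ContainerData:G:\AppData1 nanoserver cmd.exe
```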

This SMB global mapping support is an SMB client-side feature that can work on top of any compatible SMB server, including:

  • Scale-Out File Server on top of S2D or a traditional SAN
  • Azure Files (SMB share)
  • Traditional File Server
  • Third-party implementations of the SMB protocol (for example, NAS appliances)

Caution: SMB global mapping does not support DFS, DFSN, DFSR shares in Windows Server version 1709.

2.3 Data Volumes with CSV + Traditional SANs (iSCSI, FCoE block devices)

In Windows Server 2016, container data volumes are now supported on top of Cluster Shared Volumes (CSV). CSV already works with most traditional block storage devices (iSCSI, FCoE), so mapping container data volumes to CSV enables you to reuse your existing storage topology for your container persistent storage needs.