Faster Live Migration in Windows Server 2012 R2

Hyper-V in Windows Server 2012 R2 has a number of interesting new features.  One that I was directly involved in was the work to make live migration faster.

Now, most people are pretty impressed with live migration today.  After all – it lets you move virtual machines between physical servers with zero downtime.  What could be better?  But as people are investing more and more in virtualization, and taking advantage of all the functionality that virtualization can provide, we are discovering new issues that need to be tackled.

In the case of live migration – it is already pretty fast and simple to live migrate a single virtual machine.  But that is not how live migration is used by most people.  Live migration is most frequently used to enable patch deployment to your virtualization fabric without needing to stop any virtual machines.  It is also used to dynamically redistribute virtual machine load on your fabric.  In both of these cases you are not live migrating a single virtual machine, but large numbers of virtual machines (and possibly all of your virtual machines).

It does not take long before the time to perform a live migration becomes noticeable in these scenarios.

For example: performing a zero-downtime patch deployment in an 8-node Hyper-V cluster with 128 GB of memory per node requires transferring around a terabyte of data.  With Windows Server 2012 this operation would take 12 to 24 hours to complete (depending on your infrastructure).  And those numbers just get larger as your virtualization deployment grows.
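To make that arithmetic concrete, here is a back-of-the-envelope sketch in Python.  It assumes that each node's full 128 GB of memory crosses the network once during the patch cycle, and the function names are purely illustrative.  The "effective rate" is just the data volume divided by the total elapsed time of the operation, so treat it as a rough average rather than a measurement of network speed.

```python
def total_memory_gb(nodes: int, memory_per_node_gb: int) -> int:
    """Total memory that has to be moved if every node is drained once."""
    return nodes * memory_per_node_gb


def effective_rate_mb_per_s(total_gb: float, hours: float) -> float:
    """Average data rate over the whole operation (assumption: 1 GB = 1024 MB)."""
    return total_gb * 1024 / (hours * 3600)


# Figures from the example: 8 nodes, 128 GB each, 12 to 24 hours.
total_gb = total_memory_gb(8, 128)  # 1024 GB, roughly a terabyte

for hours in (12, 24):
    rate = effective_rate_mb_per_s(total_gb, hours)
    print(f"{total_gb} GB in {hours} h is about {rate:.1f} MB/s on average")
```

Swapping in your own node count and memory size shows how quickly the data volume grows as a deployment scales.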

This is why we decided to make live migration faster in Windows Server 2012 R2.  We are doing this by providing two new options for live migration:

  • Live migration with compression: Here we utilize spare CPU capacity in the host operating system to reduce the amount of data that gets sent as part of the live migration.  In testing this has yielded a 2x to 4x performance improvement without any changes to the virtualization hardware or network configuration.
  • Live migration with RDMA: Here we take advantage of RDMA enabled hardware to deliver amazing performance for live migration, with zero CPU impact.
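As a rough illustration of what a 2x to 4x improvement could mean, the sketch below applies that range to the 12 to 24 hour figure from the earlier example.  This is an assumption on my part for illustration only: it treats the speedup as applying evenly to the entire patch operation, which will not hold exactly in practice, and the function name is hypothetical.

```python
def estimated_hours(baseline_hours: float, speedup: float) -> float:
    """Scale a baseline duration by a speedup factor (assumes a uniform speedup)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


# Baseline range from the 8-node example, and the 2x to 4x range seen in testing.
baseline_hours = (12, 24)
speedups = (2, 4)

best_case = estimated_hours(min(baseline_hours), max(speedups))
worst_case = estimated_hours(max(baseline_hours), min(speedups))

print(f"Estimated range with compression: {best_case:g} to {worst_case:g} hours")
```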

I will be going deep on these two approaches in the next week or two – but in the meantime you can see them demonstrated in this recording of me speaking at TechEd Australia this year:

These features are demonstrated at 26:30 in the video above.

Cheers,
Ben