Faster Live Migration with Compression in Windows Server 2012 R2

Last week I talked about the fact that we had introduced two new technologies for making live migration faster in Windows Server 2012 R2.  Today I would like to dig in deep on how one of these approaches, live migration with compression, actually works.

Live migration with compression is the default option for live migration in Windows Server 2012 R2.  Essentially, we compress memory data before sending it over the network and decompress it on the destination side.  This increases the CPU utilization of a live migration while decreasing its network utilization.
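
If you want to check or change this setting yourself, the Hyper-V PowerShell module exposes it on the host object.  The snippet below is a minimal sketch that assumes you are running it in an elevated session on the Hyper-V host, with the Hyper-V module available:

```powershell
# Show which performance option the local Hyper-V host uses for live migration.
Get-VMHost | Select-Object Name, VirtualMachineMigrationPerformanceOption

# Select compression explicitly (the other accepted values are TCPIP and SMB).
Set-VMHost -VirtualMachineMigrationPerformanceOption Compression
```

Switching the option to TCPIP and back is a simple way to compare migration behavior with and without compression.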

In talking to many users, we have seen that most environments are currently bottlenecked on network connectivity but are still underutilizing their processing capabilities.  As such, we expect that this functionality will have a significant impact in most situations.

That said – life is never that simple.

One of the secondary goals that we had with live migration with compression is that we wanted to be able to safely set it as the default option for live migration.  This meant that we wanted to be confident that using live migration with compression would not accidentally have any adverse effects on the system.

One obvious area for concern is: what happens if CPU resource is not available?  What if the virtual machines are actively using all the processor power I have available?

To handle this concern, we were very careful in the design of live migration with compression.  Throughout the entire process of a live migration we actively monitor the CPU utilization and needs of all virtual machines on a Hyper-V server (even the virtual machines that are not being live migrated).  We then throttle our compression engine so that it only consumes CPU resource that is not being actively used by the rest of the system.  This does mean that in a worst-case scenario, where you attempt to live migrate a virtual machine on a system that is heavily utilizing its CPU, we may decide not to engage our compression engine at all, and you would see no performance benefit.
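
To make the throttling idea concrete, here is a purely illustrative sketch.  It is not how Hyper-V implements this: the function name, the 10% reserve and the simple subtraction are all assumptions of mine, chosen only to show the principle that compression gets whatever CPU nothing else is using, and nothing more.

```powershell
# Hypothetical illustration only: this is NOT Hyper-V's throttling algorithm.
function Get-CompressionCpuBudget {
    param(
        # Percentage of host CPU already in use by virtual machines and the host.
        [ValidateRange(0, 100)][double]$HostCpuBusyPercent,
        # Assumed safety margin that is never handed to the compressor.
        [ValidateRange(0, 100)][double]$ReservePercent = 10
    )
    # Only CPU that nothing else is using is offered to the compressor.
    $idle = 100 - $HostCpuBusyPercent - $ReservePercent
    [Math]::Max(0, $idle)
}

Get-CompressionCpuBudget -HostCpuBusyPercent 30   # lightly loaded host: budget of 60
Get-CompressionCpuBudget -HostCpuBusyPercent 95   # saturated host: budget of 0, so no compression
```

The second call corresponds to the worst-case scenario described above: with no spare CPU, the budget drops to zero and the data would be sent uncompressed.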

The next obvious question is: just how much faster is live migration with compression?  Unfortunately, this is hard to answer.  There are two factors that affect the performance of live migration with compression.  The first is the availability of CPU resource (as I have just discussed); the second is how compressible the memory contents of the virtual machine are.

This second factor is very hard to predict.  A virtual machine may be using a lot of memory, but if the content of that memory is easy to compress, the result is a very fast live migration.  Alternatively, a virtual machine may be using only a portion of its memory, but if that content is hard to compress, the result is only a small performance boost.
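
You can get a feel for this difference without Hyper-V at all.  The sketch below compresses two buffers of the same size, one zero-filled and one random, and reports the compressed size as a fraction of the original.  Note that GZip is only a stand-in here (I am not describing the compression algorithm Hyper-V uses), and the function name and buffer size are arbitrary choices for the example.

```powershell
# Compress a buffer with GZip and return compressed size / original size.
# GZip is a stand-in; it is not claimed to be what Hyper-V uses.
function Get-CompressionRatio {
    param([byte[]]$Data)
    $out  = New-Object System.IO.MemoryStream
    $gzip = New-Object System.IO.Compression.GZipStream($out, [System.IO.Compression.CompressionMode]::Compress)
    $gzip.Write($Data, 0, $Data.Length)
    $gzip.Dispose()                       # flushes the final compressed block
    $out.ToArray().Length / $Data.Length
}

$size  = 16MB
$zeros = New-Object Byte[] $size          # stands in for untouched or zeroed memory
$noise = New-Object Byte[] $size          # stands in for encrypted or already-compressed data
$rng   = [System.Security.Cryptography.RandomNumberGenerator]::Create()
$rng.GetBytes($noise)

'Zero-filled ratio: {0:P2}' -f (Get-CompressionRatio $zeros)
'Random-data ratio: {0:P2}' -f (Get-CompressionRatio $noise)
```

The zero-filled buffer should shrink to a tiny fraction of its size, while the random buffer should not shrink at all.  Real virtual machine memory falls somewhere between these two extremes.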

So to answer this question I can only share some data points from our own testing:

  1. With an idle virtual machine with no workload (best case for live migration with compression) we have seen up to a 6x performance improvement.
  2. With a virtual machine running an active SQL workload we have typically seen a 2x performance improvement.
  3. We have not been able to construct a virtual machine that was slower to migrate with compression enabled than with it disabled.
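
If you would like to collect similar data points in your own environment, one simple approach is to time a migration from PowerShell.  This is a minimal sketch: the virtual machine and host names are hypothetical placeholders, and it assumes live migration is already enabled on both hosts and that the virtual machine's storage is reachable from the destination.

```powershell
# Hypothetical names: replace with a real virtual machine and destination host.
$vmName   = 'TestVM'
$destHost = 'HV-HOST-02'

# Time one live migration using whichever performance option the source host is set to.
$elapsed = Measure-Command {
    Move-VM -Name $vmName -DestinationHost $destHost
}
'Live migration took {0:N1} seconds' -f $elapsed.TotalSeconds
```

Running this once with compression selected and once without, under the same workload, gives a rough comparison.  Expect the numbers to vary with CPU load and memory contents, for the reasons discussed above.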

One note to make here – in order to easily demonstrate live migration with compression in a realistic fashion – I run the following PowerShell snippet inside my virtual machine:

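# Allocate a 1 GB byte array inside the guest.
$memsize = 1GB
$Array = New-Object Byte[] $memsize
# Fill it with cryptographically random bytes, which are effectively incompressible.
$random = [System.Security.Cryptography.RandomNumberGenerator]::Create()
$random.GetBytes($Array)
# Wait for Enter so the process (and its memory) stays alive during the migration.
Read-Host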

This puts 1 GB of hard-to-compress memory inside my virtual machine, and slows down the migration of an otherwise idle workload!

The final piece of information that I have to share about live migration with compression is that we expose a number of performance counters that allow you to understand exactly what is happening in your environment:

[Image: performance counters for live migration with compression]

You can use the counters to see how much data is being compressed, what sort of compression efficiency we are achieving and how much resource we are utilizing for compression.
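
To find these counters from PowerShell, you can list the matching counter sets rather than typing counter paths by hand.  The wildcard below is my own guess at a pattern that will match the relevant set on a Hyper-V host, so treat it as a starting point:

```powershell
# List live-migration-related counter sets and the counter paths each one exposes.
Get-Counter -ListSet '*Live Migration*' |
    Select-Object -ExpandProperty Counter
```

Any path this returns can then be passed to Get-Counter -Counter to sample it while a migration is running.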

Cheers,
Ben