Windows Azure Virtual Machine restarted without any notification


An unexpected restart of an Azure VM commonly leads a customer to open a support incident to determine the cause. The explanation below describes why an Azure VM may have been restarted.

 

Windows Azure updates the host environment approximately once every 2-3 months to keep the environment secure for all applications and virtual machines running on the platform. This update process may restart your VM, causing downtime for the applications and services hosted on it. There is no option or configuration to avoid these host updates. In addition to platform updates, Windows Azure service healing occurs automatically when a problem with a host server is detected: the VMs running on that server are moved to a different host. When this occurs, you lose connectivity to the VM during the service healing process. After service healing completes and you reconnect to the VM, you will likely find an event log entry indicating that the VM restarted (either gracefully or unexpectedly). Because of this, it is important to configure your VMs to handle these situations in order to avoid downtime for your applications and services.
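
To tell a graceful restart from an unexpected one after reconnecting, the System event log can be checked for well-known Windows restart events. The following is only a minimal sketch: the event IDs are general Windows ones (1074, 6006, 6008, and 41), not IDs specific to Azure, and which of them a particular host update or service healing operation produces is not stated in this post. The `EventRecord` type and the `classify_restarts` function are hypothetical names used for illustration, and the records are assumed to have been exported from the System log beforehand.

```python
from dataclasses import dataclass

# General Windows System log event IDs related to shutdowns and restarts.
# Assumption: these are not Azure-specific; they apply to any Windows machine.
RESTART_EVENT_IDS = {
    1074: "graceful",    # User32: a process or user initiated the shutdown/restart
    6006: "graceful",    # EventLog: the Event Log service was stopped (clean shutdown)
    6008: "unexpected",  # EventLog: the previous system shutdown was unexpected
    41: "unexpected",    # Kernel-Power: system rebooted without cleanly shutting down
}


@dataclass
class EventRecord:
    """Hypothetical, simplified view of one exported System log entry."""
    event_id: int
    time_created: str


def classify_restarts(records):
    """Return (time, event_id, kind) for every restart-related record, in input order."""
    results = []
    for record in records:
        kind = RESTART_EVENT_IDS.get(record.event_id)
        if kind is not None:
            results.append((record.time_created, record.event_id, kind))
    return results
```

Records with other event IDs are ignored, so the function can be fed an unfiltered export of the System log.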

 

To ensure high availability of your applications and services hosted in Windows Azure Virtual Machines, we recommend using multiple VMs with availability sets. VMs in the same availability set are placed in different fault domains and update domains so that planned updates or unexpected failures will not impact all the VMs in that availability set. For example, if you have two VMs and configure them to be part of an availability set, only one VM is brought down at a time when a host is being updated. This provides high availability because one VM remains available to serve user requests during the host update process. Mark Russinovich has published a blog post that explains Windows Azure host updates in detail. Managing high availability is detailed here.
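
The effect of update domains can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of how the platform actually places VMs: the round-robin assignment, the update domain count of 5, and the function names are all hypothetical and chosen only to show why a second VM in an availability set keeps the service reachable while a single VM does not.

```python
def assign_update_domains(vm_names, update_domain_count):
    """Spread VMs across update domains round-robin (illustrative assumption)."""
    return {name: index % update_domain_count for index, name in enumerate(vm_names)}


def simulate_host_update(assignment, update_domain_count):
    """Update one domain at a time; return (domain, vms_down, vms_up) per step."""
    steps = []
    for domain in range(update_domain_count):
        down = sorted(vm for vm, d in assignment.items() if d == domain)
        up = sorted(vm for vm, d in assignment.items() if d != domain)
        steps.append((domain, down, up))
    return steps


def min_available(vm_names, update_domain_count=5):
    """Smallest number of VMs still running at any step of the simulated update."""
    assignment = assign_update_domains(vm_names, update_domain_count)
    steps = simulate_host_update(assignment, update_domain_count)
    return min(len(up) for _, _, up in steps)
```

In this model, `min_available(["vm1", "vm2"])` returns 1, because the two VMs land in different update domains and are never down together, while `min_available(["vm1"])` returns 0, which corresponds to the downtime a single-VM workload experiences during a host update.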

 

While availability sets help provide high availability for your VMs, we recognize that proactive notification of planned maintenance is a much-requested feature, particularly to help prepare in a situation where you have a workload that is running on a single VM and is not configured for high availability. While this type of proactive notification of planned maintenance is not currently provided, we encourage you to provide comments on this topic so we can take the feedback to the product teams.

 

 

Windows Azure Virtual Machine SLA 

http://www.microsoft.com/en-us/download/details.aspx?id=38427 

 

Keywords: VM, Restart, Shutdown, Unexpected reboot, Windows Azure

Comments (3)

  1. OlavT says:

    Why don't the VMs get shut down properly before this happens?

  2. Miya says:

    I have many customers who have hit the Windows Azure IaaS VM reboot issue in China, which impacts their business a lot. I know it's a pre-production environment, but it would still be better to get proactive notification. Please raise my concern with the product teams; I am eager to know when my customers can get proactive notification. Thanks.

  3. Rob says:

    How do you guarantee that sufficient time will be provided between the restart of the VMs within an availability set so that the first server will have time to restart fully before the 2nd VM is taken offline?  It's not sufficient just to say that you won't take both VMs off line at the same time unless you provide a sufficient recycle time for the apps that are running on the VMs.