Handling the RoleEnvironment.Stopping event to perform specific actions when a Windows Azure VM is going down for a scheduled update

If you have read my previous article about Windows Azure VM Downtime due to Guest and Host OS update….

 

… I would like to add a little more information on this topic for completeness. By now you know that your Windows Azure VM may well be down for a short while when the Host OS is updated (let’s say roughly once a month) and again when the Guest OS is updated, at much the same frequency. So what else can you do when your Azure VM is going down for a short update?

 

One option is to handle the RoleEnvironment.Stopping event and perform some actions there if you wish. One thing to keep in mind: by the time you receive the RoleEnvironment.Stopping event, your VM has already been taken out of the load balancer.

The code snippet is below:

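 using System;
 using System.Diagnostics;
 using Microsoft.WindowsAzure.ServiceRuntime;

 // The class name WorkerRole is illustrative; use your own role entry point class.
 public class WorkerRole : RoleEntryPoint
 {
     public override bool OnStart()
     {
         // Subscribe to the Stopping event before the role starts
         RoleEnvironment.Stopping += RoleEnvironmentStopping;

         return base.OnStart();
     }

     private void RoleEnvironmentStopping(object sender, RoleEnvironmentStoppingEventArgs e)
     {
         // Add code that is run when the role instance is being stopped
     }

     public override void OnStop()
     {
         try
         {
             // Add code here that runs when the role instance is to be stopped
         }
         catch (Exception e)
         {
             Trace.WriteLine("Exception during OnStop: " + e.ToString());
             // Take other action as needed.
         }
     }
 }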
 
 
 Note: Code running in the OnStop method has a limited time to finish when it is called for reasons other than a user-initiated shutdown. After this time elapses, the process is terminated, so you must make sure that code in the OnStop method can run quickly or tolerates not running to completion.
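One way to make cleanup code tolerate the time limit described in the note is to give it an explicit time budget of its own. The following is only a minimal sketch: BoundedCleanup and TryRun are hypothetical names (they are not part of the Azure SDK), and the budget you pass in is your own choice, not a documented platform value.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Azure SDK.
public static class BoundedCleanup
{
    // Runs the cleanup action on a thread-pool thread and waits at most
    // 'budget' for it to finish. Returns true only if the action completed
    // without throwing inside the budget.
    public static bool TryRun(Action cleanup, TimeSpan budget)
    {
        Task work = Task.Run(cleanup);
        try
        {
            if (work.Wait(budget))
            {
                return true;
            }

            // The action is still running; it is simply abandoned here.
            Trace.WriteLine("Cleanup did not finish within " + budget);
            return false;
        }
        catch (AggregateException ex)
        {
            // Task.Wait wraps any exception thrown by the action.
            Trace.WriteLine("Exception during cleanup: " + ex.InnerException);
            return false;
        }
    }
}
```

Inside OnStop you could then call something like BoundedCleanup.TryRun(FlushBuffers, TimeSpan.FromSeconds(20)), where FlushBuffers is a hypothetical cleanup method of yours and the 20 seconds is an assumed budget. A false result means the work was abandoned, so the cleanup should be written so that running only part of it is safe.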
 

So if you decide to write some cleanup code, then based on the note above you might ask how much time you have to run it in the Stopping event, or why there is a limit at all. The main reasons for the time limit are to keep the Azure VM healthy and to avoid leaving your instance out of the load balancer longer than necessary. Potentially:

  1. The OS may get stuck because your code won’t quit and the OS keeps waiting for it.
  2. The OS could hit a problem during the shutdown.
  3. The guest OS may simply take some time to spin down, much like an ordinary shutdown: the OS flushes the system, and it can take a while to get past the shutdown screen. Keep in mind that this can be normal.

 

So the time period is long enough for the fabric to detect all of these variations; however, this should not be your concern. You can learn more about the RoleEnvironment.Stopping event in the MSDN article below: