Windows Azure Worker Role Showing High CPU


I was recently troubleshooting an issue where a Worker Role was deployed on Windows Azure.The Role had 3 instances and during most of the time all the instances used to run at 99% + CPU. The instances query a database, gets social media posts and send emails. Most of the time all the  instances would peg at 99% CPU and won’t come down, this was observed from the Windows Azure Management Portal. So I did an RDP to the instance to look at the process which was consuming more CPU. We found that WaWorkerHost process which was consuming 90% to 96% of the CPU. Since there were no errors or exceptions logged in I had no choice but to take a memory dump of the running process for analysis.  

With high CPU we generally look for tight loops like a trivial do while or for/foreach. 

From the dump files I found the following .

 0:015> !threadpool

CPU utilization: 100%

Worker Thread: Total: 2 Running: 0 Idle: 2 MaxLimit: 32767 MinLimit: 1

Work Request in Queue: 0

————————————–

Number of Timers: 6

————————————–

Completion Port Thread:Total: 1 Free: 1 MaxFree: 2 CurrentLimit: 1 MaxLimit: 1000 MinLimit: 1 

0:015> !clrstack

OS Thread Id: 0x1e0 (15)

Child                                 SP                          IP                                                                                                                                        Call Site

00000000240ce2a0             000007ff00746a4c    System.DateTime.GetDatePart(Int32)                                                                                    [f:\dd\ndp\clr\src\BCL\System\DateTime.cs @ 752]

00000000240ce2e0             000007ff0074694e    System.TimeZoneInfo.GetDateTimeNowUtcOffsetFromUtc(System.DateTime, BooleanByRef)   [f:\dd\ndp\clr\src\BCL\System\TimeZoneInfo.cs @ 2129]

00000000240ce340             000007ff0074685a    System.DateTime.get_Now()                                                                                                [f:\dd\ndp\clr\src\BCL\System\DateTime.cs @ 927]

00000000240ce3a0             000007ff00d8d568     WorkerRole.WorkerRole.Run()

00000000240ce420             000007ff00d8d3c5     Microsoft.WindowsAzure.ServiceRuntime.RoleEnvironment.StartRoleInternal()

00000000240ce470             000007ff00d8d1c3     Microsoft.WindowsAzure.ServiceRuntime.Implementation.Loader.RoleRuntimeBridge.<StartRole>b__1()

00000000240ce4b0             000007ff0073f12c     System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext,System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\System\Threading\ExecutionContext.cs @ 445]

00000000240ce510             000007ff0073f02b    System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext,System.Threading.ContextCallback, System.Object) [f:\dd\ndp\clr\src\BCL\System\Threading\ExecutionContext.cs @ 411]

00000000240ce560             000007ff0073ef2d    System.Threading.ThreadHelper.ThreadStart() [f:\dd\ndp\clr\src\BCL\System\Threading\Thread.cs @ 94]

00000000240ce9b8             000007fef7cb44c4   [GCFrame: 00000000240ce9b8]

00000000240ceda0             000007fef7cb44c4   [DebuggerU2MCatchHandlerFrame:00000000240ceda0]

00000000240cef38              000007fef7cb44c4   [ContextTransitionFrame:00000000240cef38]

00000000240cf160              000007fef7cb44c4  [DebuggerU2MCatchHandlerFrame:00000000240cf160] 

0:014> !threadpool

CPU utilization: 100%

Worker Thread: Total: 3 Running: 0 Idle: 3 MaxLimit: 32767 MinLimit: 1

Work Request in Queue: 0

————————————–

Number of Timers: 6

————————————–

Completion Port Thread:Total: 1 Free: 1 MaxFree: 2 CurrentLimit:1 MaxLimit: 1000 MinLimit: 1

0:014> !dso

OS Thread Id: 0x1e0 (14)

RSP/REG                  Object                                  Name

rsi                             0000000003097fe8                 Microsoft.CSharp.RuntimeBinder.Semantics.MethPropWithInst

rdi                             0000000002e61038                System.Threading.ContextCallback

00000000240CE420   0000000002e61cb0                 WorkerRole.WorkerRole

00000000240CE470   0000000002e5a3b8                 System.Collections.Generic.Dictionary`2[[System.String,mscorlib],[Microsoft.WindowsAzure.ServiceRuntime.RoleInstanceEndpoint,Microsoft.WindowsAzure.ServiceRuntime]]

00000000240CE4B0  0000000003097f50                  System.IEquatable`1[[Microsoft.CSharp.RuntimeBinder.Semantics.AggregateType,Microsoft.CSharp]]

00000000240CE4B8  0000000003097fe8                  Microsoft.CSharp.RuntimeBinder.Semantics.MethPropWithInst

00000000240CE4C0  0000000003097fe8                  Microsoft.CSharp.RuntimeBinder.Semantics.MethPropWithInst

00000000240CE500   0000000003097fe8                  Microsoft.CSharp.RuntimeBinder.Semantics.MethPropWithInst

00000000240CE510   0000000003097fe8                 Microsoft.CSharp.RuntimeBinder.Semantics.MethPropWithInst

00000000240CE550   0000000003097fe8                 Microsoft.CSharp.RuntimeBinder.Semantics.MethPropWithInst

00000000240CE560  0000000003097fe8                  Microsoft.CSharp.RuntimeBinder.Semantics.MethPropWithInst

00000000240CE950  0000000002f43678                  System.Threading.ThreadStart

0:014> !clrstack

OS Thread Id: 0x1e0 (14)

Child SP                          IP                            Call Site

00000000240ce3a0       000007ff00d8d57c         WorkerRole.WorkerRole.Run()

00000000240ce420       000007ff00d8d3c5         Microsoft.WindowsAzure.ServiceRuntime.RoleEnvironment.StartRoleInternal()

00000000240ce470       000007ff00d8d1c3         Microsoft.WindowsAzure.ServiceRuntime.Implementation.Loader.RoleRuntimeBridge.<StartRole>b__1()

00000000240ce4b0       000007ff0073f12c          System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext,System.Threading.ContextCallback,System.Object, Boolean)[f:\dd\ndp\clr\src\BCL\System\Threading\ExecutionContext.cs @ 445]

00000000240ce510       000007ff0073f02b          System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext,System.Threading.ContextCallback, System.Object)[f:\dd\ndp\clr\src\BCL\System\Threading\ExecutionContext.cs @ 411]

00000000240ce560      000007ff0073ef2d          System.Threading.ThreadHelper.ThreadStart()[f:\dd\ndp\clr\src\BCL\System\Threading\Thread.cs @ 94]

00000000240ce9b8      000007fef7cb44c4         [GCFrame: 00000000240ce9b8]

00000000240ceda0      000007fef7cb44c4         [DebuggerU2MCatchHandlerFrame:
00000000240ceda0]     00000000240cef38        000007fef7cb44c4 [ContextTransitionFrame:00000000240cef38]

00000000240cf160       000007fef7cb44c4        [DebuggerU2MCatchHandlerFrame:00000000240cf160]

 

As it can be seen at the very top of the stack we can see that WorkerRole.Run() method was called. So I dumped out the WorkerRole.dll which is the binary for the custom code. Using any of the widely available reflector tools we can find the code in any .net dll. So I loaded the dll and looked at the code in the Run() method. 

The code on the Run() method was as follows

 public override void Run()

 {         

    Utilities.CreateLogEntry(“WorkerRole is running”, null, LogCategories.Information);

    this.StartProcessors();

    DateTime time = Convert.ToDateTime(“3:00:00 AM”);

      while (true)

           {

               DateTime now = DateTime.Now;

               if ((DateTime.Compare(time,now) == 0) || (DateTime.Compare(time, now) < 0))  This will evaluate to true only between UTC 0:00:01AM till UTC 2:59:59 AM UTC. Rest of the time it will               evaluate to false leading to trivial execution and hence a tight loop.

                {

                   this.UnLoggedUserProcess();

                   time = time.AddDays(1.0);

               }

            }       

}

 

I suggested to modify the code to something as below. 

public override void Run()

        {

           Utilities.CreateLogEntry(“WorkerRole is running”, null, LogCategories.Information);

            this.StartProcessors();

            DateTime time = Convert.ToDateTime(“3:00:00 AM”);

            while (true)

            {

               DateTime now = DateTime.Now;               

              if ((DateTime.Compare(time,now) == 0) || (DateTime.Compare(time, now) < 0))

               
              {

                   this.UnLoggedUserProcess();                   

                   time = time.AddDays(1.0);              

              }

              Thread.Sleep(10000);  The thread will wait for 10 seconds before another evaluation of the if statement

               Utilities.CreateLogEntry(“Working: Waiting for Event”, null, LogCategories.Information);

             }

         }

So the important part is to identify if there are any tight loops due to non trivial work and make sure that we add some delay in the function. With Worker Role Run method its very crucial since it has a While(true), in fact we add this Thread.Sleep in the template of the Worker Role and it’s not advisable to remove it. Hope this will help you in identifying or narrowing down the cause of a high CPU issue with your Worker Roles. 

 

Angshuman Nayak , Cloud Integration Engineer

 

Comments (1)

  1. ant says:

    How did you download the dump file from the remote desktop?

Skip to main content