Troubleshooting Scenario - High CPU usage on PaaS Roles with the same load after a running for a few hours

 

I had this interesting issue reported where the instance count would increase from 10 to 50 over the course of the month. This used to happen with the exact similar load and number of users. This was really perplexing for the customer’s Azure application developers and hence they reported the issue to Microsoft Azure Support.

At the very first analysis we found they used to manually increase the instance count as the existing instances used to hit CPU around 90% and forming a plateau there, and hence becoming more or less unresponsive. So it was relatively simple to isolate the cause of the increase in instance count. The crucial thing was to find the cause of the high CPU. 

High CPU on the instance is generally caused by the application code. So we took memory dump on one of the instance when it gets into high CPU situation. The way we analyze is like the one I did for a situation detailed in this blog https://blogs.msdn.com/b/cie/archive/2013/11/28/windows-azure-worker-role-showing-high-cpu.aspx

Process to collect dumps  

  a) RDP to the instance running the Cloud Service.
b) From task manager check which process is taking most CPU and is staying there without coming down.
c) Right click and collect a Full Crash Dump and repeat this every 1 minute for say 5 minutes. So we will have 5 dump files.
d) Once you have the files you can analyze it or create a ticket with Microsoft for an engineer to help analyze.

In this case it was W3WP process so we collected process dumps on it. In the first 2 memory dumps I didn’t find any high CPU.

Dump Analysis  

I am not delving into the details on how I did it as it merits a separate discussion. In all the dumps I see the CPU moving between 81% to 100%. Most of the calls that are stuck are as following

 SP               IP               Function                                                                                                                                                                                                                                                        Source
 00000011a7ff9008 0000000000000000 HelperMethodFrame                                                                                                                                                                                                                                               
 00000011a7ff9150 00007ff90caf1177 Microsoft.Data.OData.DuplicatePropertyNamesChecker.CheckForDuplicatePropertyNames(Microsoft.Data.OData.ODataProperty)                                                                                                                                           
 00000011a7ff91b0 00007ff90caee6d8 Microsoft.Data.OData.Atom.ODataAtomPropertyAndValueDeserializer.ReadPropertiesImplementation(Microsoft.Data.Edm.IEdmStructuredType, System.Collections.Generic.List`1<Microsoft.Data.OData.ODataProperty>, Microsoft.Data.OData.DuplicatePropertyNamesChecker,  
 00000011a7ff9240 00007ff90caede16 Microsoft.Data.OData.Atom.ODataAtomEntryAndFeedDeserializer.ReadAtomContentElement(Microsoft.Data.OData.Atom.IODataAtomReaderEntryState)                                                                                                                        
 00000011a7ff92c0 00007ff90caec553 Microsoft.Data.OData.Atom.ODataAtomEntryAndFeedDeserializer.ReadAtomElementInEntry(Microsoft.Data.OData.Atom.IODataAtomReaderEntryState)                                                                                                                        
 00000011a7ff9300 00007ff90caec2a2 Microsoft.Data.OData.Atom.ODataAtomEntryAndFeedDeserializer.ReadEntryContent(Microsoft.Data.OData.Atom.IODataAtomReaderEntryState)                                                                                                                              
 00000011a7ff9370 00007ff90cae9665 Microsoft.Data.OData.Atom.ODataAtomReader.ReadEntryStart()                                                                                                                                                                                                      
 00000011a7ff93e0 00007ff90caf2ef6 Microsoft.Data.OData.Atom.ODataAtomReader.ReadAtEntryEndImplementation()                                                                                                                                                                                        
 00000011a7ff9430 00007ff90cae88df Microsoft.Data.OData.ODataReaderCore.ReadImplementation()                                                                                                                                                                                                       
 00000011a7ff9480 00007ff90cae8727 Microsoft.Data.OData.ODataReaderCore.InterceptException[[System.Boolean, mscorlib]](System.Func`1<Boolean>)                                                                                                                                                     
 00000011a7ff94f0 00007ff90cca40a6 Microsoft.WindowsAzure.Storage.Table.Protocol.TableOperationHttpResponseParsers.TableQueryPostProcessGeneric[[System.__Canon, mscorlib]](System.IO.Stream, System.Func`6<System.String,System.String,System.DateTimeOffset,System.Collections.Generic.IDictiona 
 00000011a7ff9580 00007ff90cca3df1 Microsoft.WindowsAzure.Storage.Table.TableQuery`1+<>c__DisplayClassf`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].<QueryImpl>b__e(Microsoft.WindowsAzure.Storage.Core.Executor.RESTCommand`1<Microsoft.WindowsAzure.Stor 
 00000011a7ff95e0 00007ff90caddc69 Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ProcessEndOfRequest[[System.__Canon, mscorlib]](Microsoft.WindowsAzure.Storage.Core.Executor.ExecutionState`1<System.__Canon>)                                                                            
 00000011a7ff9630 00007ff90cad99e5 Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[[System.__Canon, mscorlib]](Microsoft.WindowsAzure.Storage.Core.Executor.StorageCommandBase`1<System.__Canon>, Microsoft.WindowsAzure.Storage.RetryPolicies.IRetryPolicy, Microsoft.WindowsAz 
 00000011a7ff9970 00007ff90cca30ae Microsoft.WindowsAzure.Storage.Table.TableQuery`1+<>c__DisplayClass7[[System.__Canon, mscorlib]].<ExecuteInternal>b__6(Microsoft.WindowsAzure.Storage.IContinuationToken)                                                                                       
 00000011a7ff99d0 00007ff90cca2fd8 Microsoft.WindowsAzure.Storage.Core.Util.CommonUtility+<LazyEnumerable>d__0`1[[System.__Canon, mscorlib]].MoveNext()                                                                                                                                            
 00000011a7ff9a40 00007ff90cca22f8 MoviePlayer.TableStorage.GetListRangeEntity[[System.__Canon, mscorlib]](System.Collections.Generic.List`1<MoviePlayer.QueryTableStorage>, System.String)                                                                                                        
 00000011a7ff9b70 00007ff90cca0612 MoviePlayer.NoSQLData.GetListCategoryTS(MoviePlayer.Video, System.String)                                                                                                                                                                          
 00000011a7ffa020 00007ff90cca3ab8 MoviePlayer.NoSQLData.GetCategoryLatest(MoviePlayer.Video, System.String)                                                                                                                                                                            
 00000011a7ffa090 00007ff90cafda64 RepSyndWebApplication.Player.Default.Page_Load(System.Object, System.EventArgs)                                                                                                                                                                             
 00000011a7ffa640 00007ff962abc0b7 System.Web.UI.Control.LoadRecursive()                                                                                                                                                                                                                           
 00000011a7ffa690 00007ff962adcc4a System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)                                                                                                                                                                                                         
 00000011a7ffa750 00007ff962adbec9 System.Web.UI.Page.ProcessRequest(Boolean, Boolean)                                                                                                                                                                                                             
 00000011a7ffa7c0 00007ff962adbd27 System.Web.UI.Page.ProcessRequest()                                                                                                                                                                                                                             
 00000011a7ffa860 00007ff962ada453 System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)                                                                                                                                                                                                       
 00000011a7ffa8b0 00007ff962ae4b61 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()                                                                                                                                                         
 00000011a7ffa990 00007ff962aabee5 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)                                                                                                                                                                                           
 00000011a7ffaa30 00007ff962ac954a System.Web.HttpApplication+PipelineStepManager.ResumeSteps(System.Exception)                                                                                                                                                                                    
 00000011a7ffab80 00007ff962aac0f3 System.Web.HttpApplication.BeginProcessRequestNotification(System.Web.HttpContext, System.AsyncCallback)                                                                                                                                                        
 00000011a7ffabd0 00007ff962aa613e System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest, System.Web.HttpContext)                                                                                                                                          
 00000011a7ffac70 00007ff962aaefb1 System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                              
 00000011a7ffae80 00007ff962aae9e2 System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                                    
 00000011a7ffaed0 00007ff9632028d1 DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)                                                                                                                                                                                     
 00000011a7ffb6e8 0000000000000000 InlinedCallFrame                                                                                                                                                                                                                                                
 00000011a7ffb6e8 0000000000000000 InlinedCallFrame                                                                                                                                                                                                                                                
 00000011a7ffb6c0 00007ff962b5838b DomainNeutralILStubClass.IL_STUB_PInvoke(IntPtr, System.Web.RequestNotificationStatus ByRef)                                                                                                                                                                    
 00000011a7ffb790 00007ff962aaf19f System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                              
 00000011a7ffb9a0 00007ff962aae9e2 System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                                    
 00000011a7ffb9f0 00007ff9632028d1 DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)                                                                                                                                                                                     
 00000011a7ffda48 0000000000000000 InlinedCallFrame                                                                                                                                                                                                                                                
 00000011a7ffda48 0000000000000000 InlinedCallFrame                                                                                                                                                                                                                                                
 00000011a7ffda20 00007ff962b5838b DomainNeutralILStubClass.IL_STUB_PInvoke(IntPtr, System.Web.RequestNotificationStatus ByRef)                                                                                                                                                                    
 00000011a7ffdaf0 00007ff962aaf19f System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                              
 00000011a7ffdd00 00007ff962aae9e2 System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                                    
 00000011a7ffdd50 00007ff9632028d1 DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)                                                                                                                                                                                     
 00000011a7ffec98 0000000000000000 InlinedCallFrame                                                                                                                                                                                                                                                
 00000011a7ffec98 0000000000000000 InlinedCallFrame                                                                                                                                                                                                                                                
 00000011a7ffec70 00007ff962b5838b DomainNeutralILStubClass.IL_STUB_PInvoke(IntPtr, System.Web.RequestNotificationStatus ByRef)                                                                                                                                                                    
 00000011a7ffed40 00007ff962aaf19f System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                              
 00000011a7ffef50 00007ff962aae9e2 System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)                                                                                                                                                                    
 00000011a7ffefa0 00007ff9632028d1 DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)                                                                                                                                                                                     
 00000011a7fff1a8 0000000000000000 ContextTransitionFrame

The common function in all the call stacks is

MoviePlayer.TableStorage.GetListRangeEntity[[System.__Canon, mscorlib]](System.Collections.Generic.List`1<MoviePlayer.QueryTableStorage>, System.String)

It is executing the following code

 public System.Collections.Generic.List<T> GetListRangeEntity<T>(System.Collections.Generic.List<MoviePlayer.QueryTableStorage> queue, string query)
 {
   string string1 = "";
   if (!string.IsNullOrWhiteSpace(query)) goto lab1;
   if (string.IsNullOrWhiteSpace(query))
   {
     if (que != null)
     {
       System.Collections.Generic.List<MoviePlayer.QueryTableStorage>.Enumerator enumerator1 = queue.GetEnumerator();
       try
       {
         while (enumerator1.MoveNext())
         {
           MoviePlayer.QueryTableStorage storage1 = enumerator1.Current;
           switch (storage1.TypeTS)
           {
             case 0: goto lab2;
             case 1: goto lab3;
             case 2: goto lab4;
             case 3: goto lab5;
             case 4: goto lab6;
             case 5: goto lab7;
             case 6: goto lab8;
             case 7: goto lab9;
           }
           goto lab10;
         lab2:
           string1 = string.Concat(string1, Microsoft.WindowsAzure.Storage.Table.TableQuery.GenerateFilterCondition(storage1.KeyTS, storage1.OperationTS, storage1.ValueTS.ToString()));
           goto lab10;
         lab3:
           
    <SNIPPED>

The collection passed is as below.

0000000f58a9f1f0 System.Collections.Generic.List`1[[MoviePlayer.CategoryList, MoviePlayerDistList]]

The collection is as follows and contains 726 objects

MT Field Offset Type VT Attr Value Name

00007ff96af01250 4000cd1 8 System.Object[] 0 instance 0000000f58b9a7c8 _items

00007ff96af037c8 4000cd2 18 System.Int32 1 instance 726 _size

00007ff96af037c8 4000cd3 1c System.Int32 1 instance 726 _version

00007ff96af011b8 4000cd4 10 System.Object 0 instance 0000000000000000 _syncRoot

Looking at the size of this object.

sizeof(0000000f58b40940) = 137048 (0x21758) bytes (MoviePlayer.CategoryList) 

This object MoviePlayer.CategoryList size is greater than 85,000 bytes. Any object greater than 85,000 bytes will not get allocated in the normal SOH heap but will go to the LOH (Large Object Heap). Details around LOH and GC can be found in the articles

https://msdn.microsoft.com/en-us/library/ee787088.aspx

https://msdn.microsoft.com/en-us/magazine/cc534993.aspx

If I look at the process uptime 1:45:31.000 = 6331 seconds in the first dump. If I look at the number of time GC has run it’s very high for Gen 2. It’s almost like Gen 2 collection is attempted in every two seconds.

.NET CLR Memory

Counter Value

=============== ==============

Bytes in All Heaps 84,495,440

GCHandles 1,514

GEN 0 Collections 55,143

GEN 1 Collections 13,746

GEN 2 Collections 3,463

# Induced GCs 0

# of Pinned Objects 2

Sync Blocks in use 121

Finalization Survivors 0

Total Commited Bytes 502,095,872

Total Reserved Bytes 18,253,578,240

GEN 0 Heap Size 26,423,840

GEN 1 Heap Size 2,989,384

GEN 2 Heap Size 52,896,880

LOH Size 2,185,336

% Time in GC 7.90%

So the Action Plan for this issue was to reduce the size of the object for MoviePlayer.CategoryList. Since most outside developer support engineer roles are not familiar with post mortem analysis the following could be used to find the size of the object using .Net or Visual Studio

https://stackoverflow.com/questions/324053/find-out-the-size-of-a-net-object

 After implementing the suggestions the CPU grows in linear fashion with load and not exponential. The CPU stopped hitting >90% and staying there, so there was no need to spawn additional instances of the role to server users.

Hope this article helps in understanding one of the fundamental causes of frequent GC leading to high CPU. It’s not just Azure but could happen on premises application as well.

Regards,

Angshuman Nayak

Cloud Integration Engineer