Troubleshooting Performance Issues with Cloud Services (PaaS) – Data Collection


Collecting the right data at the right time is the most critical part of troubleshooting any performance issue, especially when the issue is intermittent and you can't reproduce it at will. Many tools are available to collect this data, but you need to know which one to use, when, and how. In this blog post we summarize the what, when, and how of data collection for troubleshooting various performance scenarios. Use the table below to find the kind of issue you are dealing with, and then go to the section for the listed tool to understand when and how.

You are troubleshooting | Data collection when the problem can be reproduced easily | Data collection when the problem is intermittent
Crash | ProcDump manual | Debug Diag automated crash rule
Hang | ProcDump consecutive | Debug Diag automated hang rule
Slow response time | IIS Logs, FREB Logs, ProcDump manual, PerfView manual | IIS Logs, FREB Logs, Debug Diag automated slow response
High CPU | ProcDump manual, PerfView manual | ProcDump automated, PerfView automated
High memory | PerfView manual | Debug Diag automated high memory

You will need one or more of these tools to collect the logs, depending on the scenario:

Tool 1: ProcDump: https://docs.microsoft.com/en-us/sysinternals/downloads/procdump

Tool 2: PerfView: http://www.microsoft.com/en-us/download/details.aspx?id=28567

Tool 3: Debug Diag: https://www.microsoft.com/en-us/download/details.aspx?id=49924

 

ProcDump manually

The following command attaches ProcDump to the process whose memory dump you want to capture:

C:\Tools\procdump> procdump.exe -ma -s 30 -n 3 <PID> <OutputFolder>

-ma: Write a dump file with all process memory. The default dump format only includes thread and handle information.
-s: Consecutive seconds before a dump is written (default is 10). Combined with -n, this is the number of seconds between consecutive dumps. Change this value based on the slowness.
-n: Number of dumps to write before exiting.
<PID>: Replace this with the ID of the process for which you want to capture the dump. You can use Task Manager to get the PID of the process.
<OutputFolder>: Location where the dumps should be stored.
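
If you capture dumps from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of ProcDump itself: the function name build_procdump_command and its default values are hypothetical and simply mirror the flags described above.

```python
def build_procdump_command(pid, output_folder, interval_seconds=30,
                           dump_count=3, procdump_exe="procdump.exe"):
    """Return the argument list for a manual full-memory dump capture.

    Mirrors: procdump.exe -ma -s <seconds> -n <count> <PID> <OutputFolder>
    (hypothetical helper; only the flags themselves come from ProcDump).
    """
    if pid <= 0:
        raise ValueError("pid must be a positive process ID")
    if interval_seconds <= 0 or dump_count <= 0:
        raise ValueError("interval_seconds and dump_count must be positive")
    return [
        procdump_exe,
        "-ma",                        # full process memory
        "-s", str(interval_seconds),  # seconds between consecutive dumps
        "-n", str(dump_count),        # number of dumps before exiting
        str(pid),
        str(output_folder),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) from an elevated prompt on the role instance. Building a list rather than a single string avoids quoting problems when the output folder contains spaces.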

 

ProcDump automated / consecutive

The following command captures 3 memory dumps of the process with ID 5844, each written after the CPU usage of that process stays at or above 70% for 5 consecutive seconds, and saves the dumps to c:\dumps\

C:\Tools\procdump> procdump -ma -c 70 -s 5 -n 3 5844 c:\dumps\

-ma: Write a dump file with all process memory. The default dump format only includes thread and handle information.
-c: CPU threshold at which to create a dump of the process.
-s: Consecutive seconds before a dump is written (default is 10). Change this value based on the slowness.
-n: Number of dumps to write before exiting.
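
To reason about how -c, -s, and -n interact, it can help to model the trigger on a series of CPU readings. The sketch below is a simplified model under stated assumptions, not ProcDump's actual implementation: it assumes one CPU reading per second, treats "at or above the threshold" as a hit, and resets the counter after each dump. The function name dump_trigger_times is hypothetical.

```python
def dump_trigger_times(cpu_samples, threshold=70, consecutive_seconds=5,
                       max_dumps=3):
    """Return the 1-based second at which each dump would be triggered.

    cpu_samples: one CPU percentage reading per second (assumption).
    A dump is triggered once the reading has stayed at or above
    `threshold` for `consecutive_seconds` readings in a row.
    """
    triggers = []
    streak = 0
    for second, cpu in enumerate(cpu_samples, start=1):
        # Extend the streak on a hit, reset it on any reading below threshold.
        streak = streak + 1 if cpu >= threshold else 0
        if streak == consecutive_seconds:
            triggers.append(second)
            streak = 0  # start counting again for the next dump
            if len(triggers) == max_dumps:
                break   # equivalent of -n: stop after this many dumps
    return triggers
```

Under this model, a short CPU spike that lasts less than the -s value never produces a dump, which is why the -s value should be tuned to how long the slowness typically lasts.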

 

IIS Logs

Location for collecting IIS logs:
C:\Resources\Directory\{DeploymentID}.{Rolename}.DiagnosticStore\LogFiles\Web
Collect the log files that cover the time of the issue.
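
Once the relevant log files are collected, a quick first pass is to list the requests that were slow during the incident window. The following is a minimal sketch, assuming the logs are in the W3C format with a "#Fields:" header and that the time-taken field (in milliseconds) is enabled. The function name slow_requests and the 5000 ms default are hypothetical. Note that IIS writes W3C timestamps in UTC.

```python
from datetime import datetime


def slow_requests(log_lines, start, end, min_time_taken_ms=5000):
    """Return (timestamp, uri, time_taken_ms) for slow requests in a window.

    log_lines: iterable of lines from a W3C-format IIS log.
    start, end: datetime bounds (UTC, matching the log timestamps).
    """
    fields = []
    results = []
    for raw in log_lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the entries that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        entry = dict(zip(fields, line.split()))
        try:
            ts = datetime.strptime(entry["date"] + " " + entry["time"],
                                   "%Y-%m-%d %H:%M:%S")
            time_taken = int(entry["time-taken"])
        except (KeyError, ValueError):
            continue  # skip entries without the fields this sketch needs
        if start <= ts <= end and time_taken >= min_time_taken_ms:
            results.append((ts, entry.get("cs-uri-stem", "-"), time_taken))
    return results
```

For example, calling slow_requests(open(path), start, end) on each collected file shows which URLs were slow and when, which helps decide where to focus the dump or trace analysis.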

 

FREB Logs

Location for collecting FREB (Failed Request Tracing) logs:
C:\Resources\Directory\{DeploymentID}.{Rolename}.DiagnosticStore\FailedReqLogFiles

NOTE: FREB is not turned on by default in Windows Azure and is not frequently used. However, if you are troubleshooting IIS/ASP.NET-specific issues, consider turning on FREB tracing to get additional details. See: How to enable Failed Request Tracing

 

PerfView manually

  • Open the PerfView tool, go to the Collect menu, and click the Collect option.
  • Select the Zip, Merge, and Thread Time check boxes, and click Start Collection.
  • Let it run for a few seconds, then stop the collection. (It takes a minute or so to create the compressed file.)

 

PerfView automated

The following command captures 5 PerfView logs of up to 1000 MB each, starting when w3wp.exe reaches 60% CPU:

C:\Tools\perfview> PerfView /NoGui collect "/StartOnPerfCounter=Process:% Processor Time:w3wp>60" -ThreadTime -CircularMB:1000 -CollectMultiple:5 -AcceptEULA
 

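-ThreadTime: Collects thread time events, so the trace shows where threads spend both CPU time and blocked time.
-CircularMB: Size of the circular buffer in MB, which caps the size of each log.
-CollectMultiple: Number of logs to collect.
-AcceptEULA: Automatically accepts the Microsoft Software License Terms.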
 
Example:

If w3wp.exe is using 80% to 90% CPU during the issue, set this threshold value to 80.
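
The rule of thumb above, taking the lower end of the CPU range seen during the issue, can be expressed as a small helper that builds the trigger qualifier. This is a sketch only: the function name start_trigger is hypothetical, and it simply formats the /StartOnPerfCounter string in the same form as the command shown earlier.

```python
import math


def start_trigger(observed_cpu_percentages, process_name="w3wp"):
    """Build the /StartOnPerfCounter qualifier from observed CPU readings.

    The threshold is the lowest reading seen during the issue, rounded
    down, so that collection starts as soon as the problem begins.
    """
    if not observed_cpu_percentages:
        raise ValueError("at least one CPU reading is required")
    threshold = math.floor(min(observed_cpu_percentages))
    return ("/StartOnPerfCounter=Process:% Processor Time:"
            f"{process_name}>{threshold}")
```

For readings of 80, 85, and 90 this produces a trigger with a threshold of 80, matching the example above. Remember to wrap the qualifier in double quotes on the command line because it contains spaces.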

 

OK, I have captured the data. What do I do with it?

The following guides can help you do some initial analysis of the memory dumps and PerfView logs. You can also reach out to Microsoft Azure Support for additional help:

How to analyze a memory dump using Debug Diag

Process Crash analysis from a memory dump

How to analyze a Memory Leak with a memory dump using WinDbg

How to use PerfView to diagnose Memory Leak

Channel9 Tutorial on Memory investigation using PerfView

Channel9 Tutorial on CPU Performance investigation using PerfView

