Capturing dumps for intermittent issues that happen only on certain instances can be very challenging in Azure App Service. Depending on the scenario, you can capture dumps using Auto-heal: define triggers in the root web.config file of your web site and configure actions that invoke procdump when those triggers are hit. With this approach, if your web app runs on multiple instances, memory dumps are generated only for the instance that hit the trigger, not for all instances. Here is an example to capture dumps using FREB customAction* fields. That approach carries a 5-10% performance hit and requires you to enable FREB. Trigger-based dump collection works best when your scenario fits the available triggers, which are based on attributes including: count, timeInterval, statusCode, subStatusCode, win32StatusCode, and privateBytesInKB.
What if your w3wp.exe crashes on a different kind of trigger, such as an exception code, for example a StackOverflow exception (0xC00000FD) or an AccessViolationException (0xC0000005)? In this scenario, you can use the Crash Diagnoser site extension; however, it only monitors the instance that the current Kudu site is running on (which is a random instance).
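The exception codes above are 32-bit values, and the sample params.txt lines later in this post pass them to procdump without the 0x prefix (for example C00000FD). As a minimal sketch, the helper below formats a code that way and labels the two codes mentioned here with their standard NTSTATUS names. The function names and the lookup table are illustrative only; they are not part of procdump or procdumphelper.

```python
# Exception codes named in the text, with their standard NTSTATUS names.
KNOWN_CODES = {
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_filter_token(code: int) -> str:
    """Format a 32-bit exception code as eight uppercase hex digits, no 0x prefix."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError(f"not a 32-bit exception code: {code!r}")
    return f"{code:08X}"


def describe(code: int) -> str:
    """Return the filter token plus a readable name (hypothetical helper)."""
    return f"{to_filter_token(code)} ({KNOWN_CODES.get(code, 'unknown')})"


if __name__ == "__main__":
    for code in KNOWN_CODES:
        print(describe(code))
```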
This is where procdumphelper can be used. It runs as a continuous WebJob on all the instances of your web app and acts as a proxy for procdump.exe, automatically attaching to the right non-SCM instance of w3wp.exe. Below are the steps to install and configure the procdumphelper WebJob.
Step 1: Create a storage account. Once the storage account is created, select it and go to the "Access keys" blade. Copy the connection string (key1).
[Fig 1: Image showing Application settings with AzureWebJobsDashboard and AzureWebJobsStorage key-value pair]
Step 2: Go to your web app's 'Application settings' blade and add the two entries below (AzureWebJobsDashboard and AzureWebJobsStorage), each set to the connection string copied in Step 1.
[Fig 2: Image showing Application settings with AzureWebJobsDashboard and AzureWebJobsStorage key-value pair]
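A typo in either entry is easy to miss, so a quick sanity check before moving on can help. The sketch below assumes the usual storage connection string layout of semicolon-separated key=value pairs (including AccountName and AccountKey) and checks that both settings are present, well formed, and identical. The `settings` dictionary and the function names are hypothetical, and the values shown are placeholders rather than real credentials.

```python
REQUIRED_KEYS = {"AccountName", "AccountKey"}
SETTING_NAMES = ("AzureWebJobsDashboard", "AzureWebJobsStorage")


def parse_connection_string(value: str) -> dict:
    """Split 'Key=Value;Key=Value' into a dict; values may themselves contain '='."""
    parts = {}
    for segment in value.strip().split(";"):
        if not segment:
            continue
        key, sep, val = segment.partition("=")  # split on the first '=' only
        if not sep:
            raise ValueError(f"malformed segment: {segment!r}")
        parts[key.strip()] = val
    return parts


def check_webjob_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for name in SETTING_NAMES:
        value = settings.get(name)
        if not value:
            problems.append(f"{name} is missing")
            continue
        missing = REQUIRED_KEYS - parse_connection_string(value).keys()
        if missing:
            problems.append(f"{name} lacks {', '.join(sorted(missing))}")
    if not problems and settings[SETTING_NAMES[0]] != settings[SETTING_NAMES[1]]:
        problems.append("the two settings hold different connection strings")
    return problems


if __name__ == "__main__":
    placeholder = ("DefaultEndpointsProtocol=https;AccountName=examplestorage;"
                   "AccountKey=ZmFrZWtleQ==;EndpointSuffix=core.windows.net")
    print(check_webjob_settings({name: placeholder for name in SETTING_NAMES}))
```

An empty list means both entries look consistent; anything else names the setting to revisit in the 'Application settings' blade.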
Step 3: In the 'Application settings' blade, turn ON the 'Always On' option.
[Fig 3: Image showing Application settings with 'Always On' set to 'On']
Step 4: Open the Kudu Debug console of your web app and go to "D:\home\site\wwwroot\app_data\jobs\continuous". Create these folders if they are not already present.
Step 5: Download procdumphelper.zip to a local drive and drag-drop the zip file to "D:\home\site\wwwroot\app_data\jobs\continuous". Once done, you should have "D:\home\site\wwwroot\app_data\jobs\continuous\procdumphelper" with all the contents of the WebJob.
[Fig 4: Animated gif showing how to drag-drop procdumphelper.zip into "D:\home\site\wwwroot\app_data\jobs\continuous\".]
Step 6: Create the "D:\home\data\procdumphelper" folder and add a "params.txt" file to it, as shown below. The contents of params.txt follow the special capture usage below. Notice the special field "$ProcID": it is automatically replaced with the correct process ID of the w3wp process associated with your web app instance.
Capture Usage: w3wp -accepteula -ma $ProcID [procdumpparams]
1. Create a full user hang dump for w3wp application process.
w3wp -accepteula -ma $ProcID D:\home\Logfiles\
2. Create a full user dump when the w3wp application process hits a first-chance stack overflow exception (0xC00000FD).
w3wp -accepteula -ma $ProcID -g -e 1 -f C00000FD D:\home\Logfiles\
3. Create a full user hang dump for dotnet application process.
dotnet -accepteula -ma $ProcID D:\home\Logfiles\
4. Create a full user hang dump for custom dotnet application process.
customdotnetapp -accepteula -ma $ProcID D:\home\Logfiles\
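To make the "$ProcID" substitution concrete, here is a minimal sketch of how a params.txt line could be turned into a procdump argument list. It illustrates the behavior described in Step 6 and is not the actual procdumphelper implementation: the function name, the whitespace-based splitting (which would break on paths containing spaces), and the process ID in the usage example are all assumptions.

```python
PLACEHOLDER = "$ProcID"


def expand_params(line: str, pid: int) -> tuple:
    """Split a params.txt line into (process_name, procdump_args).

    The first token names the target process (w3wp, dotnet, ...); every
    remaining token is passed through, with $ProcID replaced by the PID.
    """
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected '<process> <procdump arguments>'")
    name, args = tokens[0], tokens[1:]
    if PLACEHOLDER not in args:
        raise ValueError(f"line does not contain {PLACEHOLDER}")
    return name, [str(pid) if token == PLACEHOLDER else token for token in args]


if __name__ == "__main__":
    # Example 2 from the list above, with a made-up process ID.
    line = "w3wp -accepteula -ma $ProcID -g -e 1 -f C00000FD D:\\home\\Logfiles\\"
    name, args = expand_params(line, 4321)
    print(name, args)
```

Keeping the process name as the first token mirrors the capture usage line: it tells the helper which process to look up, while everything after it is ordinary procdump syntax.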
[Fig 5: Animated gif showing how to create "D:\home\data\procdumphelper" folder with "params.txt" file.]
Step 7: Reproduce the issue. Once the dump is available, download it.
[Fig 6: Image showing how to download the dump.]
Once the dumps have been captured successfully, you can delete the WebJob and its associated storage account, and turn OFF 'Always On' if you no longer need it.