Windows Azure worker role crashing

Problem

We have seen worker roles running inside Windows Azure crash, especially under heavy load, when IntelliTrace is enabled.  To determine whether this is causing your worker role to crash, first look in the Application event log on the instance for entries like this:

 Application: WaWorkerHost.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.Runtime.CallbackException
Stack:
   at System.Runtime.Fx+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
   at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
 Log Name:      Application
Source:        .NET Runtime
Date:          10/4/2011 3:16:01 PM
Event ID:      1026
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      RD00155D3203F9
Description:
Application: WaWorkerHost.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.Runtime.CallbackException
Stack:
   at System.Runtime.Fx+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
   at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name=".NET Runtime" />
    <EventID Qualifiers="0">1026</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2011-10-04T15:16:01.000Z" />
    <EventRecordID>426</EventRecordID>
    <Channel>Application</Channel>
    <Computer>RD00155D3203F9</Computer>
    <Security />
  </System>
  <EventData>
    <Data>Application: WaWorkerHost.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.Runtime.CallbackException
Stack:
   at System.Runtime.Fx+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
   at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
</Data>
  </EventData>
</Event>

Event logs like these do not prove that IntelliTrace is causing the problem, but if you have IntelliTrace enabled, it is the likely cause.
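
If you have exported the event as XML, the check for this signature can be automated. The following is a minimal Python sketch, not part of the original investigation: the function name is_suspect_crash is hypothetical, and it assumes Python 3.8 or later (for the {*} namespace wildcard) and an event exported in the same XML shape as the one shown above.

```python
import xml.etree.ElementTree as ET

def is_suspect_crash(event_xml: str) -> bool:
    """Return True if one exported <Event> matches the crash signature above.

    Hypothetical helper: it only checks the event ID, the process name and
    the exception/stack text. A match does not prove IntelliTrace is at fault.
    """
    root = ET.fromstring(event_xml)
    # "{*}" matches the element in any XML namespace (Python 3.8+).
    event_id = root.findtext("{*}System/{*}EventID", default="")
    data = root.findtext("{*}EventData/{*}Data", default="")
    return (
        event_id.strip() == "1026"
        and "WaWorkerHost.exe" in data
        and "System.Runtime.CallbackException" in data
        and "IOCompletionThunk.UnhandledExceptionFrame" in data
    )
```

A True result only means the event has the same shape as the one above; the dump analysis is still needed to confirm the cause.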

To confirm that IntelliTrace is the problem, you need to capture a dump of the worker role process (WaWorkerHost.exe) by connecting remotely to the instance and attaching a debugger to the process.  When you open the dump in WinDbg or another dump analysis tool, look for the following exception:

 Exception object: 0000000004acefc8
Exception type:   System.Runtime.CallbackException
Message:          Async Callback threw an exception.
InnerException:   System.InvalidProgramException, Use !PrintException 0000000004ad8f68 to see more.
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80131501
0:018> !pe 0000000004ad8f68 
Exception object: 0000000004ad8f68
Exception type:   System.InvalidProgramException
Message:          Common Language Runtime detected an invalid program.
InnerException:   <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 8013153a
====================================================================
0:106> !pe 00000000044ec790
Exception object: 00000000044ec790
Exception type:   System.InvalidProgramException
Message:          Common Language Runtime detected an invalid program.
InnerException:   <none>
StackTrace (generated):
    SP               IP               Function
    000000002CF1E600 0000000000000000 System_ServiceModel!System.ServiceModel.Dispatcher.ErrorBehavior.HandleErrorCommon(System.Exception, System.ServiceModel.Dispatcher.ErrorHandlerFaultInfo ByRef)+0x1
    000000002CF1E6D0 000007FF00F6B837 System_ServiceModel!System.ServiceModel.Dispatcher.ChannelDispatcher.HandleError(System.Exception, System.ServiceModel.Dispatcher.ErrorHandlerFaultInfo ByRef)+0xa7
    000000002CF1E730 000007FF00F6B650 System_ServiceModel!System.ServiceModel.Dispatcher.ChannelHandler.HandleError(System.Exception)+0x30
    000000002CF1E780 000007FF00E76AB2 System_ServiceModel!System.ServiceModel.Dispatcher.ChannelHandler.OpenAndEnsurePump()+0xc2
    000000002CF1E7E0 000007FF00E58184 System_Runtime_DurableInstancing!System.Runtime.IOThreadScheduler+ScheduledOverlapped.IOCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)+0xf4
    000000002CF1E860 000007FF00E58035 System_Runtime_DurableInstancing!System.Runtime.Fx+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)+0x115
    000000002CF1E8C0 000007FF009DC35B mscorlib!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)+0x9b

StackTraceString: <none>

Note: this output was produced with the psscor4 debugger extension by running !dumpallexceptions and then !printexception on the inner exception.
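
If you save the debugger output to a text file, the pattern to look for (a System.Runtime.CallbackException whose inner exception is a System.InvalidProgramException) can be checked with a small script. This is a rough Python sketch under that assumption; the function name matches_dump_signature is hypothetical, and it relies only on the "Exception type:" and "InnerException:" lines as they appear in the output above.

```python
def matches_dump_signature(debugger_output: str) -> bool:
    """Return True if the text shows a CallbackException that wraps an
    InvalidProgramException, as in the debugger output above."""
    current_type = None
    for line in debugger_output.splitlines():
        # Split "Key:   value" on the first colon only.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Exception type":
            current_type = value
        elif (
            key == "InnerException"
            and current_type == "System.Runtime.CallbackException"
            and value.startswith("System.InvalidProgramException")
        ):
            return True
    return False
```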

Solution

To resolve this, redeploy your package to Azure with IntelliTrace disabled.

Note

This problem was first posted on the MSDN forums.