Whidbey Remoting AccessViolation Problem [Maheshwar Jayaraman]

There have been some cases where users reported an AccessViolation after upgrading their Remoting apps to Whidbey. Some users found that the problem reproduced only when they had anti-virus software (Nod32 in particular) installed, and that the access violation went away when they configured the anti-virus not to scan the problematic exes. We managed to get a repro in house, and the stack of the repro was:

Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.Net.UnsafeNclNativeMethods.OSSOCK.WSAGetOverlappedResult(SafeCloseSocket socketHandle, IntPtr overlapped, UInt32& bytesTransferred, Boolean wait, IntPtr ignored)
   at System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
   at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)

Here is the code in BaseOverlappedAsyncResult.CompletionPortCallback() that invokes the native function WSAGetOverlappedResult.

bool success = UnsafeNclNativeMethods.OSSOCK.WSAGetOverlappedResult(
socket.SafeHandle,
(IntPtr)nativeOverlapped,
out numBytes,
false,
IntPtr.Zero);

IntPtr is a struct, and in Whidbey the implementation was changed such that IntPtr.Zero is a struct whose underlying void pointer has no memory allocated behind it. This means that if someone tries to dereference the value held by IntPtr.Zero, it will either throw a NullReferenceException in managed code or cause an access violation in unmanaged code.
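
To make the point about IntPtr.Zero concrete, here is a minimal sketch showing that it simply carries the address 0, which is what a native callee sees as NULL. The class name IntPtrZeroSketch and the helper IsNull are made up for this illustration and are not part of the framework.

```csharp
using System;

// Hypothetical illustration class; not part of System.Net or the CLR.
static class IntPtrZeroSketch
{
    // Returns true when the given IntPtr carries the null address.
    internal static bool IsNull(IntPtr p)
    {
        return p == IntPtr.Zero;
    }

    // Returns the raw address stored in IntPtr.Zero, which is 0.
    internal static long ZeroAddress()
    {
        return IntPtr.Zero.ToInt64();
    }
}
```

Since the wrapped address is 0, handing IntPtr.Zero to a native function that writes through that parameter means the native code writes through a null pointer.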

Here is what the MSDN documentation has to say about the WSAGetOverlappedResult function.

BOOL WSAAPI WSAGetOverlappedResult(
  SOCKET s,
  LPWSAOVERLAPPED lpOverlapped,
  LPDWORD lpcbTransfer,
  BOOL fWait,
  LPDWORD lpdwFlags);
Parameters
  • s

    [in] A descriptor identifying the socket. This is the same socket that was specified when the overlapped operation was started by a call to WSARecv, WSARecvFrom, WSASend, WSASendTo, or WSAIoctl.

  • lpOverlapped

    [in] A pointer to a WSAOVERLAPPED structure that was specified when the overlapped operation was started. This parameter must not be a NULL pointer.

  • lpcbTransfer

    [out] A pointer to a 32-bit variable that receives the number of bytes that were actually transferred by a send or receive operation, or by WSAIoctl. This parameter must not be a NULL pointer.

  • fWait

    [in] A flag that specifies whether the function should wait for the pending overlapped operation to complete. If TRUE, the function does not return until the operation has been completed. If FALSE and the operation is still pending, the function returns FALSE and the WSAGetLastError function returns WSA_IO_INCOMPLETE. The fWait parameter may be set to TRUE only if the overlapped operation selected the event-based completion notification.

  • lpdwFlags

    [out] A pointer to a 32-bit variable that will receive one or more flags that supplement the completion status. If the overlapped operation was initiated through WSARecv or WSARecvFrom, this parameter will contain the results value for lpFlags parameter. This parameter must not be a NULL pointer.

It turns out that lpdwFlags cannot be a null pointer, which means NCL was passing the wrong argument: the IntPtr.Zero in the last position of the call above is marshaled to the native function as NULL. So the fix is to pass a valid pointer to the unmanaged API.
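
One way to guarantee a valid pointer from managed code is to declare the last parameter as out uint instead of IntPtr, so the marshaler passes the address of a real 32-bit variable. The following is only a sketch of that idea and not the actual change shipped in the QFE. The names WinsockSketch and TryGetResult are hypothetical, and the socket is taken as a plain IntPtr for simplicity, whereas System.Net's own declaration uses SafeCloseSocket.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper class for illustration; not the real UnsafeNclNativeMethods.
static class WinsockSketch
{
    // lpdwFlags is declared as "out uint", so the marshaler hands the native
    // function the address of a real 32-bit variable rather than NULL.
    [DllImport("ws2_32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool WSAGetOverlappedResult(
        IntPtr socketHandle,                      // SOCKET s (simplified to IntPtr here)
        IntPtr overlapped,                        // LPWSAOVERLAPPED, must not be NULL
        out uint bytesTransferred,                // LPDWORD lpcbTransfer
        [MarshalAs(UnmanagedType.Bool)] bool wait, // BOOL fWait
        out uint flags);                          // LPDWORD lpdwFlags, must not be NULL

    // Convenience helper: never waits, and always supplies storage for flags.
    internal static bool TryGetResult(
        IntPtr socketHandle,
        IntPtr overlapped,
        out uint bytesTransferred,
        out uint flags)
    {
        return WSAGetOverlappedResult(
            socketHandle, overlapped, out bytesTransferred, false, out flags);
    }
}
```

A caller would then write something like WinsockSketch.TryGetResult(handle, (IntPtr)nativeOverlapped, out numBytes, out flags) and can ignore flags if it has no use for them. The native function still gets a valid location to write to.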

 

How to get the QFE:

We have issued a QFE for this, and since it hasn't been through the requirements for a public KB yet, it is only available by calling Microsoft product support. Don't worry, this should be a free call. To get this QFE, call Microsoft Product Support Services (support.microsoft.com) and ask for the hotfix for KB number 923028, or mention that you need the fix for "Visual Studio Update QFE 4333" (that's the internal bug # for this).

Update:  I have a detailed post on how this issue was debugged over here.