The (new) Trouble with select and LSPs...

If you’ve ever had to deal with Winsock LSPs, then you probably know that handling select function calls is rather convoluted, especially with certain apps that pass multiple socket types into select. Some applications (like IE) will pass a UDP/IPv4 and a TCP/IPv4 socket in a single select call. This works fine unless you have a non-IFS LSP installed only over UDP/IPv4 or TCP/IPv4 (but not both), in which case IE may lose network connectivity. This is because the order of the socket handles matters. That is, Winsock will find the first socket in the first non-empty FD_SET and call the provider that owns that handle to service the select call. If the first handle is a UDP/IPv4 socket from a base provider and the second handle is a TCP/IPv4 socket from an LSP, the base provider (BSP) will attempt to look up the LSP handle, which will fail with WSAENOTSOCK. Because of this, Microsoft has always recommended that non-IFS LSPs layer over all entries for a particular address family (e.g. layer over all the IPv4 entries).
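To make the routing rule concrete, here is a small model of it in plain C. It is only a sketch of the behavior described above, not Winsock code: the types `model_socket` and `model_fd_set` and both function names are made up for illustration, and the only case it flags is the one from the paragraph above (a base provider is picked to service the call while an LSP handle is also present).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; none of these are Winsock definitions. */
typedef enum { OWNER_BASE, OWNER_LSP } owner_kind;

typedef struct {
    int        handle;  /* illustrative handle value */
    owner_kind owner;   /* which kind of provider created the handle */
} model_socket;

typedef struct {
    const model_socket *socks;
    size_t              count;
} model_fd_set;

/* First socket of the first non-empty set, or NULL if every set is empty. */
const model_socket *pick_routing_socket(const model_fd_set *sets, size_t nsets)
{
    assert(sets != NULL || nsets == 0);
    for (size_t i = 0; i < nsets; i++) {
        if (sets[i].count > 0) {
            return &sets[i].socks[0];
        }
    }
    return NULL;
}

/* Returns 1 when the call is routed to a base provider but some set also
   holds an LSP handle (the WSAENOTSOCK case described above), else 0. */
int base_provider_gets_lsp_handle(const model_fd_set *sets, size_t nsets)
{
    const model_socket *first = pick_routing_socket(sets, nsets);
    if (first == NULL || first->owner != OWNER_BASE) {
        return 0;
    }
    for (size_t i = 0; i < nsets; i++) {
        for (size_t j = 0; j < sets[i].count; j++) {
            if (sets[i].socks[j].owner == OWNER_LSP) {
                return 1;
            }
        }
    }
    return 0;
}
```

Passing the read, write, and except sets in that order mirrors the "first non-empty FD_SET" wording; what happens when an LSP handle comes first is deliberately left out of the model.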

 

With Vista, things have gotten interesting again. Many system services (like DNS) have been updated to support both IPv4 and IPv6, which means there are applications passing in a mix of IPv4 and IPv6 sockets in a single select call. So, the same problem exists now if your LSP layers only over IPv4 entries but not over IPv6 entries. To prevent Vista from breaking (e.g. DNS name resolution not working) when an IPv4-only LSP is installed, a change was made to select to bypass LSPs entirely if a non-IFS LSP is installed that is layered over just IPv4 (or only over IPv6). Note that only the LSP’s WSPSelect routine is bypassed; all other calls will still be routed to the LSP. Also, Winsock will only enter the “bypass select” mode if it detects there is an LSP installed over only one of the two required protocols (IPv4 or IPv6). If all non-IFS LSPs are installed over both IPv4 and IPv6, then those LSPs will still intercept the select call.
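The condition for entering "bypass select" mode can be written down as a simple predicate. The sketch below is my reading of the rule as stated in this post, not the actual ws2_32 logic; `model_lsp` and `select_bypasses_lsps` are hypothetical names, and how Winsock really inspects the catalog is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one installed LSP (not a Winsock structure). */
typedef struct {
    int is_ifs;     /* nonzero for an IFS LSP */
    int over_ipv4;  /* nonzero if layered over the IPv4 entries */
    int over_ipv6;  /* nonzero if layered over the IPv6 entries */
} model_lsp;

/* Returns 1 if select should bypass LSPs: some non-IFS LSP is layered over
   exactly one of IPv4 and IPv6. Returns 0 otherwise. */
int select_bypasses_lsps(const model_lsp *lsps, size_t count)
{
    assert(lsps != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        int v4 = lsps[i].over_ipv4 != 0;
        int v6 = lsps[i].over_ipv6 != 0;
        if (!lsps[i].is_ifs && v4 != v6) {
            return 1;
        }
    }
    return 0;
}
```

Note that a single lopsided non-IFS LSP is enough to switch the mode on in this model, which matches the statement that LSPs keep intercepting select only when all of them cover both protocols.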

 

Most LSPs do not need to intercept select unless they are injecting data into the application’s stream, in which case they need to inject read events to notify select-based applications that data is present. LSPs that do need to intercept select calls will have to be modified to do so. The method by which an LSP’s WSPSelect is bypassed is as follows:

  1. The ws2_32 select call translates each application socket to a base provider (BSP) handle by calling the ioctl SIO_BSP_HANDLE_SELECT (described below) for each socket handle
  2. Once all FD_SETs have been translated, select routes the call to the provider that owns the first socket handle in the first non-empty FD_SET, which will be a BSP. Since all socket handles are from base providers, all handles are known to be valid
  3. Once the BSP returns, the handles it returned are mapped back to the application handles
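The translate-then-map-back part of these steps can be sketched with a lookup table. This is an illustrative model only: `handle_pair`, `direction`, and `translate_handles` are hypothetical names, the table stands in for whatever SIO_BSP_HANDLE_SELECT returns for each socket, and the base provider's own select (step 2) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pairing of an application handle with the handle the
   ioctl would report for it (identity if the app holds a base handle). */
typedef struct {
    int app_handle;
    int bsp_handle;
} handle_pair;

typedef enum { TO_BSP, TO_APP } direction;

/* Rewrites handles[] in place using the table. Returns 0 on success, or -1
   at the first handle with no table entry (earlier entries stay rewritten). */
int translate_handles(const handle_pair *table, size_t ntable,
                      int *handles, size_t count, direction dir)
{
    assert(table != NULL && handles != NULL);
    for (size_t i = 0; i < count; i++) {
        int found = 0;
        for (size_t j = 0; j < ntable && !found; j++) {
            int from = (dir == TO_BSP) ? table[j].app_handle : table[j].bsp_handle;
            int to   = (dir == TO_BSP) ? table[j].bsp_handle : table[j].app_handle;
            if (handles[i] == from) {
                handles[i] = to;  /* step 1 (TO_BSP) or step 3 (TO_APP) */
                found = 1;
            }
        }
        if (!found) {
            return -1;
        }
    }
    return 0;
}
```

A caller would run the sets through `TO_BSP`, hand them to the base provider, and then run whatever comes back through `TO_APP`. An LSP that answers the ioctl with its own handle shows up here as an identity entry in the table.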

For an LSP to intercept select calls, it needs to handle the SIO_BSP_HANDLE_SELECT ioctl in its WSPIoctl function, which is where WSAIoctl calls are routed. This ioctl passes the application handle as the socket parameter and expects the “base” handle to be returned in the output parameter. For an LSP to successfully intercept select calls on Vista it must do the following:

  1. The LSP must be installed either over all BSP entries or be installed over all IPv4 and IPv6 entries
  2. The LSP must intercept the SIO_BSP_HANDLE_SELECT ioctl call and return the same socket handle value that was passed as the socket parameter (i.e. the LSP’s socket handle that it returned to the caller)

The following code snippet illustrates the basic idea:

int WSPAPI
WSPIoctl(
    SOCKET s,
    DWORD dwIoControlCode,
    LPVOID lpvInBuffer,
    DWORD cbInBuffer,
    LPVOID lpvOutBuffer,
    DWORD cbOutBuffer,
    LPDWORD lpcbBytesReturned,
    LPWSAOVERLAPPED lpOverlapped,
    LPWSAOVERLAPPED_COMPLETION_ROUTINE lpCompletionRoutine,
    LPWSATHREADID lpThreadId,
    LPINT lpErrno
    )
{
    if (SIO_BSP_HANDLE_SELECT == dwIoControlCode)
    {
        // The output buffer must be present and large enough for a SOCKET
        if (NULL == lpvOutBuffer || cbOutBuffer < sizeof(SOCKET))
        {
            *lpErrno = WSAEFAULT;
            return SOCKET_ERROR;
        }

        // Return the LSP's own handle (the one handed to the application)
        // so that select keeps being routed to this LSP
        *(SOCKET *)lpvOutBuffer = s;
        *lpcbBytesReturned = sizeof(s);
        return NO_ERROR;
    }

    // … Rest of your WSPIoctl
}

Finally, it’s important to point out that having your LSP intercept select does not guarantee that the problem described here will never be encountered. Under certain LSP layering configurations it’s always possible that the first handle belongs to a BSP, or to an LSP positioned lower in the LSP stack, that is unable to translate a handle from a higher-layer LSP. So, unless you absolutely have to intercept select calls, don’t.

Note that this change was made after Beta 2.

--Anthony Jones (AJones)