Team Foundation Server: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.

Updated 4/21/2010: The Windows patch to solve this problem has been released. It can be found here.

Recently, while dogfooding our internal pioneer server, a few members of our team started to notice this error message when downloading files from TFS:

C:\dev\code\somefile.txt: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.

Luckily, we were able to reproduce the issue consistently enough to investigate it.  After a few fun days of client-side TFS logging, server-side TFS logging, TFS activity logging, Fiddler tracing, Netmon tracing, Procmon tracing and finally http.sys tracing, we determined that we were being affected by an http.sys bug that was introduced in Windows Server 2008 R2.

Are you experiencing the same issue that we were?

Let’s find out.  First of all, this bug is new to Windows Server 2008 R2.  If you are running any version of Windows Server 2003, Windows Server 2008 R1 RTM or Windows Server 2008 R1 SP1, then you are most likely not hitting the same bug that we were.  The issue tended to occur more often when the file being downloaded was relatively large (> 2 MB), and the rate of occurrence seemed to grow with the size of the file.  Finally, we were only able to reproduce the issue over slower networks (3 to 4 kb/sec file transfer rate).

Great news! The bug has been fixed!

This bug has been fixed by the Windows team and they have released a QFE for it.  You can find the QFE here.  You will need to install it on all of your application tiers (ATs).

The following lists the initial workarounds that we were able to come up with before this bug was fixed.  I am going to leave them up here for reference but the real solution should be to install the QFE mentioned above.

--

If you are still hitting the issue after setting up the proxy or you are unable to set up a proxy then you are not out of luck just yet.  First, a little more information about why this issue is happening.  Http.sys is the HTTP protocol stack that IIS uses to perform HTTP communication with clients.  It has a timer, controlled by the minBytesPerSecond setting, that is responsible for killing a connection if its transfer rate drops below a threshold expressed in bytes per second.  By default, that threshold is set to 240 bytes/sec.  It turns out that there is a bug with this timer that causes connections to be killed prematurely.  We have found that lowering this threshold reduces the number of connections that are killed by the server.
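To make the intent of the threshold concrete, here is a minimal sketch of the kind of check a minimum-throughput timer performs.  It is only an illustration, not how http.sys is actually implemented: the function name and parameters are made up for this example, and it assumes the threshold is measured in bytes per second, as the setting name minBytesPerSecond suggests.

```python
def below_min_rate(bytes_sent, elapsed_seconds, min_bytes_per_second=240):
    """Return True if the average transfer rate is under the threshold.

    Hypothetical helper for illustration only; http.sys does not expose
    anything like this, and its real timer logic is not shown here.
    """
    if elapsed_seconds <= 0:
        # No time has elapsed yet, so there is no rate to judge.
        return False
    return (bytes_sent / elapsed_seconds) < min_bytes_per_second


# A connection moving roughly 3 KB every second is well above 240 bytes/sec,
# so a correctly working timer should leave it alone.
slow_but_healthy = below_min_rate(bytes_sent=3 * 1024, elapsed_seconds=1)

# A connection that has moved only 100 bytes in a second is under the threshold.
stalled = below_min_rate(bytes_sent=100, elapsed_seconds=1)
```

Under this reading, the slow downloads described earlier should not have been candidates for termination at all, which is consistent with the behavior being a bug rather than the timer doing its job.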

To lower the threshold, follow these steps:

  1. Open the IIS Manager
  2. In the Connections pane, make sure the name of your AT is selected.
  3. In the middle pane (titled “<MachineName> Home”), make sure you are in the “Features View” (bottom) and scroll down to the Management section.
  4. Double-click the “Configuration Editor” icon.
  5. The middle pane should now have the title “Configuration Editor”.  In the Section pull-down near the top, expand system.applicationHost and select “webLimits”.
  6. You should now see a list of property/value pairs, one of which is named “minBytesPerSecond”.  Its value is most likely 240.  You will want to lower this value for the workaround.

In general, the lower the threshold is, the lower the chances are that you hit the issue above.  However, you may not want to lower the setting too much because then connections that should be killed by this timer will not be appropriately cleaned up.  Play around with values to see what works best for you.
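If you have several ATs to update, the same setting can be changed from a script instead of through IIS Manager.  The sketch below builds the appcmd.exe command line for the webLimits section; the helper name, the default Windows directory, and the example value of 100 are assumptions for illustration, not recommendations.  Actually running the command requires an elevated prompt on the AT.

```python
def build_appcmd_args(min_bytes_per_second, windir=r"C:\Windows"):
    """Build the appcmd.exe argument list that sets minBytesPerSecond.

    Hypothetical helper: 'windir' is assumed to be the default Windows
    directory; adjust it if your server is installed elsewhere.
    """
    if not isinstance(min_bytes_per_second, int) or min_bytes_per_second < 0:
        raise ValueError("min_bytes_per_second must be a non-negative integer")

    appcmd = windir + r"\System32\inetsrv\appcmd.exe"
    return [
        appcmd,
        "set",
        "config",
        # Same section that the Configuration Editor steps select.
        "-section:system.applicationHost/webLimits",
        f"/minBytesPerSecond:{min_bytes_per_second}",
        # Write the change to applicationHost.config (server level).
        "/commit:apphost",
    ]


# Example value only; pick a threshold that works for your environment.
args = build_appcmd_args(100)

# To apply the change on the AT itself (elevated prompt required):
#   import subprocess
#   subprocess.run(args, check=True)
```

Building the argument list separately from running it makes it easy to print and review the exact command before applying it to a production server.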

If you have any other questions, be sure to let us know.