What’s wrong with this code, part 5


So there was a reason behind yesterday’s post about TransmitFile: it was a lead-in for “What’s wrong with this code, part 5”.

Consider the following rather naïve implementation of TransmitFile (clearly this isn’t the real implementation; it’s just something I cooked up on the spot):

bool TransmitTcpBuffer(SOCKET socket, BYTE *buffer, DWORD bufferSize)
{
      DWORD bytesWritten = 0;
      do
      {
            DWORD bytesThisWrite = 0;
            if (!WriteFile((HANDLE)socket, &buffer[bytesWritten], bufferSize - bytesWritten, &bytesThisWrite, NULL))
            {
                  return false;
            }
            bytesWritten += bytesThisWrite;
      } while (bytesWritten < bufferSize);
      return true;
}

bool MyTransmitFile(SOCKET socket, HANDLE fileHandle, BYTE *bufferBefore, DWORD bufferBeforeSize, BYTE *bufferAfter, DWORD bufferAfterSize)
{
      DWORD bytesRead;
      BYTE  fileBuffer[4096];

      if (!TransmitTcpBuffer(socket, bufferBefore, bufferBeforeSize))
      {
            return false;
      }

      do
      {
            if (!ReadFile(fileHandle, (LPVOID)fileBuffer, sizeof(fileBuffer), &bytesRead, NULL))
            {
                  return false;
            }
            if (!TransmitTcpBuffer(socket, fileBuffer, bytesRead))
            {
                  return false;
            }

      } while (bytesRead == sizeof(fileBuffer));

      if (!TransmitTcpBuffer(socket, bufferAfter, bufferAfterSize))
      {
            return false;
      }
      return true;
}

Nothing particularly exciting, but it’s got a big bug in it.  Assume that the file in question is opened for sequential I/O, that the file pointer is positioned correctly, and that the socket is open and bound before the API is called.  The API doesn’t close the socket or file handle on failure; it’s the responsibility of the caller to do so (closing the handles would be a layering violation).  The code relies on the fact that on Win32 (and *nix) a socket is just a relabeled file handle.
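
For illustration, here’s a minimal sketch of how a caller might use it under those assumptions; the file name and the header/trailer data are placeholders, and the socket setup (WSAStartup, connect) is elided:

SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
// ... connect the socket to the remote peer here; the post assumes it's already set up ...

HANDLE file = CreateFileA("payload.dat", GENERIC_READ, FILE_SHARE_READ, NULL,
                          OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);

char header[]  = "example-header\r\n";   // placeholder "before" buffer
char trailer[] = "example-trailer\r\n";  // placeholder "after" buffer

if (!MyTransmitFile(s, file, (BYTE *)header, sizeof(header) - 1, (BYTE *)trailer, sizeof(trailer) - 1))
{
      // MyTransmitFile doesn't close anything on failure; that's the caller's job.
}

CloseHandle(file);
closesocket(s);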

As usual, answers and kudos tomorrow.

Comments (52)

  1. Nicholas Allen says:

    Can ReadFile hand you back less than sizeof(fileBuffer) bytes even if it’s not at the end?

  2. Nicholas: Good thought, but not if it’s reading from a file.

    But that’s a good point: Assume that the file handle’s connected to a file, not a named pipe.

    This one’s REALLY subtle, and a hint may be needed.

  3. Well, I’m not a big fan of the do while() in TransmitTcpBuffer; if bufferSize is zero, a WriteFile is still performed. Not specifically against the rules according to the doco for WriteFile, but still a bit niggling.

    WriteFile should wait until it has transmitted the entire contents of the buffer before returning, even if it can’t be transmitted in one lump, as written (that is, if I recall the semantics for blocking IO on WriteFile calls correctly – they’re not spelled out in the docs). So the code in TransmitTcpBuffer may be overkill, but should work fine.

    One API design bug might be that MyTransmitFile doesn’t specify how much of the file should be transmitted; at the moment it transmits everything until it hits an EOF condition.

    Other than that, I’m stuck for now.

  4. Nicholas Allen says:

    There’s no general contract for what WriteFile will do on a null write but sockets specifically define a behavior.

  5. Simon, You’re right, MyTransmitFile should take a length, but that would have complicated the code, and introduced the potential for unintended bugs (I’m wary of unintended bugs after the first “what’s wrong with this code” fiasco (http://weblogs.asp.net/larryosterman/archive/2004/04/27/121299.aspx)).

    As I said in my first comment, this one is subtle (but it’s a surprisingly common problem).

  6. Niclas Lindgren says:

    The value of

    DWORD bytesThisWrite = 0;

    Goes out of scope on every iteration, and is thus reset; hence a very long loop in the case of a write where the socket couldn’t do a complete write.

    Would this be the one you are looking for?

  7. Niclas Lindgren says:

    Bad me, I read the code wrong.. The names are too similar =) That was not the bug

  8. So, didn’t Niclas get it right? If the file is exactly a multiple of 4096 bytes in length, a WriteFile of 0 bytes will be performed. This triggers a shutdown at the TCP level, such that the following write of bufferAfter will fail.

    I can’t see anything else wrong anyway :).

  9. Anonymous says:

    1. There is no validation for null pointers.

    2. The socket may be put in a non-blocking state; I don’t know what ReadFile and WriteFile will do in that case. Handle the WSAEWOULDBLOCK error!

    That’s all I could find.

  10. Stephane, does it? I thought that TCP swallowed a 0 byte write.

    "I can’t see anything else wrong anyway :)."

    You’re on the right track Stephane. I was serious about saying this was subtle, but surprisingly common. In fact, I was called in earlier this week to look at a version of this problem.

    Anonymous: Good catch! And that case should have been checked. That’s the first unintentional bug.

  11. Niclas Lindgren says:

    Found my glasses and will try to read the code properly this time….

    Are we talking blocking or non blocking operations on the TCP socket?

    The code implies non-blocking, but if so, we are missing a lot of useful code, so I presume it is blocking mode?

  12. Niclas, assume blocking. As you mentioned, non blocking means there needs to be a lot more code.

    Also assume that there’s someone on the other end who’s reading the data (so the fact that blocking sockets can hang if the guy on the other end hangs isn’t a problem).

  13. anon says:

    Didn’t you mention in your previous post one has to lock the socket to prevent other threads from writing to it before the transmit is complete?

  14. That’s the real TransmitFile, not my toy implementation. Good thought though :)

  15. anon says:

    One more try – network byte order vs. host byte order?

  16. Norman Diamond says:

    Regarding the base note, we should assume that the socket is open and bound, but I don’t see any assumption (except in the code) as to whether the socket is opened for overlapped I/O or not.

    Regarding zero-byte writes to a socket, MSDN documentation for the send() function says that a send() of zero bytes will work (whether or not the protocol is TCP) but I can’t find any documentation as to whether a write() of zero bytes will work or not. MSDN documentation for the write() command explicitly sends the reader on a wild-goose chase to documents on the underlying file system.

    (Back to the base note, if sockets were simply relabelled file handles the same as in Unix, then it would be possible to call select() on a combination of a socket and stdin in order to respond to whichever event comes first. In fact in Windows we have to do more complicated programming to respond most quickly to either keyboard or socket input. The code in this question relies on sockets being usable as file handles, but not because they’re relabels of the same thing.)

  17. Norman Diamond says:

    Hmm, 8/4/2004 5:17 PM Larry Osterman posted while I was reading MSDN and writing my previous note. I guess blocking means we assume that the socket’s creator made it non-overlapped.

    Hmm, and in my previous note I wrote write() when I meant WriteFile(). Anyway it isn’t quite obvious if a WriteFile() of zero bytes will be sent as reliably as a send() of zero bytes.

  18. Norman: A large part of the reason that select() doesn’t work that way is because stdin isn’t a file handle in Windows.

    Also the use isn’t symmetric – you can use a socket in all the Windows file APIs (not directory APIs but file APIs), but you can’t use a file handle in all the socket APIs (send() doesn’t make sense on a file handle).

    Sockets are always opened for overlapped I/O, btw. I’m 99% certain that a 0 byte WriteFile() is the same as a 0 byte send().

  19. Too bad I am not up on my C++ APIs. I have not done MFC C++ in a long time… Need to read my books again.

    I keep picking through it and think I find little things, but I am not sure… Don’t want to show how rusty I am.. ;-)

    Does it have something to do with the number of bytes being sent and the bounds checking for the file? I think the line [ while (bytesRead == sizeof(fileBuffer)); ] is the key. If the file being read is a multiple of sizeof(fileBuffer), won’t the function return failure always? Cause if bytesRead is the last set and it is a multiple of sizeof(fileBuffer), it will go around again and of course the ReadFile function will fail and return false?

    I have a feeling that it is somewhere in there, but my eyes are just not seeing the C# in it… :-D

  20. anon: Interesting point – network byte order vs host byte order’s a different problem (I think that was "What’s wrong with this code, part 2").

  21. Jazzy, on the next iteration bytesRead will be 0, which will break out of the loop. It WILL write a 0 byte buffer, as Simon pointed out in his comment.

  22. Norman Diamond says:

    8/4/2004 5:43 PM Larry Osterman:

    > Sockets are always opened for overlapped

    > I/O, btw.

    Oh, then it seems to me that WriteFile() will have big problems. Maybe only rarely, but big when they occur.

    A few minutes ago I read in MSDN that overlapped is the default but I thought that meant the program could change it. I never had to worry about it because I used send() and its friends rather than WriteFile().

  23. Actually the WriteFile call should be ok as long as there are no other threads doing I/O on the socket.

  24. Ed Kaim says:

    I’m quite rusty with Win32, so this guess is based on the assumption that sockets being opened for overlapped IO is functionally equivalent to a file being opened with an OVERLAPPED structure. According to the docs for WriteFile (where hFile refers to the file HANDLE and lpOverlapped refers to the OVERLAPPED structure):

    "If hFile was opened with FILE_FLAG_OVERLAPPED and lpOverlapped is NULL, the function can incorrectly report that the write operation is complete."

    This implies that it’s possible for MyTransmitFile to return a false positive before the transfer has actually completed.

  25. That’s right, if the file is being written to by other threads.

    So that’s another good call Norman/Ed.
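
    For completeness, here’s a minimal sketch of the explicit-OVERLAPPED pattern Ed is quoting the docs about, written against the same socket/buffer names used in TransmitTcpBuffer above (error handling trimmed):

    OVERLAPPED overlapped = { 0 };
    overlapped.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);   // manual-reset event for this I/O

    DWORD bytesWritten = 0;
    if (!WriteFile((HANDLE)socket, buffer, bufferSize, &bytesWritten, &overlapped))
    {
          if (GetLastError() == ERROR_IO_PENDING)
          {
                // Block until this particular write completes.
                GetOverlappedResult((HANDLE)socket, &overlapped, &bytesWritten, TRUE);
          }
    }
    CloseHandle(overlapped.hEvent);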

  26. Girish says:

    I am not sure about this,

    1. ReadFile() docs state that "When a synchronous read operation reaches the end of a file, ReadFile returns TRUE and sets *lpNumberOfBytesRead to zero." Umm, doesn’t that mean that the last read data will be lost?

    I wonder.

  27. Norman Diamond says:

    The wording of the MSDN page on ReadFile says the same thing to me as it does to Girish. I hope that the wording is wrong. I was going to say what I hoped it intended, but it’s not exactly trivial.

    Furthermore, as for whether the return value is really equal to TRUE or whether the return value is simply non-zero, and whether programmers should know whether to check for TRUE or whether to check for non-zero, where have I read some astute blogs recently…

  28. stan says:

    When it calls WriteFile with a 0 byte buffer (i.e. triggered by a file that’s 4096 bytes or a multiple of 4096, as suggested by others above), would that somehow clobber the socket? (therefore messing up the bufferAfter transmit)

    I’m rusty, but my guess is that a 0 byte write would be interpreted by the socket as an error or disconnect.

  29. Norman, Girish:

    What that means is that the next time you call ReadFile after reaching the end of the file, you’ll get a zero bytes read count, with a no-error result code returned.

    The previous time, you would have received a number-of-bytes-read which is <= the size of your buffer.

  30. Just a hunch… although this might be only important for multithreaded cases…

    Shouldn’t you duplicate the file handle before operating on it? That way it won’t disappear from underneath you while you’re waiting for the operation to complete.

    I get the feeling that this is yet another edge case though.

    Here’s another problem (maybe)… it won’t work on Win95A. :) (IIRC, that version of Winsock didn’t support using socket handles for file operations).

    Aha.. I think I found it.

    If the remote end closes the socket (or the connection is lost), WriteFile will return an EOF condition – that is, it will succeed, but return a write size of 0 bytes. At which point, the do…while in TransmitTcpBuffer becomes an infinite loop.

  31. Thank you Simon, that’s exactly what I would have written.

    Stan, the 0 byte write will be ignored (as far as I know).

    This is turning out to be tougher than I’d expected. Here’s a huge hint: As far as I know, the code is correct (modulo the three issues above (checking for null pointers, the extra 0 write, and the fact that I forgot to stipulate that it was a single threaded app)). There’s still something wrong with it, but the code as written is correct (in other words, it will transmit all the data from the file onto the wire).
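
    For what it’s worth, a minimal sketch of how TransmitTcpBuffer could be hardened against the zero-byte cases mentioned so far (the empty buffer and a zero-byte WriteFile result), still assuming a blocking socket:

    bool TransmitTcpBuffer(SOCKET socket, BYTE *buffer, DWORD bufferSize)
    {
          DWORD bytesWritten = 0;
          while (bytesWritten < bufferSize)   // also skips the 0-byte write for an empty buffer
          {
                DWORD bytesThisWrite = 0;
                if (!WriteFile((HANDLE)socket, &buffer[bytesWritten], bufferSize - bytesWritten, &bytesThisWrite, NULL))
                {
                      return false;
                }
                if (bytesThisWrite == 0)
                {
                      return false;   // the remote side closed the connection; don't spin forever
                }
                bytesWritten += bytesThisWrite;
          }
          return true;
    }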

  32. anon says:

    "Buffer addresses for read and write operations must be sector aligned"

  33. Anon E. Mouse says:

    EOF

  34. Mike Dimmick says:

    anon, AFAIK that only applies to non-buffered disk I/O operations, where you’ve opened a file with the FILE_FLAG_NO_BUFFERING flag. I think socket operations are all buffered?

    Other than that, I can’t see it. Unless perhaps the file handle was opened FILE_FLAG_OVERLAPPED? MSDN says "[w]hen you specify FILE_FLAG_OVERLAPPED, the file read and write functions must specify an OVERLAPPED structure. That is, when FILE_FLAG_OVERLAPPED is specified, an application must perform overlapped reading and writing."

  35. Girish says:

    Is there a possibility of a buffer overrun in the ReadFile/TransmitTcpBuffer loop?

  36. Jeff says:

    The only thing I can think of is it is horribly slow. Why only read 4096 bytes? Why not read in the whole file and let the socket layer handle sending it. Unless of course you have memory constraints.

  37. Jeff, you’re 100% right – the problem IS a performance problem, but you didn’t quite hit on the reason for the performance problem.

    It’s not the fact that the buffers are small.

    But it IS related to the fact that the buffer is 4096 bytes.

  38. Ohhh.. I was looking for a real bug, not a perf bug.

    A couple then…

    1. You shouldn’t allocate large data buffers on the stack. Everyone does it, but it still doesn’t make it right.

    2. For highest perf, the buffer for file/socket operations should be aligned on a page boundary. Therefore, it makes more sense to use VirtualAlloc to allocate it.

    3. Again, for highest perf for file reads, the buffer should be the size of a page.

    4. For highest perf for file writes, the outbound buffer should probably be the size of the MTU for the TCP connection. (although this may or may not be a good optimization).

    The problem with all of these? I’d want to try them, then measure them.

  39. Jeff says:

    Well, thinking out loud, possible things that have to do with 4k are: page size; I believe in NTFS the cache manager purges in 64k chunks so it might be best to read 64k; TCP packets are 1518 bytes so you aren’t going to get a nice split on the wire. You are also going to be causing a lot of work on the other end, since the MS TCP implementation will set the PSH bit on the last packet of each send, which causes NT to signal the data to the reader a lot more often than it would if you sent one large send. That’s all I can think of off the top of my head.

  40. Nope, not related to page size. Next hint, before I post the answer: The problem would occur with any buffer between 2921 bytes and 4379 bytes in size. It would also happen with a buffer that’s between 5841 and 7299 bytes in size.

    I don’t think I can give any stronger hints without just posting today’s post.

  41. Jeff: You’re SOOO close. It’s not a question of work, it’s an architectural issue with TCP.

    Oh, and I believe that the TCP payload is 1460 bytes not 1518 (I think 1518’s for UDP). 1568’s the max frame size for ethernet so…

  42. Nicholas Allen says:

    Ah, splitting into an odd number of packets hurts if you’ve got Nagle enabled.

  43. … which is why for non-interactive TCP applications, I always turn OFF nagling, and jam as much data as possible into the pipe before hitting Send.

    The Nagle algorithm was really designed for telnet. That’s all it’s really for. For high loads of data, you should always turn it off and just rely on the window sizing algorithm to perform congestion control for you.

    That, and only send data in LARGE chunks.

    The other thing to increase perf is to re-fill your buffer from the file every time you complete a successful WriteFile, so that you don’t run out of data.

    (I have a couple of algorithms which are very useful for this kind of thing – one of them is outlined here:

    http://www.codeproject.com/internet/bipbuffer.asp

    )
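
    For reference, a minimal sketch of turning Nagle off on a connected Winsock socket (the SOCKET s here is assumed to already be connected):

    BOOL noDelay = TRUE;
    if (setsockopt(s, IPPROTO_TCP, TCP_NODELAY, (const char *)&noDelay, sizeof(noDelay)) == SOCKET_ERROR)
    {
          // Check WSAGetLastError() to see why it failed.
    }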

  44. Norman Diamond says:

    8/4/2004 11:14 PM Simon Cooke [exMSFT]:

    > What that means

    No. What MSDN’s wording means is different from the following. If MSDN’s wording would be changed to the following (and if it is correct) then all will be well. Please remember that the meaning of RTFM is that TFM is suitable for R’ing not for F’ing.

    > the next time you call ReadFile after

    > reaching the end of the file, you’ll get a

    > zero bytes read count, with a no-error

    > result code returned.

    The present MSDN wording says that this result occurs if eof is reached during the present operation. Of course your definition is preferable.

    > The previous time, you would have received a

    > number-of-bytes-read which is <= the size of

    > your buffer.

    The present MSDN wording says that the previous time you would have received the same result as this time, instead of what you say. Of course your definition is preferable.

    By the way, console input can send an eof notification. Is your definition still accurate in this case?

    In Unix, sockets can also send eof notifications, but from what I can find in MSDN it does not happen in Windows. When a co-worker wanted to detect a lost connection between a Unix server and a Windows client and have both programs recover gracefully, we could not find any way to do it. She had to document that the application would not perform as expected if the network connection went down.

  45. Unix and Windows sockets both send EOF notifications the same way:

    a recv on the socket returns a length in bytes of 0.

    a send on the socket returns a length in bytes of 0.

    When either of these cases occurs, the socket has been closed from the remote side.

    That’s standard TCP, and it’s the only way to detect the end-of-connection case. If you want better notification in the case of a broken connection, you need to send your own heartbeat data and check for that heartbeat not being received at either end within a certain timeout period.

    As for the ReadFile documentation, the MSDN documentation is correct. It will ONLY return 0 for the number of bytes read when the read attempt starts with the file pointer at the end of the file.

    If you’re at the beginning of a 200 byte long file and you ask for 500 bytes, you’ll get 200 bytes of data, and the number of bytes read counter will return a value of 200.

    Your next ReadFile call will return 0. EOF is not an error condition for synchronous operations.

    In the case of console input, ReadFile will block until the console stream is closed (at which point, you’ll read out of the buffer until you start getting 0’s back for the length copied).

    The present MSDN wording says specifically this:

    "If the return value is nonzero and the number of bytes read is zero, the file pointer was beyond the current end of the file at the time of the read operation."

    Which may be the information you’re missing.

    Don’t get me wrong; I’m not trying to bash you for bugging them on documentation… but if you really want to make a difference, please, aim for the Uniscribe API – it’s in desperate need of more real world samples and MUCH better documentation.
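
    To make the remote-close case above concrete, a minimal sketch (sock is assumed to be a connected, blocking TCP socket):

    char buffer[4096];
    int  received = recv(sock, buffer, sizeof(buffer), 0);
    if (received == 0)
    {
          // Graceful close: the peer sent a FIN and will send no more data.
    }
    else if (received == SOCKET_ERROR)
    {
          // Hard failure (e.g. a reset); check WSAGetLastError().
    }
    else
    {
          // 'received' bytes of data are now in 'buffer'.
    }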

  46. Niclas Lindgren says:

    "In Unix, sockets can also send eof notifications, but from what I can find in MSDN it does not happen in Windows. When a co-worker wanted to detect a lost connection between a Unix server and a Windows client and have both programs recover gracefully, we could not find any way to do it. She had to document that the application would not perform as expected if the network connection went down."

    Could you elaborate on that? If the remote TCP connection closes (and not through a crash, so the FIN is sent; or an RST in the case of a crash, if the OS supports it), then Windows will supply a 0 result in recv, which means that the connection was shut down on the remote side.

    Either way, I have no trouble detecting TCP connection losses on either Unix or Windows; most things actually work more smoothly in Windows.

    ———-

    The Ethernet MTU (payload) size is 1500, although I believe the MTU is of little interest; it should be the MSS value that matters, which isn’t defined except that it is at least 576. If path MTU discovery is used it can be anything between 576 and 1460 (1500 minus the 40-byte IP/TCP headers) on Ethernet…

    I wasn’t looking for performance issues, but adding to those we also have the kernel-user-kernel space copying of data, which sort of doesn’t speed things up =).

    Given that the socket hasn’t disabled buffering, it fills the output buffer (the TCP kernel buffer, plus the one pending Winsock request, which should all in all be at least 3 packets of 4096 bytes (4 with the one that will block); the socket buffer is normally 8192), and then starts pumping packet by packet. Nagle isn’t a bad thing, it is actually a good thing, and it was indeed meant for Telnet but not for message based data, as the whole idea of TCP is to stream.

    So Nagle should normally only be considered for turning off if you are running a (small) message based protocol over TCP, or an interactive application where responsiveness is important (X Server, VNC Server, etc). (If you always stay above the MSS, and you have bidirectional communication (req/resp), you will be fine; Nagle isn’t the problem, the synchronized serialized scheme is, and normally you will hit the RTT as the limiting factor, and soon you will figure out a parallel way is the way to go.) On Windows however you have to worry less (see the end).

    For chunk transfers like these it should be fine; however, it is indeed an interesting point you have brought up!

    If you fill the buffers, the TCP client should always have a packet to send (it has collected more than the MSS at any given time); the remote side will always get a packet while its ACK/Nagle timer is waiting to expire, and thus will fire it. Nagle might in fact speed your transfer up if you are on a link with a long RTT, since it prevents the tiny packets, so disabling it might in fact give you worse results (given that you aren’t able to sustain a full TCP buffer at all times).

    However, this implementation should work as well. I might be wrong though, I haven’t tried this one out specifically, but since reading from a FILE is normally tenfold faster than the network, and since Windows returns immediately if there is room in the socket buffer (or the one pending Winsock request), we should quickly fill up the TCP buffers, thereby virtually disabling Nagle, and hit the window flow control instead. At least after the first roundtrip of data (not on Windows though, since they have tweaked Nagle a bit so they always send out data if an MSS amount of data or more is available). The problem you might get into is the delayed ACK problem, not the Nagle problem, but again not on Windows.

    Also, I believe that TransmitTcpBuffer will always either complete a full 4096-byte write or block, so that loop shouldn’t happen.

    Now I believe it should actually be better to read 64k or so from the file and pump it to the socket in TCP-buffer-sized chunks (8K normally); this should give you 3*8K of buffering as we block, and always an MSS worth of data available. Although I believe it is better to stay below the socket buffer size for better granularity (in short you will keep the TCP buffer “more” full).

    Or… I could be totally wrong and I will soon be heavily shot down by someone :=)

    Nagle was meant to remove _small_ packets off the network, to make them bigger, not to hinder already big packets, since that is what it wants.

    So actually, if you want a responsive telnet-like connection, you should turn off Nagle; however, normally the RTT is the problem, and then you might be worse off with Nagle off. If however you are running with big chunks you should be fine, and turning off Nagle might actually make things worse for you in your huge file transfer, as a lot of “small” packets might enter the network given that you don’t pump data into the TCP buffer in large chunks, whereas Nagle will save you.

    However, for large bulk transfers a more resource-efficient way to do things might be to set the TCP send buffer to 0 and use overlapped I/O with more than one outstanding write request (the old double buffering technique); even then it might not be much better than the standard way, since that already gives you double buffering.

    One very fun way of seeing Nagle in action is, for instance, to run MSMQ with small messages in a req/resp lock step manner. You will get about 5 req/s, but if you increase the payload of all the messages above 1380 bytes (I believe it was), on a local LAN with MSS = MTU, you will instead hit either a CPU-bound barrier if your machine is slow, or a network-bound barrier if your network is slow.

    There is of course a knob in the registry to turn this behavior off in MSMQ (to turn Nagle off).

    The reason that MSMQ gets this behavior, I believe, is because the MSMQ queue manager creates one connection in each direction (in private queue mode), and messages from one machine to the other are sent on one socket while the messages from the other machine back to the first one go on the second socket.

    By using two sockets for unidirectional communication you will hit Nagle hard; if instead a single socket had been used, you would only be hit by the RTT, which is what you expect from a lock step way of communicating.

    So in short I disagree that disabling Nagle on non-interactive TCP connections is a good thing; if it solves your performance problem, then it likely didn’t really solve it, it just made it less noticeable.

    Especially on Windows this makes even less sense, since

    1) Data will always be sent if more than an MSS worth is available (an exception to Nagle, a smart one, as we will hit the window flow control instead)

    2) a delayed ACK will be sent if another packet came in before the timer expired (another smart sidestep of the delayed ACK problem).

    Cheers,

    Niclas

  47. Norman Diamond says:

    8/12/2004 3:21 PM Niclas Lindgren replied to me:

    >> In Unix, sockets can also send eof

    >> notifications, but from what I can find

    >> in MSDN it does not happen in Windows.

    To elaborate on this part, I read a bunch of pages in MSDN trying to see how a sender can send an explicit eof indication (so that the receiver will receive an eof indication) and could not find a way.

    >> When a co-worker wanted to detect a lost

    >> connection between a Unix server and a

    >> Windows client and have both programs

    >> recover gracefully, we could not find any

    >> way to do it. She had to document that the

    >> application would not perform as expected

    >> if the network connection went down."

    >

    > Could you elaborate on that?

    To elaborate on this part, I read a bunch of MSDN pages trying to find a way for a caller of recv() or send() to be informed that there is not presently any network path between the caller and the formerly established peer. Reading included but was not limited to setsockopt. We tested by pulling a cable. Even after 5 minutes, the programs were still waiting instead of detecting errors.

    On rereading my previous posting here, I wonder why I combined these two situations into one paragraph. They are separate issues.

    > If the remote TCP connection closes (and not

    > through a crash so the FIN( or RST any case

    > of a crash and the OS supports it ) is sent),

    This involves the first issue that I described above. In MSDN I could not find any way for a program to request that an EOF indication be sent (I think I agree that FIN would be it, but MSDN still did not say how to send one).

  48. Norman,

    The pulling the cable thing doesn’t work if the connection hasn’t been established very long.

    This is REQUIRED by the TCP/IP spec; Microsoft originally tuned the TCP/IP stack to detect this case quickly, and got flamed mightily when the stack was put on the internet.

    TCP determines the link speed between the two machines over time, and sets timeouts based on the link speed. The initial timeouts are extremely high (because the link might be very slow) and the timeouts reduce over time. If enough packets haven’t been transferred to lower the link timeouts then the fact that the net cable’s been unplugged might not be detected for up to an hour!

    This is 100% per TCP spec, and will happen with every modern TCP implementation.

  49. Norman Diamond says:

    Ouch. Thank you very much for this information.

  50. Niclas Lindgren says:

    "The pulling the cable thing doesn’t work if the connection hasn’t been established very long"

    There is of course one exception, if you pull the cable on the machine you are running, you will detect the loss of network connection, as the NIC goes offline due to the loss of a link =)

    Anyways, you could set the TCP keepalive timer if you want faster detection of a lost connection, and tweak the TCP/IP stack registry parameters controlling the interval at which these packets are sent. But I really would recommend some sort of echo protocol inside the TCP connection if you need to know whether a silent connection has been lost. However, I would also recommend that the application shouldn’t much care whether the connection is up or not if the connection is silent anyway. There are of course exceptions, like monitoring, but I would rather rely on a protocol in the TCP connection than on TCP itself to do the detection.

    Anyways, to send a FIN, you do a shutdown (it should say so in MSDN) with SD_SEND (FD_WRITE) on the socket (or a close, but that stops the reading part too, and will bump an RST back to you on your next data packet, as it expects a FIN back).

    In my experience the exact same behaviour is to be expected from a Unix system as well, at least those I have been implementing on. Loss of a silent TCP connection generally takes a while to detect.
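
    And a minimal sketch of the keepalive approach mentioned above, using SIO_KEEPALIVE_VALS from mstcpip.h (sock is an assumed connected SOCKET; the timing values are purely illustrative):

    struct tcp_keepalive keepAlive = { 0 };
    keepAlive.onoff = 1;
    keepAlive.keepalivetime = 30 * 1000;      // idle milliseconds before the first probe
    keepAlive.keepaliveinterval = 5 * 1000;   // milliseconds between probes

    DWORD bytesReturned = 0;
    if (WSAIoctl(sock, SIO_KEEPALIVE_VALS, &keepAlive, sizeof(keepAlive),
                 NULL, 0, &bytesReturned, NULL, NULL) == SOCKET_ERROR)
    {
          // Check WSAGetLastError() to see why it failed.
    }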

  51. Absolutely right Niclas.