Understand special TCP/IP property “Keep Alive” in SQL Server 2005


When open TCP/IP properties by following SQL Server Configuration Manager -> SQL Server 2005 Network Configuration ->Protocols -> TCP/IP, you will notice there is new added property named “Keep Alive”, which is not available in SQL Server 2000.

This parameter controls how often TCP attempts to verify that an idle connection is still intact by sending a keep alive packet to its peer. If the remote system is still reachable and functioning, a acknowledge packet is sent back. Otherwise, the local TCP will keep sending “Keep Alive” packet in an interval of KeepAliveInterval for TcpMaxDataRetransmissions times. If no acknowledge packet is received during this period, the local TCP will reset the connection. For every SQL Server TCP connections, Keep Alive is 30,000 millisecond by default and KeepAliveInterval is hard-coded 1,000 millisecond. TcpMaxDataRetransmissions is 5 by default, configurable for entire machine through HKLMSYSTEMCurrentControlSetServicesTcpipParameters. As an example, under default configuration, SQL Server can detect a “orphaned” connection in about 35 seconds. Orphaned connections are often caused by remote system reset, hardware/power failure or network outages. In case where the remote process is shutdown or killed, TCP reset is usually send out by system TCP/IP stack and the connection will be closed as soon as the reset packet is received. The detailed description about TCP keep-alives is in IETF RFC 1122.

In Windows, as well as many other operating systems, TCP/IP Keep-Alive is not enabled by default. By enabling this, SQL Server can, in a timely manner, detect “orphaned connection” and free up valuable resource associated with each connection, including its session context, locks, kernel TCP buffers and etc., which sometime can become very expensive for a SQL Server running heavy transactions. The major drawbacks, among many others, are (1) Keep-Alive consumes bandwidth on a perfect idle connection; (2) It causes good connection to break during transient network failures. So configuring the keep-alive values too small is not recommended.

The following KB about orphaned connection/session for SQL Server 2000 shares valuable insight on how to configure Keep-Alive for named pipe connections.

http://support.microsoft.com/kb/137983/?sd=RMVP&fr=1,

 

Do you know that you can post question w.r.t SQL Server data access, connectivty issues at http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=87&SiteID=1 ?

 

Nan Tu
 
Software Design Engineer, SQL Server Protocols

Disclaimer: This posting is provided “AS IS” with no warranties, and confers no rights


Comments (25)

  1. wesclark says:

    We are running automated tests that intentionally drop all the connections to the database and make them again.  If the tests run quickly, we are disconnecting and connecting very frequently.  We notice that after the connection is closed (this is through the JDBC driver, but I’m not sure that matters), the connection port stays open in a wait state for several minutes.  If the number gets too high (about 4000), the range of ports is used up, and the tests start to fail.

    Is there a way to have the TCP/IP port be released on the client immediately, or at least sooner?  Is there a way to specify a large range of ports?

    By change the keep alive setting, do you think the TCP/IP port would close sooner?

  2. Matt Neerincx [MSFT] says:

    Take a look at this article, it explains the issue pretty well:

    328476 Description of TCP/IP settings that you may have to adjust when SQL Server connection pooling is disabled

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;328476

    KeepAlive does not help.

  3. wesclark says:

    Yes.  I found that KB article after posting here.  Thanks.

  4. .agony says:

    We tried this feature with no success – neither the standard properties nor slightly changed properties worked.

    We tested like this:

    SQL Server 2005 – ADO.NET Client

    The client established an explicit lock on one row at one db. Afterwards we disconnected the client by pulling out the network-cable. We waited about 35 sec for the sessios to close – but nothing happend; we waited another minute but nothing changed. The command line "netstat" told us that the connections are alive … so what went wrong? Did we miss something? btw we  installed the sp1 before all tests!

  5. Nan Tu says:

    Did you restarted server after change the setting?

    Can you ping the server after you unplug the network cable?

    The ultimate way to monitor if the keepalive is working is to get a netmon trace. You should be able to see the keepalive request and response on a idle connection or only keepalive request on a broken connection.

  6. .agony says:

    Yes, we did "serveral* restarts to ensure everything is loaded correctly.

    Of course we can’t ping the server – the cable is unplugged!

    We will test the keep-alive-packages with netmon. I’ll post my results!

    Thanks

  7. .agony says:

    Ok, we testet if acknowledge-keep-alive-packages were sent.

    The answer is: NO

    The server behaves like the client computer is online. It sends packages to the client with no responds, but did not disconnect the session.

    So, what could be the problem source? A missing service? Misconfiguration of the TCP protocol?  Something else?

    Suggestions?

    Meanwhile: thanks fot the assistance!

    .agony

  8. Nan Tu says:

    "It sends packages to the client with no responds, "

    [nantu]This looks correct behavior if you unplug the calbe.

    "but did not disconnect the session. "

    [nantu] can you see if the session is still alive by following query,

    "select session_id from sys.dm_exec_connections" to see the the session is closed.

    Otherwise, what is your OS version, is it xpsp2 or w2k3sp1?

    Do you have any third party antivirus software? If so, can you try without them.

  9. .agony says:

    yes, the connections are still alive but no keep-alive packages are send to the client computers.

    Currently my test server is on a xpsp2 computer with EZ Antivirus installed. Hopefully i get an w2k3sp1 server within the next two days.

    Now I’ll try to disable the antivirus software and post the results

  10. .agony says:

    ok, disabling the antivirus software was no solution. no different behaviour.

  11. .agony says:

    some other ideas?

  12. Transakcyjność zgodna z założeniami wyrażonymi w popularnym akronimie ACID (Atomicity, Consistency, Isolation, Durability) w warunkach współbieżności i konkurencyjności o zasoby wymaga od silnika bazodanowego odpowiedniej obsługi blokad zasobó

  13. cory says:

    Hi,

    The article mentions that the tcp/ip keep-alive is NOT on by default on windows OS, but sql server uses it.  Is there a way to turn off tcp/ip keep-alive for SQL Server 2005?  Was tcp/ip keep-alive on or off for sql server 2000?

    Thanks!

  14. Simon says:

    When does the sql server issue the keepalive? Is it when it just finishes answering a query?

    Thanks.

  15. Nan Tu says:

    Keeplive is a TCP feature that SQL Server leverages to detect transport layer failure. In default setting of Windows operating system, it issues keepalive 30 seconds after the connection becomes idle.

  16. pvr_in_msdn says:

    Hi Nan Tu,

    What are the settings we need to enable KeepAlive through SqlServer2005. I know there is an option to configure thru Sql server Configuration Manager TCP/IP options. But is that fine or we need to set someting other than that. Please clarify.

    Thanks in Advance,

    Palani

  17. SQL Protocols says:

    Hi Palani,

    Sorry for the lack of clarity in the blog post.  SQL Server 2005 enables KeepAlive on all of its connections, and that is sufficient to put KeepAlive to use – there is no need to enable it elsewhere in the OS.  Other than to use SQL Server Configuration Manager to potentially change the parameters mentioned above, there should be no configuration necessary to get KeepAlive.

    Hope this helps!

    Dan Benediktson

    SQL Protocols

  18. praveen says:

    We have problem with SQL 2005 in our company. A script is being run to extract the data from SQL server and sometimes server drops the connection when script runs for more than an hour.

    Is there any setting that I can change so that connection will not be dropped.

  19. Peter says:

    How to use netmon to trace keepalive packets?

  20. B. H. says:

    some questions on this topic.

    Does it need to bounce the sql service after reconfig "keep alive" numbers in Sqlserver configuration manager?

    Do any of these window parameters, KeepAliveInterval and KeepAliveTime, should be set as well, as they don’t seem to come with default setting in registry? In other words, if we only config sqlserver side, this feature will take effect as well?

    Also heard a workaround to fix keealive problems by setting "Keep Alive" in sql server to 0, instead of default 30000. By doing this, is keepalive still working?

    Thanks so much

  21. David says:

    So what are the implications of increasing this number, lets say to an hour? Doesn’t that mean I could potential run into port exhaustion or something on a busy system?

  22. Amandeep says:

    Hi,

    Sometimes we get this error below  when our application tries to write to database.

    ODBC ERROR: Thursday July 16 2009 16:51: 1

    SQLState[01000] NativeError[10054]

    [Microsoft][ODBC SQL Server Driver][TCP/IP Sockets]ConnectionWrite (send()).

    Can this be due to ‘keep Alive’ settings on sql server 2005. Can you please throw some light on this error? we really appreciate any response from you.

    Thanks

    -Amandeep

  23. Andrei says:

    Hi there,

    Does any one knows is to why I am getting an error:

    unable to return data for this report

    (01000:[Microsoft][ODBC SQL Server Driver][TCP/IP Sockets]ConnectionWrite(Send()).)

    Thanks in advance for any suggestions.