Connectivity troubleshooting in SQL Server 2008 with the Connectivity Ring Buffer


SQL Server 2008 contains a new feature designed to help troubleshoot particularly tricky connectivity problems. This new feature, the Connectivity Ring Buffer, captures a record for each server-initiated connection closure, including each kill of a session or Login Failed event. To facilitate efficient troubleshooting, the ring buffer attempts to provide information to correlate between the client failure and the server’s closing action. Up to 1k records in the ring buffer are persisted for as long as the server is online, and after 1000 records, the buffer wraps around and begins replacing the oldest records. The Connectivity Ring Buffer’s records are exposed by a DMV which can be queried:

SELECT CAST(record AS XML) FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = ‘RING_BUFFER_CONNECTIVITY’

This will select all records as XML types; in Management Studio, you can click the records to get a more readable version. If you want to use SQL queries on the XML records to locate your particular problem, you can also use SQL Server’s XML support to turn this into a temp table and query against the records.

A basic ring buffer entry: killed SPID

An easy way to cause a server-initiated connection closure is to open two connections to a SQL Server, find the SPID of one of the connections, and then kill that spid from the other connection:

C:>osql -E
1> SELECT @@spid
2> go
 ——
     51
(1 row affected)

C:>osql -E
1> kill 51
2> go
1>

If you try this and then query the ring buffer, you will get a result which will look like this:

<Record id=2 type=RING_BUFFER_CONNECTIVITY time=110448275>

  <ConnectivityTraceRecord>

    <RecordType>ConnectionClose</RecordType>

    <RecordSource>Tds</RecordSource>

    <Spid>55</Spid>

    <SniConnectionId>B7882F3C-3BA9-45A7-8D23-3C5C05F9BDF9</SniConnectionId>

    <SniProvider>4</SniProvider>

    <RemoteHost>&lt;local machine&gt;</RemoteHost>

    <RemotePort>0</RemotePort>

    <LocalHost />

    <LocalPort>0</LocalPort>

    <RecordTime>5/6/2008 22:47:35.880</RecordTime>

    <TdsBuffersInformation>

      <TdsInputBufferError>0</TdsInputBufferError>

      <TdsOutputBufferError>0</TdsOutputBufferError>

      <TdsInputBufferBytes>60</TdsInputBufferBytes>

    </TdsBuffersInformation>

    <TdsDisconnectFlags>

      <PhysicalConnectionIsKilled>0</PhysicalConnectionIsKilled>

      <DisconnectDueToReadError>0</DisconnectDueToReadError>

      <NetworkErrorFoundInInputStream>0</NetworkErrorFoundInInputStream>

      <ErrorFoundBeforeLogin>0</ErrorFoundBeforeLogin>

      <SessionIsKilled>1</SessionIsKilled>

      <NormalDisconnect>0</NormalDisconnect>

      <NormalLogout>0</NormalLogout>

    </TdsDisconnectFlags>

  </ConnectivityTraceRecord>

  <Stack>

    <frame id=0>0X01CA0B00</frame>

    <frame id=1>0X01CA0DB1</frame>

    <frame id=2>0X01DF6162</frame>

    <frame id=3>0X02E53C98</frame>

    <frame id=4>0X02E54845</frame>

    <frame id=5>0X02E57BE9</frame>

    <frame id=6>0X02E38F57</frame>

    <frame id=7>0X02E3B2C0</frame>

    <frame id=8>0X02E3C832</frame>


Comments (7)

  1. Simon says:

    What are the debug options for 2005?

    I have exactly that 258 error, and need to find out the root cause.

  2. Sent: Monday, March 09, 2009 6:20 PM Subject: Ring Buffers in SQL2008 &#160; Hi, do you have any documentation/white

  3. Dem says:

    Q:对于复杂的连接问题,我如何获得更多的信息以用于纠错?

    A:

    SQLServer2008包含了一个用于帮助特别困难的连接问题纠错的新特性-连接环缓冲区(ConnectivityRing…

  4. jorgepc says:

    Great entry, thanks a lot for putting your time into this!

  5. Ivanov says:

    We are seeing lots of cases where most of the delay is due to LoginTriggerAndResourceGovernorProcessingInMilliseconds. Can you direct us to where we can find more info on what could cause this? We are using SQL 2008 R2 on Win 2008. Thank You

  6. Chandan jha says:

    Hi,

    I am trying to troubleshoot some intermittent connectivity errors to my Sql server 2008 R2 RTM version and used the sys.dm_os_ring_buffers for this task. However, in the output the record time displays wrong time even though the client and server machines have same system time and are in the same domain. It is 21st June 8:30 am EST here but the record time shows the following values:

    2012-06-21 10:21:59.110

    2012-06-21 10:21:59.110

    This is weird. My client uses a legacy odbc connection (System DSN). Any ideas why the time recorded is wrong? Is my SQL machine receiving wrong packets?

    Please suggest some solution.

    Thanks

    Chandan

  7. Feroz Basha says:

    Hi,

    Great article to understand the underlying concept. I have the same issue in my environment and what I can see is LoginTaskEnqueuedInMilliseconds is taking more time(2290 out of 2803) here. So does this mean that SQL server is having resource contention which prevents to establish the connection or does it have network issue?

    Thanks in advance.!