Troubleshooting tempdb growth due to Version Store usage

Troubleshooting tempdb growth due to Version Store usage

Recently I worked on an issue where the tempdb database version store cleanup was not occurring.

I created a small repro to demonstrate the behavior. My tempdb has been showing the version store usage below for the past 3 days:

sys.dm_db_file_space_used

runtime

version_store_kb

2010-10-08 10:28:52.740

19776

The query that I used to get the above data is:

select getdate() AS runtime, SUM (user_object_reserved_page_count)*8 as usr_obj_kb,

SUM (internal_object_reserved_page_count)*8 as internal_obj_kb,

SUM (version_store_reserved_page_count)*8 as version_store_kb,

SUM (unallocated_extent_page_count)*8 as freespace_kb,

SUM (mixed_extent_page_count)*8 as mixedextent_kb

FROM sys.dm_db_file_space_usage

I then used the DMV to track the Version Store cleanup Performance Counter or you can even use Perfmon to find out the same information. The DMV that I used was “sys.dm_os_performance_counters” and the counter name is “Version Cleanup rate (KB/s) “. This is a per second counter and is cumulative in nature. Capturing multiple snapshots of the cntr_value column value, I find that the value doesn’t go beyond 272696576.

The next thing that I would need to track is what is causing the version store usage. I used sys.dm_db_session_file_usage and sys.dm_db_task_space_usage to check if any sessions/tasks were currently accounting for the 18MB of version store usage that I see above. I found none. If you do find a Session ID with a high amount of internal_objects_alloc_page_count value from the aforementioned DMV outputs, then the next investigation point would be to find out what that session is doing. Next I went and tracked down all the transactions currently maintaining an active version store using the DMV sys.dm_tran_active_snapshot_database_transactions.

Query used:

select GETDATE() AS runtime,a.*,b.kpid,b.blocked,b.lastwaittype,b.waitresource,db_name(b.dbid) as database_name,b.cpu,b.physical_io,b.memusage,b.login_time,b.last_batch,b.open_tran,b.status,b.hostname,b.program_name,b.cmd,b.loginame,request_id

from sys.dm_tran_active_snapshot_database_transactions a

inner join sys.sysprocesses b

on a.session_id = b.spid

I found the following information:

transaction_id: 27223
transaction_sequence_num: 771
commit_sequence_num: NULL
is_snapshot: 0
session_id: 52
elapsed_time_seconds: 622
open_tran: 1

*Mentioned only the relevant information from the above query for readability.

Here the observations from the above output:

1. The session is not executing a Snapshot transaction

2. The commit sequence is null

3. The transaction has not been committed due to which open_tran value is 1.

4. The transaction is open since the last 10.3 minutes!!!

I then used the following query to retrieve the input buffer of the above transaction:

select GETDATE() AS runtime,b.spid,c.*

from sys.dm_tran_active_snapshot_database_transactions a

inner join sys.sysprocesses b

on a.session_id = b.spid

cross apply sys.dm_exec_sql_text(sql_handle) c

The input buffer that I found was:

set nocount on

declare cur cursor for select a from tbl_test2 where a between 120 and 150

declare @a int,@text varchar(100)

open cur

fetch next from cur into @a

while @@fetch_status = 0

begin

            update tbl_test2 set b = b+1 where a = @a

            fetch next from cur into @a

            RAISERROR (@text, 0, 0) WITH NOWAIT

end

close cur

deallocate cur

Now it is clear that the cursor performing the above update could have caused the version store to grow but I don’t see the session in sys.dm_exec_requests. So, what is the deal here?

Well, the session has an open transaction. This could be because it had executed a begin transaction or set implicit transactions to ON before the beginning of this query. Since this is a small repro and the person executing this query was the author of this post, I can let you in a secret that I had set implicit transaction to ON before I ran the above cursor batch. But if this is a third party application or an environment/application that you have inherited to maintain, then you are in a soup. What do you do? So far, the only piece of the puzzle that you have is that there is an open transaction for a session which has not committed the transaction. Is that causing the version store cleanup to be halted? Let’s see…

I verified from the sys.databases output that the column values for snapshot_isolation_state and is_read_committed_snapshot_on showed that ALLOW_SNAPSHOT_ISOLATION and READ_COMMITTED_SNAPSHOT are enabled on the database. So any DML on the database would create a version store. Now moving on further, I verified if there were any cursors open for a long period of time using sys.dm_exec_cursors but I didn’t find any.

Then I used the transaction sequence number (transaction_sequence_num) obtained above and queried sys.dm_tran_version_store to find out how many transactions are using version store post the transaction identified above had started. I find that there is only one sequence number post the identified open transaction above. So, I am not in a big soup! But I can start getting into one if there is some maintenance operation that begins on this database and adding extra pressure on my tempdb database.

The only option now is to either commit the transaction or roll it back so that the version store clean-up can proceed further.

The above behavior is by-design.

Sunil Agarwal mentioned about this behavior in his blog post on the Storage Engine blog.

You can use the T-SQL batch below to capture the above information from the DMVs in a loop:

GO

WHILE 1=1

BEGIN

select getdate()

DECLARE @runtime datetime

SET @runtime = GETDATE()

PRINT '-- sys.dm_db_file_space_used'

select CONVERT (varchar(30), @runtime, 121) AS runtime, SUM (user_object_reserved_page_count)*8 as usr_obj_kb,

SUM (internal_object_reserved_page_count)*8 as internal_obj_kb,

SUM (version_store_reserved_page_count)*8 as version_store_kb,

SUM (unallocated_extent_page_count)*8 as freespace_kb,

SUM (mixed_extent_page_count)*8 as mixedextent_kb

FROM sys.dm_db_file_space_usage

RAISERROR ('', 0, 1) WITH NOWAIT

PRINT ' -- Output of active transactions which are using version store'

select CONVERT (varchar(30), @runtime, 121) AS runtime,a.*,b.kpid,b.blocked,b.lastwaittype,b.waitresource,b.dbid,b.cpu,b.physical_io,b.memusage,b.login_time,b.last_batch,b.open_tran,b.status,b.hostname,b.program_name,b.cmd,b.loginame,request_id

from sys.dm_tran_active_snapshot_database_transactions a

inner join sys.sysprocesses b

on a.session_id = b.spid

RAISERROR ('', 0, 1) WITH NOWAIT

PRINT ' -- Input buffer of SPIDs identified above Output of active transactions which are using version store'

select CONVERT (varchar(30), @runtime, 121) AS runtime,b.spid,c.*

from sys.dm_tran_active_snapshot_database_transactions a

inner join sys.sysprocesses b

on a.session_id = b.spid

cross apply sys.dm_exec_sql_text(sql_handle) c

RAISERROR ('', 0, 1) WITH NOWAIT

PRINT ' -- Open cursors'

select * from sys.dm_exec_cursors(0) a

cross apply sys.dm_exec_sql_text(sql_handle)

WHERE DATEDIFF(hh, a.creation_time, GETDATE()) > 1;

RAISERROR ('', 0, 1) WITH NOWAIT

WAITFOR DELAY '00:00:15' -- Change the delay here as appropriate

END

GO

Amit Banerjee
SEE, Microsoft SQL server support