Managing TempDB in SQL Server: TempDB Basics (Version Store: logical structure)

Now that we know a few things about Version Store, let us now look into its structure to understand how it stores rows from different tables/indexes with different schema. Interestingly, you don’t need to look far and the answer is available when you examine the DMV sys.dm_tran_version_store.

This DMV shows the full logical structure of the version store. There are two key points to note. First, the version store consists of 8K pages just like data or index pages. These pages exist in the buffer pool and can be flushed to the disk, in this case to the TempDB, under memory pressure. The rows on the version store follow the same rules as any data/index row. Second, the row versions are stored as their full ‘binary’ image, like the way it is stored in the data page’, in the version store. The binary image is broken into ‘first-part’ and ‘second-part’ which is then combined internally by the SQL Server to interpret it just like it would interpret the actual data/index row according to the schema. This makes the row version storage independent of the owning schema of the object. So a given version store page can have rows from multiple tables and indexes and in fact they can be from any database under SQL Server instance. In other words, the version store is shared among all databases in the SQL Server instance. Just like the pages of a table or an index may need to be kicked out of buffer pool under memory pressure so are the pages of version store. For this reason the version store is backed by persistence in TempDB.

If you look at the fields in the sys.dm_tran_version_store DMV, you will note that the version row stores many other attributes like database-id, row-length etc which is not part of the original data/index row. So you may wonder how does the version of a row with a size of 8060 (max allowed length of a row in SQL Server) stored in the version store as the version store ‘row’ stores many other attributes as described below? Well, the answer is that the data row is broken into 2 rows physically on the version store page but it does not show as such in the DMV because it is ‘virtual’ and shows it as 1 big row.

Here is an example of the contents of version store where transaction with XSN 57 has updated three different rows while a transaction with XSN 58 has updated only 1 row. Note, if a transaction updates the same row multiple times, there is only 1 version created because for other transactions it looks like the first transaction is holding a X lock.

transaction_sequence_num version_sequence_num database_id

------------------------ -------------------- -----------

57 1 9

57 2 9

57 3 9

58 1 9

rowset_id status min_length_in_bytes

-------------------- ------ -------------------

72057594038321152 0 12

72057594038321152 0 12

72057594038321152 0 12

72057594038386688 0 16

record_length_first_part_in_bytes

---------------------------------

29

29

29

33

record_image_first_part

--------------------------------------------------------------------

0x50000C0073000000010000000200FCB000000001000000270000000000

0x50000C0073000000020000000200FCB000000001000100270000000000

0x50000C0073000000030000000200FCB000000001000200270000000000

0x500010000100000002000000030000000300F800000000000000002E0000000000

record_length_second_part_in_bytes record_image_second_part

---------------------------------- ----------------------

0 NULL

0 NULL

0 NULL

0 NULL

In the next blog, I will describe how the pages in the version store are grouped together and the subsequent garbage collection and gotchas

Thanks

Sunil Agarwal