Where is a record really located?

Recently I had a question regarding "What happens to records when they are deleted" – so I wanted to dig into what happens under the covers when just such a thing happens.  The answer is somewhat simple – but the explanation is somewhat more complicated.  So, I am breaking the answer up into 2 different parts.  This first tip will look at how to locate a record from a table in the SQL data file.  The next tip will look at what happens when a DELETE occurs.  We have to be able to find the records to tell if they're there or not after a DELETE – which is why I'm breaking it up.  

To begin I'll create a database for our tests.

[sql]
use master
GO
IF DB_ID('DeletedDB') IS NOT NULL
BEGIN
ALTER DATABASE DeletedDB
SET SINGLE_USER WITH ROLLBACK IMMEDIATE

DROP DATABASE DeletedDB
END
GO
[/sql]

Next, I'll insert 100 records into a table named dbo.Sales.  Any numeric columns from AdventureWorks I am going to cast as a character string.  This just will make it easier to identify data when we look at it through a hex editor.

[sql]
CREATE DATABASE DeletedDB
GO
USE DeletedDB
GO
SELECT TOP(100)
SalesOrderNumber, PurchaseOrderNumber, AccountNumber, CustomerID = CAST(CustomerID AS VARCHAR(12)),
CreditCardApprovalCode, CarrierTrackingNumber, IDCol = CAST(SalesOrderDetailID AS VARCHAR(10))
INTO dbo.Sales
FROM AdventureWorks2014.Sales.SalesOrderDetail s
JOIN AdventureWorks2014.Sales.SalesOrderHeader h ON s.SalesOrderID = h.SalesOrderID
ORDER BY SalesOrderDetailID ASC
GO
[/sql]

I'll order the table based on the IDCol column and tell SQL that the column is unique.

[sql]
ALTER TABLE dbo.Sales
ALTER COLUMN IDCol VARCHAR(10) NOT NULL
GO
ALTER TABLE dbo.Sales
ADD CONSTRAINT PK_Sales_IDCol PRIMARY KEY CLUSTERED (IDCol)
GO
[/sql]

I can use the handy function sys.fn_physlocformatter to query the table and identify specifically where a row is.  The column 'Location(File:Page:Slot)' tells me that the row I have highlighted is on page 330.  So, this is where our journey will start.

[sql]
USE DeletedDB
GO
SELECT IDCol, CarrierTrackingNumber,
'Location(File:Page:Slot)' = sys.fn_PhysLocFormatter (%%physloc%%),
LockResource = %%lockres%%
FROM dbo.Sales
CROSS APPLY sys.fn_physLocCracker (%%physloc%%) x
[/sql]

We can use sys.dm_os_buffer_descriptors to view the page in the buffer pool.

[sql]
SELECT * FROM sys.dm_os_buffer_descriptors
WHERE page_id = 330 AND database_id = DB_ID('DeletedDB')
[/sql]

Next I am going to use DBCC PAGE to view the contents of the page.  This will give us all of the information we need to find the row on disk.  Notice what I have highlighted from DBCC PAGE – the offset for the row (purple) and the offset for the column (yellow).  These will be our guide for finding the contents on disk.

 

[sql]
DBCC TRACEON(3604)
DBCC PAGE(DeletedDB, 1, 330, 3) WITH TABLERESULTS
[/sql]

 

Now that I have found the page (330) and the needed offsets for showing the contents on disk, I am going to detach the database so I can read in the file with a hex editor.

[sql]
use master
GO
ALTER DATABASE DeletedDB
SET OFFLINE
GO
[/sql]

Once the file is detached, I am going to open it with the XVI32.exe hex editor.

OK – time for some math.  The DB page I am looking at is 330.  Each page is 8192 bytes in size.  If I multiply 330 by 8192 I get the starting offset in the file for where that page begins.  In this case it is hex value 294000.

[sql]
DECLARE @PageStart INT
DECLARE @TargetPage INT = 330
DECLARE @PageSize INT = 8192

SELECT @PageStart = CAST(@TargetPage * @PageSize AS VARBINARY(100))
SELECT PageStart = @PageStart
[/sql]

In the hex editor I can go to that specific address.

OK, cool.  I have found some data.  But, I don't really know where anything starts just yet column-wise.  I have highlighted one column since it looks familiar to me from looking the resultset from the table earlier. 

We can see here the AccountNumber column from the table is highlighted.

So – where does the first column (IDCOl) start?  At this point we have all of the information we need.  We just need to do a couple of calculations to find it.

What I have highlighted in purple above is the offset for the first row of the table.  The row offset is 0x60.  If we add this to the page offset we get:  0x2940000 + 0x60 = 0x2940060

If I lookup that address in the hex editor I am able to find a value – but it isn't quite what I am looking for just yet…

So – we now know where the row starts on disk.  How about where the first column starts?

To find the location of the first column I simply add the offset for the column (0x17) to the value calculated above for where the row starts.  The math is:

0x2940060 + 0x17 = 0x2940077

If I go to that address, I find the value 1 – which is the character string value of the IDCol in the first row.  Very cool!

To verify I can find out where other columns start – let's try the CarrierTrackingNumber column.  The offset for this column is 0x6a.

When I add 0x6a to the offset of where the row starts I get 0x2940CA.

0x2940060 + 0x6a = 0x2940CA

When I look up that address I am directed to exactly where the CarrierTrackingNumber column value begins on disk.

I hope this helps a bit with understanding how we are laying data for data pages and rows out on disk.  Its literally a series of offsets of values that SQL Server uses to pull pages into memory.