How It Works: File Stream the Before and After Image of a File

The Senior Escalation Engineers do various training and mentoring activities. As I do this I thought I would try to propagate some of this information on the blog.

The question that spawned this entry was: "I am performing updates of a specific row and column and after 5 updates I see 5 files for the same row and column stored in the file stream container folder - why?"

The SQL Server 2008 file stream feature allows VARBINARY(MAX) data to be stored as individual files on disk.  The SQL Server database engine has been updated to treat this data as part of the database container.

Walkthrough A Transaction

I found it helpful to walkthrough a transactional operation to better understand how file streams maintains the before and after image of the data without storing it directly in the log (.ldf). As you can see the new file stream log records use the physical files to keep the before and after image, much like the bulk insert (BCP) log records indicate the pages changed. BACKUP LOG understands what actual files need to be included with the BACKUP. This can make a log backup much larger than the .ldf file.

File stream files are not physically deleted when the row is deleted or update; nor is the before image immediately purged. The purging takes place with the garbage collector . The file streams components follow a strict protocol, understanding when a file can be deleted. Only then is an entry inserted into the garbage collection table and processed by the garbage collector.

The following discussion outlines how a transaction uses the physical file links to accomplish a rollback operation.

Assume you have a Row #1 in the File Stream enable table with a pointer to LSN 1.

1. Issue a Begin Tran

2. Select and OpenSqlFileStream the file associated with Row #1 for modification (WRITE)

3. The new file is created, named LSN 2 and the link on the page points to it.

4. The before log record points to the LSN 1 file (Prev Image)

image

When the rollback is issued the row on the page is reverted and it points to the LSN 1 and the Win32 HANDLE is invalidated.  

This is really no different than what would happen if the VARBINARY(MAX) was stored in the database. The old page chain pointer remains in the log record and a new chain established for modifications. If a rollback occurs the old state is returned. 

For a commit an open handle for the transaction (open for read or write access) results in an error. All handles associated with the transaction must be closed by the application before a COMMIT is executed.

The before and after images are kept until they can safely be removed.   Safe removal often requires a backup to secure the database image, can involve replication processing and other SQL Server features.

Tombstone Tables

WARNING: The tombstone tables are internal tables, stored on the default file group of the database. They are for server use only and are not designed to be queried, altered and updated from T-SQL. If you feel you need to do this you should be under the guidance of Microsoft SQL Server Support.

I have include them in the discussion as examples and not for general purpose usage.   

The garbage collector task implements all the necessary logic to determine proper cleanup boundaries.   The individual tasks for the garbage collector are stored in tombstone tables. 

What tombstone tables exist?

The tombstone table is an ‘IT’ internal database table. You can find the various tombstone tables using the following query.

select * from sys.internal_tables where name like 'filestream_tomb%'

What rows exist in the tombstone table?

Using a DAC connection allows direct selection of internal table data. An example of a tombstone select is shown here. (Again this should only be used for emergency purposes as indicated by the DAC connection.)

select * from sys.filestream_tombstone_2073058421

The oplsn* columns indicate the LSN of the tombstone entry. This can be different from the filename (LSN) because this could be a delete entry and the filename would not change but the entry in the tombstone table would occur at a later time. The oplsn’s are used in conjunction with the transaction logging and backup facilities to maintain the proper image of the files on disk.

How do I map the tombstone entry to a table?

The sys.tombstone* internal table contains columns such as file_id, rowset_id that can be joined with sys.database_files, sys.partitions and others. The following query shows a mapping from the tombstone table row to the parent object.

 select t.*,

              f.name as [FileStreamContainer],

              f.physical_name as [FileStreamContainerRoot],

              p.object_id,

              object_name(object_id)

       from sys.filestream_tombstone_2073058421 t

              inner join sys.database_files f on f.file_id = t.fileid -- Map to a database file

              inner join sys.partitions p on p.partition_id = t.rowsetid -- Map to object

Bob Dorr
SQL Server Senior Escalation Engineer