The SQL Swiss Army Knife #4 - Making sense of FILESTREAM containers

Hello all,

Here is another one focusing on SQL scripts that may help DBAs, following the series "SQL Swiss Army Knife". This time we are exploring FILESTREAM.

Consider the following FILESTREAM enabled database and table in a SQL Server 2008:

CREATE DATABASE Archive
ON
PRIMARY ( NAME = Archive1,FILENAME = 'c:\data\archdat1.mdf'),
FILEGROUP FileStreamGroup1 CONTAINS FILESTREAM( NAME = Archive3,FILENAME = 'c:\data\filestream1')
LOG ON ( NAME = Archivelog1,FILENAME = 'c:\data\archlog1.ldf')
GO

 

CREATE TABLE [dbo].[Records](
[Id] [uniqueidentifier] ROWGUIDCOL NOT NULL UNIQUE CONSTRAINT DF_Records_Id DEFAULT (newsequentialid()),
[FileName] [varchar](255) NOT NULL,
[File] [varbinary](max) FILESTREAM NOT NULL,
UNIQUE NONCLUSTERED ([Id] ASC) ON [PRIMARY]
) ON [PRIMARY] FILESTREAM_ON [FileStreamGroup1]
GO

 

I highlighted the FileName column because it will be useful in all this exercise. On the above table I inserted the following values:

 

INSERT INTO Archive.dbo.Records ([FileName], [File])
VALUES ('test_ppt.pptx' , CAST ('test_ppt.pptx' as varbinary(max)));
GO
INSERT INTO Archive.dbo.Records ([FileName], [File])
VALUES ('test_word_3.docx' , CAST ('test_word_3.docx' as varbinary(max)));
GO

 

When I select in SQL Server directly this is how they look like:

 

image

 

Not very enlightening as to what those files are in the file system… So how to make sense of this information, and lets say even store it securely somewhere? Say, for instance, your database gets corrupted (let’s assume it’s non-repairable) and you have no valid backups (yes, a very far-fetched situation).

All you have left is a NTFS container filled with “gibberish” files that no one can make sense. How to at least recover your files in a more efficient way? This is when you should have a list of those files securely stored somewhere right?

Ok, so if you were fortunate enough to make such listing AND you already had a column that stored the filename to help in this task, at least you’re not looking at either a complete data loss, or opening countless files to see what’s inside. Sounds interesting?

So, this script will make sense of a FILESTREAM container by listing the following details per FILESTREAM enabled table within a database:

  • Table Name
  • Column Name
  • Partition
  • 1st Level Folder
  • 2nd Level Folder
  • File Name (the column stored in the table)
  • NTFS File Name (its BLOB counterpart name in the file system)

The output will be something like this:

image

Let’s check it’s accuracy:

image

Yes, checks out. So by adding an “artificial” piece of data (the FileName column) we are now able to map the FILESTREAM container data into real filenames, and have a mapping of which “logical” filename matches which physical filename. If I have a table that uses FILESTREAM and stores the filenames, I am able to export this data on a regular basis and make use of it if the time comes.

The code that outputs this result can only be executed thru the DAC, and is based on posts by the excelent Paul S. Randal in his BLOG. An undocumented DBCC command is used in this process, so any future versions of SQL Server might break this. 

Download code here: FILESTREAM_Data.sql

Until next time!

Disclaimer: I hope that the information on these pages is valuable to you. Your use of the information contained in these pages, however, is at your sole risk. All information on these pages is provided "as -is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Ezequiel. Further, Ezequiel shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages.

References:
SQL Server Storage Engine BLOG - How to use DBCC PAGE by Paul S. Randal
KB224453 - Understanding and resolving SQL Server blocking problems
Paul S. Randal BLOG - FILESTREAM directory structure - where do the GUIDs come from?
Paul S. Randal BLOG - FILESTREAM directory structure