Yet another TVF: returning files from a directory

In my previous BLOG entry dated May 5th, I promised another sample that retrieves multiple files from a directory based on a specified name pattern. Here it is, this time in VB.NET.

This is a more traditional Table-Valued-Function in that it actually returns multiple rows, each with two columns (file-name and contents). Please refer to the earlier entry on the basic requirements of a TVF. Here, I will devote to describing the implementation of this particular TVF.

As we saw in the previous entry already, the key is to implement the IEnumerator interface with three members. The IEnumerator interface is implemented by the MultiFileLoader class. There is some initial work done during the construction of the MultiFileLoader class itself. The constructor retrieves the list of file-names  of the specified pattern in the specified directory and loads it in the array FileNames. Note that the opening of the directory is done after impersonating the caller (and un-impersonating once we are done).

MoveNext - Advances the position in the array. Makes sure to close the currently opened filestream before returning.

Current - Opens the File as per the file-name in the current position in the FileNames array and returns the FileStream; using Filestream ensures streamed access without having to buffer the entire contents in memory; we also make sure to impersonate the caller before opening the file.

Reset - simply resets the array position to before the start.

Here is the VB.NET code:

Imports System
Imports System.Data
Imports System.Data.Sql
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server
Imports System.IO
Imports System.Collections
Imports System.Security.Principal

' The record that holds each row returned by the TVF
Partial Public Class FileRecord
Public FName As String
Public FContents As FileStream

    Sub New(ByVal fn As String, ByVal fc As FileStream)
Me.FName = fn
Me.FContents = fc
End Sub
End Class

'The core of the TVF's implementation that implements IEnumerator
Partial Public Class MultiFileLoader
Implements IEnumerator

    ' Array that holds the list of file names to be returned
Private FileNames As String()

    Private CurrentFile As Integer = -1
Private CurrentFileStream As FileStream
Private CallerIdentity As WindowsIdentity

    ' The constructor retrieves the list of file-names into Me.FileNames; it does the
' appropriate impersonation and un-impersonation after the directory has been read.
Sub New(ByVal PathName As String, ByVal Pattern As String)

        Me.CallerIdentity = SqlContext.WindowsIdentity
Dim OriginalContext As WindowsImpersonationContext = CallerIdentity.Impersonate()

        Try
Me.FileNames = Directory.GetFiles(PathName, Pattern)
Catch
Me.FileNames = Nothing
Finally
If Not (OriginalContext Is Nothing) Then
OriginalContext.Undo()
End If
End Try
End Sub

    ' Retrieves the FileStream for the current file as pointed to by the CurrentFile array index.
' Constructs a FileRecord with the current file's name and the contents (as FileStream) and
' returns it.
' Impersonates caller before opening the file and unimpersonates after getting the filestream.
Public ReadOnly Property Current() As Object _
Implements IEnumerator.Current
Get
Dim OriginalContext As WindowsImpersonationContext = CallerIdentity.Impersonate()

            Try
Me.CurrentFileStream = New FileStream(FileNames(CurrentFile), FileMode.Open)
Catch
Me.CurrentFileStream = Nothing
Finally
If Not (OriginalContext Is Nothing) Then
OriginalContext.Undo()
End If
End Try
Return New FileRecord(FileNames(CurrentFile), CurrentFileStream)
End Get
End Property

    'Advances the CurrentFile position in the FileNames array.
'Also makes sure to close the currently open FileStream before advancing to the next File.
Public Function MoveNext() As Boolean _
Implements IEnumerator.MoveNext

        ' If there are no files, indicate end of result set
If FileNames Is Nothing Then
Return False
End If
If (CurrentFile < FileNames.Length - 1) Then
CurrentFile = CurrentFile + 1
If Not (CurrentFileStream Is Nothing) Then
CurrentFileStream.Close()
End If
Return True
Else
If Not (CurrentFileStream Is Nothing) Then
CurrentFileStream.Close()
Return False
End If
End If
End Function

    Public Overridable Sub Reset() _
Implements IEnumerator.Reset
CurrentFile = -1
End Sub
End Class

Partial Public Class MultiFileAccess
' Root method that returns MultiFileLoader which does the bulk of the work.
<Microsoft.SqlServer.Server.SqlFunction(FillRowMethodName:="FillRow", TableDefinition:="FName nvarchar(300), FContents varbinary(max)")> _
Public Shared Function GetFilesInDirectory(ByVal PathName As String, ByVal Pattern As String) As IEnumerator
Return New MultiFileLoader(PathName, Pattern)
End Function

    ' The fill row method that cracks the FileRecord and returns the individual columns.
Public Shared Sub FillRow(ByVal Obj As Object, ByRef FName As SqlString, ByRef FContents As SqlBytes)
If Not (Obj Is Nothing) Then
Dim fr As FileRecord
fr = Obj
FName = fr.FName
FContents = New SqlBytes(fr.FContents)
Else
FName = SqlString.Null
FContents = SqlBytes.Null
End If
End Sub
End Class

Assuming above code is compiled into an assembly called VBTVF.dll then the following is a T-SQL script to upload the assembly into SQL Server and register the TVF GetFilesInDirectory.

 create assembly VBTVF
from 'c:\Projects\TVF\VBTVF\bin\VBTVF.dll'
with permission_set = external_access
go

 create function GetFilesInDirectory(@PathName nvarchar(300), @Pattern nvarchar(300))
returns table(FileName nvarchar(300), FileContents varbinary(max))
as external name VBTVF.MultiFileAccess.GetFilesInDirectory
go

Now you can test the TVF with queries like:

 select FileName, datalength(FileContents), FileContents
from GetFilesInDirectory('C:\Projects', '*.*')

 - Should Return empty result set.
select FileName, datalength(FileContents), FileContents
from GetFilesInDirectory('C:\NonExistentDirectory', '*.*')

Enough about TVFs. We will talk about something different next time.

- Balaji Rathakrishnan

Microsoft SQL Server

This posting is provided "AS IS" with no warranties, and confers no rights.