RC0 brings a couple of improvements to the Raw File Source and Destination:
- The Raw file format now contains Sort information
- The Destination saves all sort info (including the Comparison Flags for string columns)
- The Source reads and honors the sort info
Let’s take a look at this new functionality.
My package is retrieving data from DimProduct, and sorting on the ProductKey field.
The Raw File Destination editor has a new button at the bottom of the form.
Clicking the button shows the list of columns that will be output, and gives us a choice to cancel and modify them, or to proceed with creating the file. Clicking OK creates a metadata-only file at the specified location.
I add another Data Flow Task with a Raw File Source component. After pointing the Raw File Source at the newly created DimProduct.raw file, I see that it has read in all of the columns. Opening the Advanced editor, I can see that the IsSorted property on the Raw File Source Output is set to True, and that the SortKeyPosition for the ProductKey column is set to 1.
Sort Information when Appending to an Existing Raw File
Important: The Raw File Destination allows you to append to an existing raw file, so be very careful when appending to a file that has sort information set. Appending does not reset the sort info or re-sort the file, so you need to make sure that the combined data is still correctly sorted. If you want the Raw File Source to ignore the sort flags from the file, you can manually override the sort information using the Advanced editor. The Truncate and Append option is not affected, because it replaces all data in the existing file.
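To make the risk concrete, here is a minimal Python sketch of the check you would want to hold before appending. It does not use any SSIS API; the function names and the sample key values are made up for illustration, and it assumes a single ascending integer sort key such as ProductKey.

```python
from typing import Sequence


def is_sorted(keys: Sequence[int]) -> bool:
    """Return True if the keys are in non-decreasing order."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


def append_keeps_sort(existing: Sequence[int], appended: Sequence[int]) -> bool:
    """Return True if writing `appended` after `existing` leaves the
    combined sequence sorted (hypothetical helper, not part of SSIS)."""
    if not is_sorted(existing) or not is_sorted(appended):
        return False
    if not existing or not appended:
        return True
    # The first appended key must not sort before the last key already in the file.
    return existing[-1] <= appended[0]


# Illustrative key values only.
existing_keys = [1, 2, 3, 210]
safe_batch = [211, 212, 300]   # continues after the last existing key
unsafe_batch = [5, 6, 7]       # sorted on its own, but falls inside the existing range

safe_result = append_keeps_sort(existing_keys, safe_batch)
unsafe_result = append_keeps_sort(existing_keys, unsafe_batch)
```

The second batch is the case to watch for: each batch is sorted on its own, yet the file as a whole is not, while its metadata would still claim that it is.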
A validation warning is issued when there is a sort info mismatch between the Data Flow and the Raw file you are appending to. For example, if I add another Sort to my Data Flow (this time on ProductAlternateKey) and append to the same DimProduct.raw file, I get the following warning:
The sort key position or comparison flag values for the "ProductAlternateKey" column in the appended data do not match the corresponding values for the "ProductAlternateKey" column in the existing file. Appending the data to the existing file will change the sort order information.
If you continue with the operation, the resulting file will have the sort info from the original file metadata – the additional SortKeyPosition info from the data flow is ignored.
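The behavior described here can be modeled with a small Python sketch. This is only an illustration of the rule as stated in this post, not how the component is implemented: the `SortInfo` class, both function names, and the sample values are assumptions, and the comparison flags are represented as a plain integer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SortInfo:
    """Per-column sort metadata (hypothetical model)."""
    sort_key_position: int = 0  # 0 means the column is not part of the sort key
    comparison_flags: int = 0   # placeholder integer for string comparison flags


def sort_info_matches(file_col: SortInfo, flow_col: SortInfo) -> bool:
    """Return True if the data flow column agrees with the file's column."""
    return file_col == flow_col


def sort_info_after_append(file_col: SortInfo, flow_col: SortInfo) -> SortInfo:
    """Model of the outcome described above: the file keeps its original
    sort info and the data flow's value is ignored."""
    return file_col


# ProductAlternateKey is unsorted in the existing file, but is the sort key
# in the data flow that is being appended.
in_file = SortInfo(sort_key_position=0)
in_flow = SortInfo(sort_key_position=1)

mismatch = not sort_info_matches(in_file, in_flow)
resulting = sort_info_after_append(in_file, in_flow)
```

In this model the mismatch is detected (which is when the warning would appear), and the resulting metadata is identical to what the file already had.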