How to use Partition Processing Destination Task to pump data into a SSAS partition via SSIS

To load and process the data into Analysis Services, SQL Integration Services provides two data flow components: Partition Processing Destination and Dimension Processing Destination. However, I found very few users know how to use them or use them correctly.  Some said the features look nice but can’t find any good example to follow. That makes many users stay away from using the features.

This blog, I am to going to write an example to show how to use Partition Processing Destination. In general, the same concept can be applied in Dimension Processing Destination as well. The whole example is based on the sample data set Adventure Works DW 2008.

Our example scenario:  To pump the data from a relational table directly to partition [Reseller_Orders_2001] under [Reseller Orders] measure group in Adventure Works DW

The source SQL (is taken from the partition properties):

    SELECT [dbo].[FactResellerSales].[ProductKey],[dbo].[FactResellerSales].[OrderDateKey],

[dbo].[FactResellerSales].[DueDateKey],[dbo].[FactResellerSales].[ShipDateKey],

[dbo].[FactResellerSales].[ResellerKey], [dbo].[FactResellerSales].[EmployeeKey],

[dbo].[FactResellerSales].[PromotionKey],[dbo].[FactResellerSales].[CurrencyKey],

[dbo].[FactResellerSales].[SalesTerritoryKey],[dbo].[FactResellerSales].[SalesOrderNumber],

[dbo].[FactResellerSales].[SalesOrderLineNumber],[dbo].[FactResellerSales].[RevisionNumber],

[dbo].[FactResellerSales].[OrderQuantity],[dbo].[FactResellerSales].[UnitPrice],

[dbo].[FactResellerSales].[ExtendedAmount],[dbo].[FactResellerSales].[UnitPriceDiscountPct],

[dbo].[FactResellerSales].[DiscountAmount],[dbo].[FactResellerSales].[ProductStandardCost],

[dbo].[FactResellerSales].[TotalProductCost],[dbo].[FactResellerSales].[SalesAmount],

[dbo].[FactResellerSales].[TaxAmt],[dbo].[FactResellerSales].[Freight],

[dbo].[FactResellerSales].[CarrierTrackingNumber],[dbo].[FactResellerSales].[CustomerPONumber]

FROM [dbo].[FactResellerSales]

WHERE OrderDateKey <= '20011231'

 

Steps:

1. Draft a “Data Flow Task” on to the Control Flow tab.

2. Double click the Data Flow Task to open Data Flow tab.

3. Create a OLE DB Source pointing to your Adventure Works DW 2008 relational database

4. Select “SQL Command” for access method

5. Copy and paste the SQL statement above into the dialog box

6. Close it

7. Drag the “Partition Processing Destination” Task over

8. Connect the data flow task from OLE DB source to Partition Processing task.

9. Create a data source within the destination task to point to your SSAS server.

10. Select the partition [Reseller_Orders_2001] under [Reseller Orders] measure group.

11. Select one of the processing methods. In this example, we pick Full processing

12. Map the input columns with the partition attributes (see screen shot #1)

13. At the end, we will find we don’t have any columns that can map to Geograph.Geography Key and Reseller Order Count. If we leave them unmapped (<ignore>, any attempt to execute the task, the following error is seen:

Error "Pipeline processing can only reference a single table in the data source view"

The error message itself is indeed little unclear and little misleading.

The point is: we can’t leave any attributes unmapped even the UI allows us to select <ignore> in the field but it does not mean the columns are ok to ignore.

Screen shot #1

clip_image001[10]

Then, why we don’t have a column for Geography.Geography Key and Reseller Order Count.  It is because is Geography is a referenced dimension. It does not link to the fact table directly. Reseller Order Count is a distinct count measure base on SalesOrderNumber but the column is used already in mapping to Reseller Sales Order Details.Reseller Sales Order.Key(0). The UI does not allow mapping a single column to two different attributes.

We need to do something here to work around the limitations:

1. For referenced dimension key “Geography.Geography Key”, we extend the query input include the relevant column. PLUS uncheck the materialize checkbox of the referenced dimensions, it will work around this limitation (see screen shot #2 below)

2. For Reseller Order Count, we select the same column twice but give them a different column alias.

Remark: The "materialize" is a performance feature targeting for reference dimension.  Unchecking the materialize checkbox may impact the cube query performance. You can learn more about this feature in the section of "Reference relationships" in SQL Server 2005 Analysis Services Performance Guide.

 

Screen shot #2

clip_image001[12]

We rewrite the SQL statement base on the workaround above into following: 

 

SELECT

DimGeography.GeographyKey,

[dbo].[FactResellerSales].[SalesOrderNumber] as resellerordercount,

[dbo].[FactResellerSales].[ProductKey],[dbo].[FactResellerSales].[OrderDateKey],

[dbo].[FactResellerSales].[DueDateKey],[dbo].[FactResellerSales].[ShipDateKey],

[dbo].[FactResellerSales].[ResellerKey], dbo].[FactResellerSales].[EmployeeKey],

[dbo].[FactResellerSales].[PromotionKey],[dbo].[FactResellerSales].[CurrencyKey], [dbo].[FactResellerSales].[SalesTerritoryKey],

[dbo].[FactResellerSales].[SalesOrderNumber], [dbo].[FactResellerSales].[SalesOrderLineNumber],

[dbo].[FactResellerSales].[RevisionNumber], [dbo].[FactResellerSales].[OrderQuantity],[dbo].[FactResellerSales].[UnitPrice], [dbo].[FactResellerSales].[ExtendedAmount],

[dbo].[FactResellerSales].[UnitPriceDiscountPct], [dbo].[FactResellerSales].[DiscountAmount],

[dbo].[FactResellerSales].[ProductStandardCost], [dbo].[FactResellerSales].[TotalProductCost],[dbo].[FactResellerSales].[SalesAmount],

[dbo].[FactResellerSales].[TaxAmt],[dbo].[FactResellerSales].[Freight], [dbo].[FactResellerSales].[CarrierTrackingNumber],[dbo].[FactResellerSales].[CustomerPONumber]

FROM [dbo].[FactResellerSales], DimReseller, DimGeography

WHERE

       FactResellerSales.ResellerKey=DimReseller.ResellerKey and DimReseller.GeographyKey=DimGeography.GeographyKey and

      OrderDateKey <= '20011231'

 

We replace the query in step #5 with this new SQL and then remap all the columns again (see screen shot #3)

Screen shot #3

clip_image003

Now, the task will pump the data directly from the source into the target partition.

C S John Lam | Premier Field Engineer - SQL Analysis Services