Proper partitioning can improve dramatically the writeback process when dealing with large data sets
Author: Nicholas Dritsas
Reviewers: Richard Tkachuk, Akshai Mirchandani
Customer is using SQL 2008 and SSAS’s writeback abilties to do 52-weeks rolling sales forecasting. Their fact table has reached 250 million rows and they want to add 40 million records into the writeback table each week using weight allocation when updating 2 measures and executing 20,000 update cube statements per week. This process takes 11 hours currently using MOLAP writeback storage. They have 5 dimensions, with the biggest been the item dimension with 130,000 members and 7 levels. They do not have properly defined aggregations or attribute relationships yet. Overall, as number of records in writeback table is increased, performance is progressively slower. The number of records in a writeback partition will have an impact on both query performance and writeback performance.
Suggestion was around proper partitioning. Transferring rows from the writeback partition into multiple regular partitions is likely to help only from the standpoint of standard partitioning optimization techniques. All relevant partitions (and not just the writeback partition) will be scanned for the deltas. The Writeback partition usually doesn’t have good partition slices, and by distributing records to other partitions that have good partition slices you may get better performance. For example, if you move all updates for Year 2008 into the Year 2008 partitions – now any subsequent updates to Year 2009 cells will avoid unnecessarily scanning all of those records.
The number of records in the writeback partition will have an impact on both query performance and writeback performance – particularly if the partition is ROLAP. However, if this partition is MOLAP then it should behave very much like a standard MOLAP partition for most situations – although 250 million rows will still probably result in perf slowdowns to calculate the deltas at the leaf level.
By doing the following operations, we improved writeback performance from 11 hours to 2 hours 46 minutes.
1. We introduced partitions to the fact and writeback tables. We aligned them so they correspond to each other. We added partitions for years 2005, 2006, 2007 and 2008 plus 9 partitions for 9 weeks todate for 2009. So, a total of 13 partitions for fact and 13 for writeback tables. The 2005, 2006, 2007 and 2008 partitions were done using a WHERE clause against the main single table. For 2009, we have a separate physical table per week. ETL process will be creating a new weekly table and SSAS partition each week.
2. We applied time slicer to all partitions, even if they were MOLAP.
Profiler result for one update cube statement.
A. Before : When writeback was executed to 200908, update cube accessed three partitions:
1. 200908 Partition
2. All Partition
3. Writeback Partition
B. After : When writeback is executed to 200909, update cube accessed two partitions:
1. 200909 Partition
2. 200909 Writeback Partition