Using ProcessingGroup Dimension property option ByTable vs. ByAttribute may error with string keys

Author: Nicholas Dritsas

Reviewers: Thomas Kejser, Stuart Ozer

In SSAS 2005 and later, there is a dimension property called ProcessingGroup. It has two values; ByAttribute (default) and ByTable.

 

When you use ByAttribute, SSAS will send a SELECT DISTINCT query to the relational engine for each attribute PLUS, at the end, an additional SELECT DISTINCT that combines all the attributes plus key. As you can imagine, if you have several attributes and a very large dimension table, this process can take a while. Our customer in this case has a 100 million members dimension table in Oracle.

 

Using the option ByTable, SSAS sends one table scan query to the relational engine and temporarily caches in memory the results. This can work well only if you have enough memory for the cache (see related blog for potential issues here).

Now, even if you have enough memory, you may encounter a different problem. If some of the attribute keys are strings, you may hit the 4gb limit of SSAS for the string store. This limitation is faced typically on an attribute level, if you process ByAttribute. But, since we process ByTable here, the limit applies to the whole table. So, if you have a very large dimension table, you will hit this limit even if no single attribute size is more than 4gb.

 

There are a few workarounds, but, the problem of having a very large dimension with many attributes that have string keys is still hard to manage currently:

 

1. ByAttribute and configure relational engine to perform better by adding indexes. For more details, please look at the SQL Server 2008 Analysis Services Performance Guide here.

2. ByTable and reduce the dimension string sizes at the key.