How to warm up the Analysis Services data cache using Create Cache statement?

Goal

This document describes how to build Create Cache commands. Create Cache for Analysis Services (AS) was introduced in SP2 of SQL Server 2005. It can be used to make one or more queries run faster by populating the OLAP storage engine cache first.

Some customers have found certain queries benefit other later queries. For example, ascmd.exe could be used every hour to execute all queries in a directory keeping the cache ready for subsequent user queries. The other approach, which has been used, is to create a cache query for each user query. This is feasible if the MDX query is part of a report, then one simply adds another query that has the side effect of populating the cache, thereby speeding up the next query.

The root of the problem is that during a query, the AS Server does only local optimizations. Calculations, mixed granularities, and other more complex MDX statements can result in a chatty communication between the FE (Formula Engine) and the SE (Storage engine). In AS2000 and earlier, this was also a network round trip.

By issuing a cache statement, we can populate the cache with one or more subcubes that cover the regions of cube space that the query will actually need. We often find approximately the same time taken for each subcube query, so the effect can be dramatic overall.

With this methodology, the collection of MDX queries will appear as first executing inside the storage engine, and second inside the formula engine. In addition to reducing overall time, this can make it easier to predict the effect of multi-user load testing, because the first part uses 100% of all CPUs, and the second part uses 100% of one CPU.

Summary Steps

It is an iterative process. I would describe the identification of potential scenarios where create cache would help:

  1. Run profiler
  2. Run the query looking for Non-cache SE queries (query subcube filtered by subevent = 2)
  3. Look at the total time of the query vis-à-vis of the sum of the times of the non-cache SE queries.
  4. If:
    1. They are pretty close and
    2. There are many non-cache SE queries within the same range of time (not just 1 or 2 time consuming)

then create cache might help.

There could be variations, but here are one set of steps that have been used successfully:

1. Extract all MDX queries as separate files.

2. Add a Create Cache statement to correspond for every MDX query file.

3. Run Clear Cache, Create Cache, then the user query.

4. Verify that the Create Cache is effective and improve, if necessary.

5. Work on next query

6. When done with all queries, combine the Create Cache into 1 or more Create Cache queries.

7. Verify the combined Create Cache.

Detailed Description

Below are details about each of the steps.

1. Extract MDX queries as separate files

Placing the queries in separate MDX files, it can be faster to work one by one and verify if each query is handled correctly by Create Cache before moving on.

1. Start a trace.

2. Run the report.

3. Stop the trace.

4. Extract the queries. One way is with SQLProfiler, /File /Export /Extract SQLServer Analysis Serverices Events / Extract All Queries. This creates a text file, with each query on a separate line. When queries have multiple lines this might be confusing, but is easy to add an extra line or otherwise edit. Either copy each query to a separate file, or, for every query that is being worked on, comment out the other queries.

2. Add Create Cache for every MDX query

Below is an example of a Create Cache statement.

create cache

for [MyCube]

as (

            { [USA].[Oregon], [USA].[Colorado], [USA].[Florida], [USA].[Washington] }

            * { [Measures].[mybasemeasure] }

            * { [2006].children, parallelperiod( [Time].[Year], 1, [2006].[Q1].[Jan] ), YTD( parallelperiod( [Time].[Year], 1, [2006].[Q1].[Jan] ) ) }

            * { [Products].[Shoes].children }

)

It should be apparent that it is basically a crossjoin of each dimension member that is specified in the query. Note that set expressions are allowed.

2.a. Add Specified Members

First add all members specified in the query. NOTE: Create Cache covers static analysis rather than dynamic. MDX with dynamic members will not benefit from this approach.

2.b. Add Calculated Members and Definitions

During execution, the Formula Engine can issue a subcube query which includes calculated members.

For example, if the cube has a calculated member:

            [MyDim].[Calc123] as [MyDim].[A] + [MyDim].[B]

one should include the following members in Create Cache:

            [MyDim].[A], [MyDim].[B] }

2.c. Account for Custom Rollups

During execution, there may be custom rollups that affect the calculation. For example, a custom rollup may involve QTD() or YTD(). This might be hard to detect since the actual calculations are stored in the relational database. Custom rollups can be viewed from BI Development Studio. The dimension must be processed (otherwise the calculations still only exist on the relational database).

1. Go to Solution Explorer, double-click on the dimension you want to examine.

2. Click on the Browser tab.

3. Find the icon for Member Properties, click on it. Select "Custom Rollup", "Custom Rollup Properties".

4. It will display the Custom Rollup (formula) and other properties (such as solve order) for each member.

For example, if calculations include YTD or PeriodsToDate, the member list should account for that. So, if the query includes:

            [Date].[2006].[June],

            ParallelPeriod( [Date].[2006].[June], [Date].[Year], -1 )

            YTD( [Date].[2006].[June] )

the only calculations resulting in output are [Date].[2005].[June], [Date].[2006].[Jan] ..[Date].[2006].[June]. However, because of indirect relationships, you should also add the member set:

            YTD( ParallelPeriod( [Date].[2006].[June], [Date].[Year], -1 )

3. Run Clear Cache, Create Cache, User Query

Now, it is time to run the Clear Cache statement and examine its effect. Note that the Cube ID can be specified or left blank to clear the entire database cache entries.

<!-- Can be used through ADOMD.NET ExecuteNonQuery call, or passed to ADODB as CommandText or SQL Server Management Studio XMLA query -->

<Batch xmlns="https://schemas.microsoft.com/analysisservices/2003/engine">

  <ClearCache>

    <Object>

      <DatabaseID>FoodCmp</DatabaseID>

      <CubeID>SalesCube</CubeID>

    </Object>

  </ClearCache>

</Batch>

3.a. Connection string parameters

We recommend these parameters to be added at the connection string.

Provider=msolap.3;Datasource=MyServer;Initial Catalog=MyDatabase;Timeout=300;Disable Prefetch Facts=true;Cache Ratio=1

Provider: The name of the OLE DB provider, msolap.

Datasource: The name of the server. Might be named instance like MyServerMyInstance.

Initial Catalog: The name of the database to use.

Timeout: Optional number of seconds for command timeout. This can help to avoid very long runs until the queries are optimized to run faster.

Disable Prefetch Facts: Optional new parameter to disable the Formula Engine heuristic that sends queries for possibly more data than it is requested.