What does Import from PowerPivot actually do?

We have two pieces of functionality in Denali that allow you to promote a self-service model in PowerPivot (v1 or v2) to a tabular model. One is import from PowerPivot, where you create a new tabular modeling project from a pre-existing PowerPivot workbook. Another is restore from PowerPivot (pretend this documentation exists, it will get there eventually), where you can restore a PowerPivot workbook in SSMS in exactly the same way as any other Analysis Services backup. The restore puts the metadata and the data on the VertiPaq server.

Import and Restore from PowerPivot work because the database is in the workbook. The .data file shown in the workbook is really just an old school Analysis Services Backup (or ABF) file. The basic concept for implementation is simple - crack open the workbook, get the ABF, slap it on the server, script the metadata to the BIM file, call it good.

I have received some questions before as to why we don’t have some kind of macro to do this automatically from the PowerPivot UI, since this is conceptually so simple. This import/restore process is actually quite a bit more complicated than what I just described. Let’s fire up Profiler and walk through the commands sent by the UI to see why.

First command:

 <ImageLoad xmlns="https://schemas.microsoft.com/analysisservices/2003/engine" //some namespaces redacted>
    <ddl200_200:ImagePath>C:\Program Files\Microsoft SQL Server\MSAS11.TABULAR2\OLAP\Backup\bikes.xlsx</ddl200_200:ImagePath>
    <!-- some items redacted-->                            
    <DatabaseName>bikes</DatabaseName>                              
    <DatabaseID>bikes</DatabaseID>                          
</ImageLoad>

This ImageLoad command is the same as when you restore an ABF. This is taking the workbook and applying its database to the server. So far, so good. Let’s look at the next major command:

 <Alter AllowCreate="true" ObjectExpansion="ExpandFull" xmlns="https://schemas.microsoft.com/analysisservices/2003/engine">

    <Object>
            <DatabaseID>bikes</DatabaseID>
            <DataSourceID>6433d27a-6c36-4fbb-a1fa-0487baf54eec</DataSourceID>
    </Object>

    <ObjectDefinition>            
        <DataSource xsi:type="RelationalDataSource" //namespaces redacted>
              <ID>6433d27a-6c36-4fbb-a1fa-0487baf54eec</ID>
              <Name>SqlServer . AS_AdventureWorksDWBikeSales</Name>
              <Annotations>
                <Annotation>
                  <Name>ConnectionEditUISource</Name>
                  <Value>SqlServer</Value>
                </Annotation>
              </Annotations>
              <ConnectionString>Provider=SQLNCLI11;Data Source=.;Initial Catalog=AS_AdventureWorksDWBikeSales;Integrated Security=SSPI;Persist Security Info=false</ConnectionString>
              <ImpersonationInfo>
                <ImpersonationMode>ImpersonateServiceAccount</ImpersonationMode>
              </ImpersonationInfo>
              <Timeout>PT0S</Timeout>
          </DataSource>
      </ObjectDefinition>
        
</Alter>

What this is doing is changing the impersonation settings on the data source. PowerPivot workbooks only support ImpersonateCurrentUser. What that means – whenever you refresh a PowerPivot workbook in Excel, the engine uses the currently logged on user’s credentials to connect to the data source and fetch the data. ImpersonateCurrentUser is not supported for Analysis Services instances running in tabular mode, because the idea of having an interactive user session for processing data doesn’t make sense. Therefore, since we have to pick a new impersonation setting, we fall back in this case to impersonating the service account so you’re not prompted for credentials on load.

Next major step (this is the first in a set of related commands):

 <Alter AllowCreate="true" ObjectExpansion="ObjectProperties" xmlns="https://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>bikes</DatabaseID>
    <DataSourceID>PushedDataSource-F052E9FD-98DA-441C-A0C0-B84DA82E5F25</DataSourceID>
  </Object>
  <ObjectDefinition>
    <DataSource xsi:type="OlapDataSource" //namespaces redacted>
      <ID>PushedDataSource-F052E9FD-98DA-441C-A0C0-B84DA82E5F25</ID>
      <Name>PushedDataSource-F052E9FD-98DA-441C-A0C0-B84DA82E5F25</Name>
      <ConnectionString>.</ConnectionString>
      <Timeout>PT0S</Timeout>
    </DataSource>
  </ObjectDefinition>
</Alter>

This command creates a new data source for your copy/pasted or linked table data. In PowerPivot, the copy/pasted or linked table data is passed along using an out-of-line binding, and not using an explicit data source. This makes the pasted data transient. If we were to continue to use this out-of-line binding without doing any sort of transformation, import from PowerPivot would succeed (as you are simply restoring the data from the image), but deploy or script out of the database would fail to pass along the data.

So what we do is we “upgrade” the copy/pasted or linked table data to use an explicit data source in the workbook. The next Alter (which I won’t paste here) goes and sticks the data in a schema rowset, which is serialized in the BIM file. The next Alter (which again I won’t paste) goes and fixes up the dimension after you are done.

Here’s the last interesting command, this time we alter the cube (even truncated this command is huge, sorry):

 <Alter AllowCreate="true" ObjectExpansion="ExpandFull" xmlns="https://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>bikes</DatabaseID>
    <CubeID>Model</CubeID>
  </Object>
  <ObjectDefinition>
    <Cube //namespaces redacted>
      <ID>Model</ID>
      <Name>Model</Name>
      <Annotations>...</Annotations>
      <Language>1033</Language>
      <Dimensions>...</Dimensions>
      <MeasureGroups>...</MeasureGroups>
      <Source>...</Source>
      <MdxScripts>
        <MdxScript>
          <ID>MdxScript</ID>
          <Name>MdxScript</Name>
          <Commands>
            <Command>
              <Text>
                CALCULATE;
                CREATE MEMBER CURRENTCUBE.Measures.[__No measures defined] AS 1;
                ALTER CUBE CURRENTCUBE UPDATE DIMENSION Measures, Default_Member = [__No measures defined];
              </Text>
            </Command>
            <Command>
              <Text>
                ----------------------------------------------------------
                -- PowerPivot measures command (do not modify manually) --
                ----------------------------------------------------------


                CREATE MEASURE 'BikeSales'[Sum of SalesAmount]=SUM('BikeSales'[SalesAmount]);
                CREATE MEASURE 'BikeSales'[Count of SalesOrderNumber]=COUNTA('BikeSales'[SalesOrderNumber]);
                CREATE MEASURE 'BikeSales'[Distinct Count of SalesOrderNumber]=DISTINCTCOUNT('BikeSales'[SalesOrderNumber]);
                CREATE MEASURE 'BikeSales'[Sum of SalesAmount 2]=SUM('BikeSales'[SalesAmount]);
                CREATE MEASURE 'BikeSales'[Average of SalesAmount]=AVERAGE('BikeSales'[SalesAmount]);
                CREATE MEASURE 'Bike'[Measure 1]=COUNTROWS(Bike);
              </Text>
            </Command>
          </Commands>
          <CalculationProperties>
            <CalculationProperty>
              <Annotations>
                <Annotation>
                  <Name>Type</Name>
                  <Value>Implicit</Value>
                </Annotation>
                <Annotation>
                  <Name>Format</Name>
                  <Value>
                    <Format Format="Currency" Accuracy="2" ThousandSeparator="True" xmlns="">
                      <Currency LCID="1033" DisplayName="$ English (United States)" Symbol="$" PositivePattern="0" NegativePattern="0"/>
                    </Format>
                  </Value>
                </Annotation>
                <Annotation>
                  <Name>RefCount</Name>
                  <Value>1</Value>
                </Annotation>
                <Annotation>
                  <Name>Column</Name>
                  <Value>SalesAmount</Value>
                </Annotation>
                <Annotation>
                  <Name>Aggregation</Name>
                  <Value>Sum</Value>
                </Annotation>
              </Annotations>
              <CalculationReference>[Sum of SalesAmount]</CalculationReference>
              <CalculationType>Member</CalculationType>
              <FormatString>'\$#,0.00;(\$#,0.00);\$#,0.00'</FormatString>
              <ddl300:VisualizationProperties>
                <ddl300:IsSimpleMeasure>true</ddl300:IsSimpleMeasure>
              </ddl300:VisualizationProperties>
            </CalculationProperty>
            
            <CalculationProperty>
              <Annotations>
                <Annotation>
                  <Name>Type</Name>
                  <Value>User</Value>
                </Annotation>
                <Annotation>
                  <Name>Format</Name>
                  <Value>
                    <Format Format="General" xmlns=""/>
                  </Value>
                </Annotation>
              </Annotations>
              <CalculationReference>[Measure 1]</CalculationReference>
              <CalculationType>Member</CalculationType>
              <FormatString>''</FormatString>
            </CalculationProperty>
            <CalculationProperty>
              <CalculationReference>Measures.[__No measures defined]</CalculationReference>
              <CalculationType>Member</CalculationType>
              <Visible>false</Visible>
            </CalculationProperty>
          </CalculationProperties>
        </MdxScript>
      </MdxScripts>
      <StorageMode valuens="ddl200_200">InMemory</StorageMode>
    </Cube>
  </ObjectDefinition>
</Alter>

The command above fixes up the measures. PowerPivot workbooks have two different types of measures – explicit measures and implicit measures. Explicit measures are the ones where users entered by hand a DAX formula, either from the PowerPivot field list or from the measure grid. Implicit measures, by contrast, are created automatically by PowerPivot when users drop a field into the Values area of the PowerPivot field list. PowerPivot has some smarts and treats these two measures differently. However, the tabular designer has no concept of implicit or explicit measures – all it knows are explicit measures. So during the import process, any implicit measures created are converted to explicit measures. The measures are tagged as to their origin, so we do things like hide unused implicit measures.

After all this business is done, we process tables that need processing. 

Finally there are some other actions that are not captured in this particular explanation. If you upgrade from PowerPivot v1 to a tabular model (instead of doing PowerPivot Denali->Tabular model), we will go through the upgrade codepath and upgrade relationships, etc. Also, we restore the view state settings for the model as well, so you see the same measure grid settings, diagram settings, etc.

Simple, right? Moral of this story – it’s not a trivial task to write this in a macro or encapsulate this logic into AMO. Let the UI do the heavy lifting for now.

There is one last task for you to do on your original workbook before you are done. You will need to update all your pivot tables and slicers to point to the new data source. I have a trick for this, I will post it later.

[Update 8/17/2011 - the trick is posted here]