PowerPivot Component Architecture

(excerpt from a pending “Microsoft SQL Server PowerPivot Planning and Deployment” technical white paper that will be published in April/May, 2010, reviewed by Daniel Yu, Dave Wickert, Denny Lee, Donald Farmer, James Howey, John Hancock, )

Executive Summary

Microsoft® SQL Server® PowerPivot is an innovative data analysis technology that redefines how organizations of all kinds deliver and succeed with business intelligence (BI). The focus shifts from IT delivering corporate BI solutions to a managed BI collaboration environment that gives users the power to get timely and reliable information to make more relevant decisions. PowerPivot does not replace corporate BI, but complements it with managed, self-service solutions.

Providing business insights to all employees means giving producers of intelligence access to the best data analysis tools and reliable access to trustworthy data, as well as facilitating knowledge-sharing and collaboration within teams of producers and consumers of intelligence and across departmental boundaries. It means leveraging the network of power users in each department, the ones who create departmental and team solutions, assist colleagues in ad-hoc analysis, and, when a centralized solution is best, communicate BI requirements back to the IT department. It also means implementing reliable BI monitoring and management processes to ensure availability and performance for mission-critical, self-service BI applications. PowerPivot enables organizations to extend the reach of BI in the enterprise from corporate to team and individual spaces, while at the same time increasing IT management and operations efficiency.

PowerPivot integrates with Microsoft Office Excel® 2010 to give users unmatched computational power for advanced data analysis with a familiar user interface. PowerPivot also integrates with Microsoft SharePoint® Server 2010 to establish a managed, self-service BI environment that takes advantage of all the standard SharePoint features, such as role-based security, compliance policies, workflows, and versioning, and introduces new features and capabilities, such as PowerPivot Gallery and automatic data refreshing for shared workbook applications. And PowerPivot features a management dashboard that lets users monitor shared applications, track usage patterns over time, drill down to reveal hidden details, discover mission-critical solutions, and make sure appropriate server resources are provisioned.

Establishing a managed, self-service BI environment entails deploying the PowerPivot for Excel 2010 add-in on workstations running Microsoft Office 2010 and PowerPivot for SharePoint 2010 on SharePoint application servers. These deployments can be performed independently. The PowerPivot for Excel 2010 add-in does not require a SharePoint environment and the SharePoint environment does not require PowerPivot for Excel 2010 on all workstations. The Excel add-in is a requirement only for those creating and publishing workbook applications. Other users can access published workbook applications in SharePoint via a Web browser, with the same performance and most of the features as the Excel client. However, planning and coordination are required to determine the best deployment sequence and configuration options, provision adequate storage capacities and system resources, and optimize the managed BI collaboration environment for high availability and performance.

This white paper contains information for technical decision makers, IT administrators, and system architects who are planning to deploy PowerPivot technology in an enterprise environment. This paper assumes the audience is already familiar with the Windows Server® operating system, Microsoft Office, SharePoint, SQL Server, online analytical processing (OLAP), and self-service BI. A high-level understanding of the new features and technologies in Microsoft SQL Server 2008 R2 is also helpful. Detailed product information is available in the MSDN® Library for SQL Server 2008 R2 at https://go.microsoft.com/fwlink/?LinkId=181772.

Introduction

Microsoft Office Excel has long been the dominant data analysis tool in the enterprise. Despite the availability of alternative solutions, employees and decision makers continue to favor Excel because the user interface is familiar and the analysis features are comprehensive and intuitive. Workbooks and spreadsheets are easy to use. Without having to wait weeks or months for IT to deliver BI solutions, users can import data from virtually any source directly into a workbook, process the data using Excel formulas and other data manipulation capabilities, and analyze the data in a variety of ways using PivotTables, PivotCharts, and SQL Server Data Mining add-ins, and so forth. In this way, Excel has served as the basis for self-service BI for more than a decade.

“The way in which Excel particularly can serve as a front end to data warehouses has been dramatically enhanced, and I think you'll find that particularly interesting,” Steve Ballmer, Launch of Microsoft SQL Server 7.0 at COMDEX/Fall ’98, November 16, 1998.

Now, PowerPivot for Excel 2010 takes the self-service BI capabilities of Excel to an unprecedented level. As a separate add-in, PowerPivot exploits the familiarity of Excel while adding an in-memory BI engine and new compression algorithms to load even the biggest data sets into memory. Users can process enormous quantities of data with incredible speed. Processing millions of rows takes about the same time as processing thousands, and by using Data Analysis Expressions (DAX) in addition to standard Excel features, power users can easily create advanced workbook applications that rely on data relationships between tables as in a database, include calculated columns and measures, and aggregate over billions of rows. In many cases, PowerPivot for Excel 2010 can establish the table relationships automatically. Workbooks become more powerful and more mission-critical than ever.

However, bigger, more powerful, mission-critical workbook applications pose significant challenges for the IT department. It is vital to avoid the proliferation of spreadmarts in the enterprise, and to ensure data consistency, integrity, security, and compliance across the myriad of user-generated workbooks. Perhaps even more fundamental, it is crucial to be able to discover mission-critical workbook applications and keep an eye on performance, availability, and quality of service. It is not uncommon for user-generated spreadsheets to evolve into mission-critical business applications, and for IT departments to remain unaware of their existence until an update or change in the underlying data sources breaks these spreadsheets and users turn to IT for troubleshooting with mission-critical urgency. Another key issue is maintenance, as conflicting and outdated data in spreadsheets contributes to misleading analysis, confusing results, and complications in decision-making. Clearly, a managed BI collaboration environment is required to meet the self-service BI challenges with operational efficiency.

PowerPivot integrates with SharePoint Server 2010 as a reliable platform for building the managed BI collaboration environment. Among other capabilities, SharePoint facilitates seamless and secure sharing and collaboration on user-generated workbook applications, while PowerPivot provides the management tools and usage data that put IT in control of spreadmarts. Now the IT department can determine who is using shared workbook applications, when, how often, and with what client application. Perhaps even more importantly, the IT department can also analyze the data source information to see where users are importing their data from. By supporting automatic data refreshing for shared workbooks, PowerPivot helps to ensure consistent and accurate information in self-service BI applications. It is even possible to use shared workbooks as data sources for further analysis, similar to a SQL Server Analysis Services cube. By delivering SharePoint-based BI management tools, PowerPivot makes it possible for IT departments to establish efficient monitoring processes, discover mission-critical workbook applications proactively, and apply appropriate maintenance procedures.

Benefits and Advantages

The success of BI in the enterprise depends on the ability of the BI environment to satisfy user needs, IT needs, and business requirements. Users want to work with familiar tools and need clear and accurate answers to their questions. IT must ensure security and compliance with reliable, scalable, and fast-performing systems that deliver accurate data to users, facilitate information and analysis sharing, and make it easy to collaborate on user-generated BI solutions. Business requirements, on the other hand, are concerned with achieving a positive return on investments (ROI) quickly, reducing operations costs, and increasing business agility and productivity. PowerPivot—unlike any other technology—enables organizations to address these needs and requirements with familiar tools and efficient processes.

Table 1 summarizes the benefits and advantages of PowerPivot technology.

Table 1: PowerPivot Benefits and Advantages

User-Related

IT-Related

Business-Related

· Maximized utilization of familiar Excel features, such as the Office Fluent™ user interface, PivotTables, PivotCharts, and the new Slicers feature.

· Fast calculations and advanced analysis capabilities, such as through automatically established data relationships and DAX expressions, which make actionable insight readily accessible to everyone.

· More and faster answers by combining massive amounts of data from a multitude of sources, including relational databases, multidimensional sources, cloud services, data feeds, Excel files, and text files, in the corporate network and on the Internet.

· Access to relevant information virtually anytime and from any location through PowerPivot galleries in SharePoint.

· Increased IT efficiency associated with monitoring and managing mission-critical self-service BI applications in the enterprise.

· Increased consistency, integrity, security, and compliance, reliability, and scalability for self-service BI applications based on standard SharePoint-based features.

· High degree of data accuracy for decision-making through automatic data refreshing.

· Reduced IT backlog by delegating BI support responsibilities to power users in each department.

· Increased business agility through better, faster, more relevant decisions.

· Increased employee and team productivity through shared self-service BI applications.

· Maximized ROI into SQL Server 2008 R2, SharePoint Server 2010, and Office 2010.

· Reduced operations costs associated with maintaining and supporting self-service BI applications in the enterprise.

PowerPivot Architecture and Design

SQL Server PowerPivot comprises client and server technology. On the client side, power users work with PowerPivot for Excel 2010 to create and work with spreadsheets and workbooks. On the server side, PowerPivot for SharePoint 2010 adds content types, administration and application pages, dashboards, timer jobs, and Web services, and integrates with Excel Services to establish a full-featured BI collaboration environment. Excel Services gives business users the ability to view shared PowerPivot workbooks in a Web browser, any time and at any place with network connectivity.

PowerPivot for Excel 2010

PowerPivot for Excel 2010 is an application-level Excel add-in that implements advanced data analysis features, enhances the user experience through ribbon customizations and spreadsheet templates, and overrides the default PivotTable field list to implement its own task pane, thereby enabling the functionality of OLAP PivotTables and PivotCharts without requiring SQL Server Analysis Services cubes.

Application Architecture

Figure 1 shows the PowerPivot for Excel 2010 architecture, which relies on an add-in assembly, a VertiPaq engine, and a VertiPaq database module. The add-in assembly loads PowerPivot into the Excel application process. The VertiPaq engine performs query processing and implements a column-based data store with efficient compression algorithms to get massive amounts of data directly into memory. With all the data in memory, PowerPivot can perform its query processing, data scans, calculations, and aggregations without having to go to disk. As a consequence of working with column-based data and avoiding the overhead of disk I/O operations, PowerPivot for Excel 2010 achieves a very high analysis performance.

image

Figure 1: PowerPivot for Excel 2010 Architecture

The PowerPivot application architecture includes the following main components:

· PowerPivot add-in assembly   Loads PowerPivot into Excel. PowerPivot then interacts with Excel through the Component Object Model (COM) interop assembly of Excel, calls into the Excel object model, and responds to events and callbacks raised in Excel.

· Registry settings and manifest files   Enable Excel 2010 to load PowerPivot at run time. The registry settings under HKEY_CURRENT_USER\Software\Microsoft\Office\Excel\Addins point to the deployment and application manifests, which are Extensible Markup Language (XML) files that describe how to load the add-in assembly.

· VertiPaq engine   Processes PivotTable and PivotChart queries and DAX expressions. The VertiPaq engine leverages the richness of Analysis Services object models internally, but without imposing dimensional modeling concepts on users.

· VertiPaq database   Stores the PowerPivot data. The VertiPaq module maintains its database in a custom XML part within the Excel workbook. Note that you can locate the VertiPaq database in an Excel file by renaming the workbook using the .zip file extension, opening the .zip file in Windows® Explorer, and then navigating to the xl\customData folder. For technical details, refer to the blog post “For Excel PowerPivot, the Database is IN the Workbook” at https://go.microsoft.com/fwlink/?LinkId=181771.

· Data cache   Stores temporary files during save operations in the %TEMP% directory. With the exception of this temporary data cache, PowerPivot maintains no data outside the Excel workbook. For details, refer to the blog post “Understanding why an Excel PowerPivot Workbook is so Large” at https://go.microsoft.com/fwlink/?LinkId=181773.

· Microsoft OLE DB for Online Analytical Processing (OLAP) Exposes a set of COM objects and interfaces to access multidimensional data stores. The OLE DB for OLAP provider can connect to SQL Server Analysis Services as well as PowerPivot workbooks on the local computer or published in a SharePoint farm.

· Analysis Management Objects (AMO) and ADOMD.NET   Communicate with a running instance of SQL Server Analysis Services or PowerPivot by using XML for Analysis (XMLA). AMO exposes management objects and ADOMD.NET enables client applications to retrieve data and metadata from Analysis Services and PowerPivot data sources.

Note: PowerPivot features a document-centric view of the data, meaning all imported data and connection information for external data sources reside in the Excel workbook.

Hardware and Software Requirements

PowerPivot benefits from multi-core processors, large memory and storage capacities, and a 64-bit operating system on the client computer. The x64 platform removes the 4 GB limitation for addressable memory. Disk storage is important because PowerPivot workbooks with hundreds of millions of rows can quickly exceed several hundred megabytes in size, even though column-based compression helps to reduce the data volume in comparison to the source data. Users also benefit from a fast computer network when importing data and uploading workbooks to PowerPivot galleries and libraries in SharePoint.

Note: The maximum file size of a PowerPivot workbook is 2 GB. There are no restrictions on the amount of data users can import into a workbook, but workbooks exceeding the maximum file size can’t be saved. A 2-GB workbook typically corresponds to a 4-GB dataset, considering a 2:1 compression ratio.

Table 2 summarizes minimum and recommended system configurations for PowerPivot for Excel 2010 on client computers.

Table 2: PowerPivot for Excel 2010 System Requirements

Component

Minimum

Recommended

Processor

500 megahertz (MHz) processor

Dual-core processors faster than 2.6 gigahertz (GHz)

Random Access Memory (RAM)

1 GB

4 GB or more

Disk

NTFS file system–formatted partition with a minimum of 3 GB of free space

NTFS file system–formatted partition with 40 GB of free space for workbook files

Display

1024 × 768

1024 × 768 or higher resolution monitor

Network

56 kilobits per second (Kbps)

1 gigabit per second (Gbps)

Operating System

Microsoft Windows XP with Service Pack 3 (32 bit)

Microsoft Windows Vista® or Microsoft Windows 7 Business Edition or Ultimate Edition (64-bit)

Software

Microsoft Excel 2010

Microsoft .NET Framework 3.5 SP1

Microsoft Office 2010 Microsoft .NET Framework 4.0

Note: Performance tests show that PowerPivot for Excel 2010 can load more than 100 million rows and maintain adequate processing performance with 2 GB of memory. However, test results vary depending on the compressibility of the data. For fastest processing performance, Microsoft recommends multi-core processors and more than 4 GB of RAM.

PowerPivot for SharePoint 2010

While PowerPivot for Excel 2010 extends the capabilities of Excel on the desktop, PowerPivot for SharePoint 2010 complements the capabilities of Excel Services in a SharePoint farm. Excel Services provides the foundation to share workbooks as interactive, Web-based reporting tools. Excel Services also supports integration with other BI features and technologies, such as SharePoint Report Center, SQL Server Reporting Services, and Microsoft PerformancePoint® Services 2010. PowerPivot also adds its own front-end Web service so that PowerPivot for Excel 2010 and other client applications can access PowerPivot data in SharePoint directly.

System Architecture

Figure 2 shows the system architecture of PowerPivot for SharePoint 2010 in relationship to Excel Services and typical client applications. As a service application, PowerPivot is fully integrated into the SharePoint infrastructure for hosting services.

image

Figure 2: PowerPivot for SharePoint 2010 Architecture

The PowerPivot for SharePoint architecture relies on the following main components:

· Analysis Services Client Libraries   Enable client applications to communicate with the PowerPivot Web service. Clients must use the SQL Server 2008 R2 version of the client libraries to support PowerPivot connections.

· Analysis Services Service   Loads the VertiPaq engine to manage the data within PowerPivot workbooks retrieved from SharePoint content databases. The PowerPivot system service extracts the database from the Excel workbook, selects an appropriate SharePoint application server, which runs the Analysis Services service as well as an instance of the PowerPivot system service, and then attaches the database to the Analysis Services instance without requiring Excel 2010 on the server.

Note: On a SharePoint server, an Analysis Services instance can only load VertiPaq databases for in-memory BI processing. Analysis Services in VertiPaq mode does not support traditional OLAP processing modes. Although it is possible to install additional Analysis Services instances on the same server to support traditional OLAP processing, Microsoft does not recommend this configuration because of the differences in workload patterns and resource consumption.

· Content Database   Stores the data, documents, and workbooks associated with SharePoint lists, libraries, and sites. SharePoint maintains all PowerPivot workbooks in content databases. Analysis Services instances can retrieve the workbooks by using the SharePoint object model. The workbook URL identifies the SharePoint content item.

· Data Cache   Maintains workbooks in memory and on disk in the %PROGRAMFILES%\Microsoft SQL Server\MSAS10_50.Powerpivot\OLAP\Backup folder to increase system performance by eliminating roundtrips to the SharePoint content database.

· Excel Calculation Services   Loads PowerPivot workbooks in much the same way as other Excel workbooks, calls user-defined functions if necessary, and refreshes external data as needed, such as when a user interacts with PivotTables or PivotCharts via slicers. The connection information for PowerPivot data points to the PowerPivot OLE DB provider, which enables Excel Calculation Services to request the data from the PowerPivot system service. The DataSource property for VertiPaq databases is listed as $EMBEDDED$.

· Excel Web Access   Renders workbooks on a Web page and allows the user to interact with those workbooks and explore them in Microsoft Internet Explorer®. Excel Web Access does not require Excel on the client computer. The Excel Web Access Web part renders the workbook using Excel Calculation Services.

· Excel Web Services   Provides programmatic access to workbooks. Business applications can calculate, set, and extract values from workbooks, refresh external data connections, and create application-specific user interfaces around shared workbooks.

· PowerPivot Application Database   Stores configuration settings, connection information, server status, and schedule information for data refresh operations.

· PowerPivot Gallery   Implements custom content types and Microsoft Silverlight™-enabled application pages for document libraries to visualize shared workbooks and Reporting Services applications interactively.

· PowerPivot Management Dashboard   Facilitates centralized BI monitoring and management in a PowerPivot for SharePoint environment.

· OLE DB Provider   Enables Excel Calculation Services and other processes, such as Reporting Services and PerformancePoint Services, to locate and establish connections to a PowerPivot system service in the local farm or a PowerPivot Web service in a remote farm. The PowerPivot transport and connection logic is directly integrated into the Analysis Services OLE DB provider of SQL Server 2008 R2, which also provides connectivity to SQL Server Analysis Services (SSAS). The OLE DB provider uses the IsHosted flag to distinguish between SSAS and PowerPivot clients and relies on Windows Communication Foundation (WCF) to communicate with a PowerPivot system service in the local farm or Hypertext Transfer Protocol (HTTP) to communicate with PowerPivot Web services in remote farms.

· PowerPivot System Service   Uses Microsoft ADOMD.NET to communicate with an Analysis Services instance. The PowerPivot system service also manages the application database, monitors server health, coordinates client requests for load balancing, collects usage data, and performs automatic data refreshing for PowerPivot workbooks.

· PowerPivot Web Service   Exposes shared PowerPivot workbooks via XML for Analysis (XML/A) as data sources to external applications, such as Excel or Microsoft Report Builder 3.0. Support of XML/A enables any client application that uses Multidimensional Expressions (MDX) and the Analysis Services OLE DB provider to query PowerPivot data in workbook applications hosted in a SharePoint farm.

· Service Applications   Provide backend services for Web applications. SharePoint 2010 no longer uses Shared Services Providers (SSP). Instead, SharePoint 2010 uses an Internet Information Services (IIS) Web site called SharePoint Web Services, which hosts all service applications in one or multiple application pools for security and process isolation. Default and custom service application groups associate service applications with Web applications.

· Web Applications   Provide IIS sites and application pools, security identities, and isolation for front-end processes on a SharePoint server. A single Web application can host multiple site collections and each site collection can have the PowerPivot feature activated or deactivated individually.

· SharePoint Timer Jobs   Run scheduled processes to collect and process usage information, refresh data in shared workbooks, gather health statistics, and update the Management Dashboard workbook with data from the PowerPivot application database. Table 3 summarizes the timer jobs that PowerPivot adds to the SharePoint configuration.

Table 3: PowerPivot-Specific Timer Jobs

Timer Job

Default Schedule

Description

Usage Data Import timer job

Hourly

Collects usage information from PowerPivot components and imports the data into the centralized SharePoint Usage Collection database.

Usage Data Processing timer job

Daily

Processes the usage information from the centralized SharePoint Usage Collection database and imports the normalized data into the PowerPivot application database.

PowerPivot Data Refresh timer job

Minutely

Determines if a data refresh job needs to be scheduled and executed.

PowerPivot Health Statistics Collector timer job

Every 15 minutes

Collects statistics regarding the CPU and memory utilization of PowerPivot components and imports the data into the PowerPivot application database.

PowerPivot Management Dashboard Processing timer job

Daily

Updates the Management Dashboard workbook based on the data in the PowerPivot application database. This timer job must be scheduled to run after the Usage Data Processing timer job.

PowerPivot Setup Extension timer job

Hourly

Verifies that all installed PowerPivot components are up-to-date and running and checks whether there are PowerPivot components that need to be patched after SQL Server patching runs.

Note: Microsoft SQL Server Management Studio can connect to an Analysis Services instance on a SharePoint server and list the VertiPaq databases loaded on demand when users interact with the workbooks. However, Microsoft does not support the use of SQL Server Management Studio for Analysis Services administration on a SharePoint server. Use SharePoint Central Administration and PowerPivot Management Dashboard instead.

PowerPivot Farm Topologies

SQL Server PowerPivot supports all SharePoint farm topologies, ranging from standalone server deployments to large multi-server installations. A standalone server hosts all Web applications, service applications, and SQL Server databases on a single computer. This configuration is suitable for evaluation purposes and small workgroup environments. In most environments, however, it is more advantageous to deploy separate front-end servers and application servers. Figure 3 illustrates this concept.

image

Figure 3: PowerPivot for SharePoint in a Multi-Server Farm Environment

By scaling out with multiple application servers, an organization can support SharePoint, SQL Server, and PowerPivot with higher availability, increased performance, better manageability, and greater security. PowerPivot can run together with other components and services on the same SharePoint server as long as the computer is not configured as a dedicated front-end server for Web applications or SharePoint Central Administration. It is also possible to deploy dedicated PowerPivot application servers that do not run other service applications, which is especially useful in environments with a large number of workbooks and long-running queries.

In farms with multiple PowerPivot servers, PowerPivot automatically performs load-balancing in a round-robin fashion or based on server health, depending on the system configuration. Each PowerPivot service application maintains an internal list of available servers and datasets in the farm, which PowerPivot uses as follows:

1. Data is already loaded.    The PowerPivot service application checks its internal list of active PowerPivot datasets to determine an Analysis Services instance in the farm that already has the data loaded into memory. If this is the case, PowerPivot passes the requests to the PowerPivot system service responsible for the selected server. This PowerPivot system service then passes the request to its local Analysis Services instance. This mechanism ensures that only one copy of a workbook is loaded across the entire farm.

2. Data is not loaded but it is cached.    The PowerPivot service application checks its internal list of cached PowerPivot datasets that are already on the file system. If the same PowerPivot dataset is cached on multiple servers, the PowerPivot service application will check server health and build a list of servers that have sufficient resources to accept the request. From that list, a server is picked at random.

3. Data is neither loaded nor cached.    If the data must be retrieved from the content library, PowerPivot uses an allocation scheme to select the server for the request. The PowerPivot service application then issues a load request to the selected Analysis Services instance, passing the SharePoint URL of the PowerPivot workbook.

Note: You can create multiple PowerPivot service applications in a farm to isolate service processing and configuration settings for different Web applications, but you cannot restrict a PowerPivot service application to a specific subset of PowerPivot servers. A PowerPivot service application always includes all PowerPivot servers in the local farm. Deploying additional PowerPivot servers increases the processing capacity of the entire farm.

PowerPivot Service Associations

As a service application, PowerPivot takes advantage of the SharePoint 2010 infrastructure for hosting services. This provides administrative flexibility in centralized SharePoint environments that host site collections and sites for multiple departments and teams with different processing and configuration requirements. It is even possible to configure Web applications to use services from different farm environments. Services that can be shared across server farms include the Secure Store Service, User Profile, Managed Metadata, Business Data Connectivity, Web Analytics, and Search. Note, however, that Excel Services and PowerPivot for SharePoint are single-farm services. These services cannot be shared across farms. They must reside in the farm that hosts the content.

image

Figure 4: PowerPivot for SharePoint and Service Associations

Figure 4 illustrates how Web applications in a SharePoint farm can share the same PowerPivot service application, while another Web application uses a custom service association that excludes the PowerPivot service application. This configuration can be useful for Web applications of departments that have not yet completed their onboarding to the managed BI collaboration environment. As soon as a department completes the onboarding process, a farm administrator can change the service associations to include the PowerPivot service application so the department can use PowerPivot in the local farm. The section “PowerPivot Deployment Planning” later in this white paper covers BI onboarding processes in more detail.

Note: PowerPivot Management Dashboard requires a PowerPivot service application in the default service connection group. Configurations that exclude PowerPivot for SharePoint from the default group are not supported.

System Requirements

Table 4 lists general PowerPivot for SharePoint system requirements. These requirements can serve as a starting point to build a reference system for evaluation purposes. Microsoft recommends performance testing on reference systems prior to any production deployment.

Table 4: PowerPivot for SharePoint 2010 System Requirements.

Component

Minimum

Recommended

Processor

2 dual-core processors

2 dual-core processors for commodity enterprise servers or 4 quad-core processors if the organization standardizes on quad-core servers.

Random Access Memory (RAM)

8 GB

64 GB (depending on the number of currently used workbooks and their average size)

Network

100 megabit per second (Mbps)

Multiple 1-gigabit-per-second (Gbps) network interface cards (NICs)

Disk

NTFS–formatted partition with a minimum of 80 GB of free space

NTFS–formatted partitions with more than 80 GB of free space on dedicated direct attached storage (DAS)-based or storage area network (SAN)-based drives.

Database Engine (for the SharePoint farm; not the PowerPivot app servers)

SQL Server 2005 Service Pack 2 (Cumulative Update 12) or Service Pack 3 (Cumulative Update 2) or SQL Server 2008 (Cumulative Update 4)

SQL Server 2008 R2

Operating System

Windows Server 2008 SP2 (64-bit) or Windows Server 2008 R2

Software

SQL Server 2008 R2 Enterprise Edition

SharePoint 2010, with Excel Services

SQL Server Native Client

Windows PowerShell™ V2

Microsoft .NET Framework 3.5 SP1

Microsoft "Geneva" Framework Runtime

Microsoft Sync Framework Runtime v1.0 (x64)

Microsoft Chart Controls for Microsoft .NET Framework 3.5

Microsoft SQL Server 2008 Analysis Services ADOMD.NET

Application Servers without PowerPivot for SharePoint

Microsoft OLE DB Provider for Analysis Services 10 (MSOLAP100.DLL)

SharePoint Central Administration servers without PowerPivot

Microsoft SQL Server 2008 R2 Analysis Services ADOMD.NET

Security Accounts

Separate Windows domain accounts for:

· Analysis Services instance

· PowerPivot service application

· Unattended PowerPivot data refresh

Note: The general recommendations are sufficient for only a relatively small number of users and workbooks in a workgroup or team environment with fewer than 100 concurrent PowerPivot connections. Large PowerPivot deployments benefit from higher memory capacities to load more data and extra processors to support long running scans of raw, unaggregated data. Note also that PowerPivot performance depends on the net available bandwidth between application and database servers. Consider a connection speed of 1 Gbps or faster and multiple NICs.

Database Storage Requirements

SharePoint stores all user data in content databases, including PowerPivot workbooks, which can be as large as the maximum upload size. In conjunction with the estimated number of PowerPivot workbooks in all site collections and sites, the maximum upload size provides a good indicator of database storage requirements. Microsoft recommends a maximum upload size of 2 GB for PowerPivot-enabled Web applications.

It is important to note that SharePoint versioning and recycle policies can increase storage requirements substantially. By default, SharePoint retains deleted workbooks in the recycle bin, which contributes to storage requirements. Furthermore, with versioning enabled, SharePoint creates a new item copy each time a user checks in a changed document or workbook. The system retains the old copy as part of the versioning history. Because of the increased demand for database storage, Microsoft does not recommend enabling versioning for Web applications that use PowerPivot for SharePoint unless there are compelling business reasons. Table 5 lists general factors that contribute to database storage requirements.

Table 5: PowerPivot Database Storage Considerations

Content Database for Site Collections

Content Database for SharePoint Central Administration

Content Database for PowerPivot Service Application

· PowerPivot workbooks and other documents up to the maximum upload size.

· Reporting Services reports that include charts, maps, and other visual elements.

· Versioning history, which retains previous versions of the same large files many times over.

· Recycle policies, which retain deleted items in the recycle bin.

· Customized versions of site pages.

· Excel worksheets and a PowerPivot data files used in the PowerPivot Management Dashboard.

· Internal reporting database that tracks usage data and data refresh history for 365 days, by default.

· Lists of cached or loaded PowerPivot data files.

· Data refresh schedules.

· PowerPivot usage data copied from the central usage data collection database.

Note:    SharePoint Server 2010 supports SQL Server 2008 R2 Remote BLOB Storage (RBS) to move the storage of binary large objects (BLOBs) from database servers to commodity storage solutions. In conjunction with PowerPivot workbooks, RBS can help to avoid wasting expensive database storage capacities. For details about using RBS in a SharePoint environment, refer to the Microsoft TechNet article “Install and Configure Remote BLOB Storage (SharePoint Server 2010) ” at https://go.microsoft.com/fwlink/?LinkId=181774.

Local Disk Space Requirements

On application servers, PowerPivot for SharePoint caches files to eliminate roundtrips to the content database. This increases performance because the Analysis Services instance can load files cached on disk immediately rather than retrieving the workbooks from the content database each time a load request is received. By default, PowerPivot caches workbooks in the %PROGRAMFILES%\Microsoft SQL Server\MSAS10_50.Powerpivot\OLAP\Backup folder on the local system drive.

Scale-out deployments based on multiple load-balanced servers can help to reduce the number of cached workbooks on each individual server because PowerPivot prefers to direct new requests to a server that already has the requested workbook loaded or cached, as mentioned earlier in this white paper. However, multiple application servers might cache the same file, depending on downtime and other reasons that prevent a server with a loaded or cached copy from handling the current request. Although it is not possible to predict or restrict caching servers on a per-workbook basis in a multi-server topology, the scale-out deployment can help to distribute cached files across a larger number of servers, thereby reducing individual local disk space requirements.

Note: The PowerPivot system service on each server performs a daily synchronization check to remove cached files if the original file in the content database is modified or deleted or if the cached files are older than five days. You can’t configure the retention time. PowerPivot for SharePoint caches workbooks on application servers for up to five days.