Customer experience with SAP BW FEMS-Pushdown

A few months ago we released a new SAP BW statement generator, which increases BW query performance for complex queries containing FEMS filters, see https://blogs.msdn.microsoft.com/saponsqlserver/2017/03/06/bw-queries-by-factors-faster-using-fems-pushdown. In the meanwhile, a few customers who tested the new feature, provided feedback to the "SAP on SQL Server development". Based on this feedback, we further improved the performance of FEMS queries and released the optimized code in SAP Note 2483734 (see below). However, in some cases the query performance was still not optimal, because of an unsuitable system configuration. The intention of this blog is give guidance and best practices, based on our customer experience.

Prerequisites

Customers typically do not want to apply and test new SAP code on the productive system. It is a good idea to use a virtual machine for the testing. However, for FEMS-Pushdown, you should keep in mind that you want to test performance, not simply functionality. Therefore, you should provide sufficient resources to the VM.

  • Hardware requirements
    For FEMS-Pushdown we consider 16 CPU threads for SQL Server as a minimum configuration. As a matter of course, SQL Server should also have access to sufficient RAM and a fast I/O system
  • SQL Server version
    We strongly recommend SQL Server 2016 (SP1 CU2 or newer) when using FEMS-Pushdown. SAP BW can force a parallel execution plan on SQL Server 2016 using an optimizer hint. Furthermore, SQL Server 2016 always uses batch-mode processing for the columnstore. See the following blog for details: https://blogs.msdn.microsoft.com/saponsqlserver/2017/05/18/parallel-processing-in-sap.
    Technically, FEMS-Pushdown also works with SQL Server 2014. In this case, you should set SQL Server startup parameter -T8649 for forcing a parallel execution plan. However, SQL Server 2014 may use row-mode processing under high work load, which decreases BW query performance.
  • Required BW code The SAP code for columnstore and BW queries is permanently being updated. We regularly publish known documentation and code issues in SAP Note 2116639 – SQL Server Columnstore documentation. The scope of this SAP Note has been extended to FEMS-Pushdown. Therefore, it contains a link to the newest code improvements in SAP Note 2483734 (see below).
    FEMS-Pushdown requires the Columnstore Optimized Flat Cube. You can create a Semantically Partitioned Cube (SPO) as a Flat Cube, but you cannot convert an existing SPO to a Flat Cube yet. The conversion report is still under development by SAP.

Best Practices

When running a BW query with FEMS-Pushdown, you can run into the same issues as with conventional BW queries: Lack of system resources, sub-optimal execution plans and poorly designed BW queries. Therefore, you should follow the following recommendations:

  • Update Statistics When loading data into a cube, you should update the database statistics and perform columnstore rowgroup compression within the BW process chain. This is described in https://blogs.msdn.microsoft.com/saponsqlserver/2016/11/14/simplified-and-faster-sap-bw-process-chains. However, when using the Flat Cube, SQL Server execution plans are much more robust, even with outdated database statistics.
  • Force parallel execution plans After applying the newest SQL Server 2016 Cumulative Update and the newest SAP BW code, all SQL Server queries created by the SAP BW statement generator will have a parallel execution plan. See SAP Note 2395652 - SQL Query Hints USE HINT for details.
  • Avoid Sematically Partitioned Cubes (SPOs)
    This recommendation is not specific for FEMS-Pushdown. It is related to all cubes using the columnstore. Existing SPOs work fine with the columnstore. However, we do not encourage our customers to create additional SPOs
    • A BW Multi-Provider is a logical cube consisting of many physical cubes. This concept is similar to a union-view on many tables in SQL Server. There are often organizational (structure of data load) or business reasons for using Multi-Providers. Therefore, customers often use Multi-Provider (with or without columnstore)
    • An SPO is a specific Multi-Provider, where all part-providers have the exactly same structure. It logically "partitions" a cube by time (or another characteristic). The idea is to speed-up BW query performance by dividing a cube into smaller chunks and running SQL queries on these chunks in parallel.
      However, when using the columnstore, one large cube results in better performance than many small ones. Selects on the columnstore use efficiently intra-query parallelism and can benefit from rowgroup elimination (similar to partition pruning). Also archiving is very fast on columnstore tables (however, archiving is not so important anymore, because columnstore data is stored very storage-efficient).

New improvements (with SAP Note 2483734)

Optimized BW code for FEMS-Pushdown has been released in SAP Note 2483734 - FEMS-Pushdown performance improvements for Microsoft SQL Server. The correction instructions of this SAP Note are available as of SAP BW 7.50 SP4. They are not available on SAP BW 7.51 or 7.52. On these SAP BW releases, you have to wait for the next SAP Support Package. The following improvements have been implemented

  • Columnstore-Pushdown The idea of FEMS-Pushdown is to shift the evaluation of SAP BW query filters from the SAP application server to the database server. Therefore, a SQL command is being created in the SQL statement generator for FEMS-Pushdown. The new version of this statement generator creates additional, redundant filters in the SQL Server statement. Therefore, these filters can be directly evaluated in the columnstore clustered index scan (before running the first level of aggregation). Hereby, the BW filters are further pushed-down inside the SQL Server statement execution.
  • Intra-Query parallelism
    BW queries with FEMS-Pushdown benefit much more from additional CPU threads than other BW queries. Furthermore, increasing the maximum intra-query parallelism on SQL Server 2016 does not have the negative side effect as on SQL Server 2014 (sporadic row-mode processing). With the new FEMS-Pushdown code, the maximum number of CPU threads for a FEMS query is calculated based on the complexity of the query. It can even exceed the value of the RSADMIN parameter MSS_MAXDOP_QUERY, but it will never be higher than the new parameter MSS_MAXDOP_FEMS. Hence, FEMS-Pushdown queries can use more SQL Server threads than normal BW queries. However, SQL Server can reduce the actual used CPU threads at query runtime, once there is a resource bottleneck. Only for SQL Server 2014, we recommend setting RSADMIN parameter MSS_MAXDOP_FEMS. There is no need for this on SQL Server 2016 or newer.
  • BW hierarchy improvements We implemented some additional improvements, for example for BW hierarchy filters. Keep in mind, that we did not use any of the new improvements of the new FEMS-Statement generator when measuring the performance in https://blogs.msdn.microsoft.com/saponsqlserver/2017/05/08/performance-evolution-of-sap-bw-on-sql-server.

Analyzing FEMS Queries

FEMS-Pushdown cannot be used for all FEMS queries. For example, inventory queries cannot use FEMS-Pushdown yet. There are several tools, where you can check the FEMS-Pushdown usage in SAP BW:

  • SQL statement in RSRTYou can easily verify, that FEMS-Pushdown is actually used, by looking at the SQL query in SAP transaction RSRT. The query contains a common table expression (CTE) starting with "WITH [T_FEMS] AS".
  • Statistics Data in RSRT For a FEMS-Pushdown, the aggregate name <cube>$F is displayed in the Aggregation Layer tab of the RSRT statistics data
  • Event IDs in Statistics Data The idea of FEMS-Pushdown is reducing the runtime on the SAP application server. Particularly, the runtime of OLAP event ID 3110 should be significantly reduced. However, seeing a long runtime for event ID 3110 does not necessarily mean, that FEMS-Pushdown was not used. When using BW Exceptional Aggregation, additional time is spent for event ID 3110 and 3200.
  • ST03 Statistics The best way for monitoring BW query runtime is the BI Workload Monitor in SAP transaction ST03. Here you can see the runtime of BW queries by day, cube and query. Furthermore, you can see where the time was spent: "DB Time" is the time consumed by SQL Server and "OLAP Time" is consumed by the SAP application server. You can reset the statistics (on your test system) by running report RSDDSTAT_DATA_DELETE. Take care, this permanently deletes the ST03 statistics, also for other SAP users.
    Be aware, that the SQL Statement statistics in SAP transaction DBACOCKPIT can be misleading, particularly for SAP BW queries. SAP BW always opens a database cursor for running a BW query. Processing in the BW OLAP engine is always performed in packages between database fetches. SQL Server is measuring the runtime of a SQL query as the time between the OPEN and the last FETCH. Therefore, the SQL query runtime in DBACOCKPIT contains the processing time on the application server! However, in SAP transaction ST03 (or RSRT), the processing time on the application server is correctly not included in the "DB Time" (or "Data Manager" time).

Conclusion

For best BW query performance, we recommend using SQL Server 2016 and the newest SAP BW code of SAP Note 2483734. SAP BW FEMS-Pushdown requires using the Flat Cube. More and more customers start using the Flat Cube actually because of the FEMS-Pushdown. We got feedback from many customers, the merely the Flat Cube (even without FEMS-Pushdown) running on a modern hardware results in similar performance as they observed on their BW Accelerator. Using the FEMS-Pushdown can reduce peaks in query runtime caused by the most complex BW queries.