ExecutionLog2 View - Analyzing and Optimizing Reports

There are several options for monitoring performance of a report server.  The fourth article in our technical note series on performance and scalability covers information for optimizing entire report servers.  Besides performance counters, analyzing report execution log data in particular can help answer questions such as, which reports might be good candidates for caching, how many reports were returned from cache vs. live execution vs. execution snapshot, what was the most popular report for the week, and what are the poor performing reports that need to be taken action upon?

Before you can optimize particular reports or your entire system, you need metrics and understand what they tell you.  In this posting, I want to focus on how to effectively interpret and utilize the data present in the new ExecutionLog2 view in the Reporting Services 2008 catalog database.  In summary, I am covering the following topics:

  • Description of ExecutionLog2 columns, with tips on how to interpret values
  • ExecutionLog2.AdditionalInfo and some interesting pieces of information it provides
  • Tips for analyzing ExecutionLog2 information
  • Tips for optimizing reports

Note that the ExecutionLog2 view defines appropriate (nolock) hints to minimize the impact when querying this view directly.  Let's start with a simple query against the report server's database:

use

ReportServer
select * from ExecutionLog2 order by TimeStart DESC

Description of ExecutionLog2 Columns

  • InstanceName
    The RS instance which processed the request.  In a scale-out configuration, this value specifies which node/instance of the RS web farm executed a particular request.  The InstanceName distribution may help determine whether your network-load balancer distributes requests across a RS scale-out configuration as expected.
     
  • ReportPath
    Specifies the full path and name of the report executed.  If the ReportPath is null, it is an ad-hoc execution (Source=Adhoc) meaning that it is either a dynamically generated report model based drillthrough report, or a Report Builder 2.0 report that is previewed on a client utilizing the report server for processing and rendering.
     
  • UserName
    Specifies the user who initiated the request.  For subscription executions, this is the RS service account.
     
  • ExecutionId
    The id associated with the request. 
     
  • RequestType
    Specifies the type of the request: { Interactive, Subscription }
    Analyzing execution log data filtered by RequestType=Subscription and sorted by TimeStart may reveal potential subscription "hotspots" if many subscriptions are scheduled for a similar time window (e.g. Monday morning around 7 AM) or overlap with high interactive load time windows.  In those cases, you might want to consider adjusting subscription schedules to stagger those executions and/or move them to low load time windows.
     
  • Format
    The rendering format for a particular request.  If null, then the request is an interactivity event (such as toggle, drillthrough, interactive sort) and not a rendering request.  Typical values for rendering formats in RS 2008 include RPL, HTML4.0, Excel, Word, PDF.
    Note that RPL (Report Page Layout Format) in RS 2008 represents interactive viewing of a report in HTML (report server / report manager) or Preview in Report Builder 2.0 when connected to a report server.  RPL is an internal, binary format utilized by the RS 2008 server when communicating with the new viewer controls, to offload some of the rendering work from the server to the client viewer control.
    If you see HTML4.0 as rendering format in RS 2008, it means that the report was not requested with the new viewer controls, but likely with the VS 2005 or VS 2008 report viewer controls in server mode that don't support client side rendering yet.  This capability will only be available with the updated report viewer controls.
     
  • Parameters
    The report parameter values associated with the request.
     
  • ReportAction
    The action performed on the report.  It is one of the following values: { Render, BookmarkNavigation, DocumentMapNavigation, DrillThrough, FindString, GetDocumentMap, Toggle, Sort }
    Note that this detailed report action information is new in RS 2008 with the ExecutionLog2.  It provides information about what interactivity events where utilized in a report session and the corresponding event values (shown as part of the AdditionalInfo column).
     
  • TimeStart
    Time the request began.
     
  • TimeEnd
    Time the request finished.
    Note that (TimeEnd - TimeStart) provides the end-to-end duration of the request.
     
  • Important Notes on TimeDataRetrieval, TimeProcessing, TimeRendering
    A value of -1 for any of the timing related columns indicates that the report server detected that the timer was unreliable.  This can happen on particular multi-proc/core hardware, particularly X64, when the CPU clocks get out of sync.
    Execution log entries are created only for the initial request to the server.  If you want to get separate execution log entries for every page rendering request sent to the server while interactively navigating individual pages of a particular report viewed in report manager or previewed in Report Builder 2.0, remove the LogAllExecutionRequests="false" setting from the RPL renderer registration in the RSReportServer.config file.
     
  • TimeDataRetrieval
    The number of milliseconds spent interacting with data sources and data extensions for all data sets in the main report and all of its subreports. This value includes:
    - Time spent opening connections to the data source
    - Time spent reading data rows from the data extension
    Note: If a report has multiple data sources/data sets that can be executed in parallel, TimeDataRetrieval contains the duration of the longest DataSet, not the sum of all DataSets durations.  If DataSets are executed sequentially, TimeDataRetrieval contains the sum of all DataSet durations.
     
  • TimeProcessing
    The number of milliseconds spent in the processing engine for the request.  This value includes:
    - Report processing bootstrap time
    - Tablix processing time (e.g. grouping, sorting, filtering, aggregations, subreport processing), but excludes on-demand expression evaluations (e.g. TextBox.Value, Style.*)
    - ProcessingScalabilityTime (see AdditionalInfo column)
     
  • TimeRendering
    The number of milliseconds spent after the Rendering Object Model is exposed to the rendering extension.  This value includes:
    - Time spent in renderer
    - Time spent in pagination modules
    - Time spent in on-demand expression evaluations (e.g. TextBox.Value, Style.*). This is different from prior releases, where TimeProcessing included all expression evaluation. 
    - PaginationScalabilityTime (see AdditionalInfo column)
    - RenderingScalabilityTime (see AdditionalInfo column)
      
  • Source
    Specifies the type of the execution.  It is one of the following values: { Live, Cache, Snapshot, History, AdHoc, Session, Rdce }
    Source=Cache indicates a cached execution, i.e. dataset queries are not executed live.
    Source=AdHoc indicates either a dynamically generated report model based drillthrough report, or a Report Builder 2.0 report that is previewed on a client utilizing the report server for processing and rendering.
    Source=Session indicates a subsequent request within an already established session (e.g. the initial request is to view page 1, and the subsequent request is to export to Excel with the current session state). 
    Source=Rdce indicates a Report Definition Customization Extension (RDCE; a new feature in RS 2008) that can dynamically customize a report definition before it is passed to the processing engine upon report execution.  
      
  • Status
    rsSuccess if the execution succeeded. 
    If multiple errors occurred, only the first critical error code is recorded.
     
  • ByteCount
    Number of bytes generated as output streams for this request.
     
  • RowCount
    Number of data set rows processed in this request.
     
  • AdditionalInfo
    In short, an XML property bag containing additional information about the execution. 
     

ExecutionLog2.AdditionalInfo

This is where it gets really interesting.  A typical value for the AdditionalInfo column may look like this:

<AdditionalInfo>
<ProcessingEngine>2</ProcessingEngine>
<ScalabilityTime>
<Pagination>0</Pagination>
<Processing>0</Processing>
</ScalabilityTime>
<EstimatedMemoryUsageKB>
<Pagination>0</Pagination>
<Processing>115</Processing>
</EstimatedMemoryUsageKB>
<DataExtension>
<SQL>1</SQL>
</DataExtension>
</AdditionalInfo>

Most frequent / interesting AdditionalInfo entries:  

  • ProcessingEngine
    Identifies the version of the processing engine utilized for the report. 
    1 = Yukon Engine
    2 = On-Demand Processing Engine
    After a deployment is upgraded to RS 2008, reports will internally auto-upgrade to the latest RDL 2008 schema so that the on-demand processing engine can be used for processing.  However, if the auto-upgrade is not possible (e.g. reports that were built with 2005-based CustomReportItem controls), those particular reports will be processed with the old Yukon engine in a transparent backwards-compatibility mode.
     
  • ScalabilityTime
    A value of 0 indicates that no additional time was spent doing scalability operations and means the request was not under memory pressure.  If the value is > 0, it represents the cumulative number of milliseconds spent performing scalability specific operations in a particular report server component to react to memory pressure during the request. 
    A scalability time value should be set in perspective with the total execution time (TimeEnd - TimeStart) of the request.  If a report frequently shows high relative scalability time values, it is a great starting point for optimizing that particular report to reduce memory pressure and optimize the overall system as well.
     
  • EstimatedMemoryUsageKB
    In short, this indicates resource utilization of a report.
    A value of 0 indicates memory usage too low to be tracked specifically to minimize the overhead of this algorithm.  If the value is > 0, it represents an estimate of the peak amount of memory, in kilobytes, consumed by a particular component during a request.  These memory estimates only include objects allocated via the report server's scalability infrastructure, which is mainly utilized by data size dependent operations such as grouping, sorting, filtering, aggregations, page layout, and rendering operations.  Generally, reports with estimated peak memory usage above 10000 (i.e. above 10 MB) for scalable data structures may deserve a closer look from a report design point of view.
     
  • DataExtension
    Identifies which types of data extensions / data sources are utilized by the report.  For report models, this is prefixed with "Model:". The numeric value of the element is a counter to determine how many times that particular type of data source is queried.
     

Tips for Analyzing ExecutionLog2 Information

  • Long running reports
    Which reports have the highest (TimeEnd - TimeStart) values? 
    If the TimeDataRetrieval is high, the data source could be a bottleneck, or the queries retrieve lots of data.  If the there is a high RowCount, lots of data is being retrieved - you might want to review the dataset queries.  If high volumes of data are grouped, sorted, and aggregated, high EstimatedMemoryUsageKB values for Processing are very likely. 
     
  • Live Data or Snapshots
    Sorting by the Source field reveals the types of load on the server.  If reports don't need to always run with the latest up-to-the-millisecond data (e.g. parameterized reports that report from constant past data, or reports that run from data sources that are only refreshed with delays anyway), consider creating history snapshots, configuring execution snapshots, and/or setup caching. Thereby repeated query executions for the same data are avoided, including certain aspects of report processing.
     
  • Discover report patterns
    Sorting by ReportPath and TimeStart may reveal interesting report execution patterns – for example, an expensive report that takes 5 minutes to run is executed every 10 minutes.  
     
  • Most popular reports
    Grouping by ReportPath and counting report executions shows the most popular reports.  Analyze the results by Source type, and focus on live executions first.  Review and optimize the ones with the highest resource utilization (RowCount, TimeDataRetrieval, TimeProcessing, TimeRendering, EstimatedMemoryUsageKB, ScalabilityTime).
     
  • Users utilizing the report server the most, and their favorite reports
    Group the data by user and look for total report executions, live executions, reports with high resource utilization.
     
  • High resource utilization
    Review reports with the highest resource utilization, particularly those with EstimatedMemoryUsageKB > 50000, ScalabilityTime > 10000.  Frequent high values for ScalabilityTime might also point to general high load on the report server, which causes memory pressure for large requests.  You might want to review the memory configuration of the report server as well.
     
  • Number of bytes/rows returned
    Are there reports that return 100,000 and more rows (RowCount), but are only ever viewed interactively (Format = RPL, HTML4.0), and never exported to other formats?  If yes, do these reports aggregate data, or just provide a list of several thousand pages of detail data and users only look at the first few pages?
     
  • Large report outputs
    Are there reports with very large outputs (ByteCount)?  You might want to investigate e.g. reports with >100 MB PDF output.
     
  • Report Health
    Sort by Status and look for report executions with Status != rsSuccess, which might indicate missing subreports, expired stored data source credentials, subscription delivery errors, invalid report definitions, etc.
     

Quick Tips for Optimizing Reports

Once you identified great candidates for further individual report design review and optimization, I recommend to utilize EstimatedMemoryUsageKB, ScalabilityTime, TimeDataRetrieval, TimeProcessing, TimeRendering, and ByteCount for gauging the impact a particular optimization has made.  In order to reduce noise and improve repeatability of measurements, you should use a separate report server with minimal other user load. 

My specific report optimization tips of the day:

  • Subreports
    One of the aspects I usually investigate when optimizing reports is the usage of subreports.  Subreports provide a convenient way to reuse reports, but it may make sense to refactor the report design to lift the subreports contents into the main report.  Not only does this frequently result in improved performance, it can also provide better control over rendering behavior.
     
  • Splitting large reports into parameterized drillthrough reports
    When reports use large datasets, I have sometimes seen the usage of drilldown group hierarchies (toggles) to provide an initial overview to the report consumer, and enable dynamic drilldown to interesting areas. 
    What may work really well for those kind of reports is to change the original report to only include data till e.g. the third level of drilldown (and thereby reduce the amount of data retrieved and processed by the initial report, and improve response time). 
    Then at the innermost drilldown level of the main report, create drillthrough actions with parameters based on the current group values to another report.  The drillthrough target report provides further details and drilldown capabilities, but only has to retrieve a small subset of the original data.