Troubleshooting Memory Issues with Reporting Services

We had a case where Reporting Services was crashing with the following error.

Failed allocate pages: FAIL_PAGE_ALLOCATION 2

The number at the end can be different.  In the customer’s case it was a 1.  In my local repro it was a 2.  This is basically an indication that we are out of memory for the process and the process crashes at that point as we can’t allocate any more memory.  You won’t see an OutOfMemory error, as the allocations come from SQL Operating System (SOS) which hosts the .NET CLR.  So, it is SOS that is trying to do the allocation that fails.  SOS is also used with the SQL Engine, so you will see this error from that side as well when you are out of memory.

Before we get into the meat of it, here is a link to an article that goes through the memory thresholds for Reporting Services and explains how it will handle memory when those thresholds are hit.

Configure Available Memory for Report Server Applications
https://msdn.microsoft.com/en-us/library/ms159206.aspx

When Reporting Services starts, it will calculate what the maximum amount of memory will be for the process.  This will be done in one of two ways.

First, if WorkingSetMaximum is not set in the rsreportserver.config (which it isn’t there by default), then Reporting Services will derive the maximum memory setting based on your total physical memory.  To see this happen, we can look at the Reporting Services Trace Log and look for Derived memory.

rshost!rshost!19c4!01/29/2015-05:03:22:: i INFO: Derived memory configuration based on physical memory as 33486264 KB

servicecontroller!DefaultDomain!b14!01/29/2015-05:03:22:: i INFO: Total Physical memory: 34289934336

You may or may not see something like the following entry if you configure WorkingSetMaximum.

library!DefaultDomain!3278!12/02/2014-16:11:18:: i INFO: Initializing WorkingSetMaximum to '12486264' kilobytes as specified in Configuration file.

We also have the concept of MemorySafetyMargin and MemoryThreshold.  These are used to alert Reporting Services and to start to back off as we need to allocate more memory and we are already pretty full.  These are also configured within the rsreportserver.config.  The default values are 80% and 90% respectively of whatever our Maximum value is set to.

<MemorySafetyMargin>80</MemorySafetyMargin>
<MemoryThreshold>90</MemoryThreshold>

We can also validate these values within the Reporting Services Trace Log.

library!DefaultDomain!19c4!01/29/2015-05:03:19:: i INFO: Initializing MemorySafetyMargin to '80' percent as specified in Configuration file.
library!DefaultDomain!19c4!01/29/2015-05:03:19:: i INFO: Initializing MemoryThreshold to '90' percent as specified in Configuration file.

All of this amounts to when Reporting Services will start triggering memory pressure notifications.  These notifications can either be Low, Medium or High.  The link above has a great image that shows you when each one will trigger.

Configuration settings for memory state

NOTE: You will only see the NotifyMemoryPressure items in the log if you have the log set to Verbose. Specifically the applicationdomain category.

You also have the opportunity to define WorkingSetMaximum and WorkingSetMinimum in the rsreportserver.config if you know that you have other items running on this machine and you don’t want Reporting Services to starve other items.  Or at the minimum, for Reporting Services and the other services to play nice with each other.  This allows us to cap Reporting Services manually instead of going on the derived value.

Out of Memory Condition

Let’s go back to the FAIL_PAGE_ALLOCATION error that I mentioned at the beginning.  If you receive this, we ran out of memory and couldn't recover fast enough and then fail to allocate because nothing is available.

Without verbose logging, we can see the following type of behavior in the Reporting Services Log.

processing!ReportServer_0-2!199c!01/29/2015-05:28:37:: w WARN: Processing Scalability -- Memory Shrink Request Received
processing!ReportServer_0-2!199c!01/29/2015-05:28:38:: w WARN: Processing Scalability -- Memory Shrink Request Received
Failed allocate pages: FAIL_PAGE_ALLOCATION 1

Here we can see that there were Memory Shrink Requests.  This is an indication that we are starting to hit the ceiling and Reporting Services wants to back off to have more breathing room.  The allocation error caused the process to crash.  It would then restart on it’s own.  There is no other information that is logged.  Nothing in the Event Logs or from the Reporting Services Trace log.

Troubleshooting

rsreportserver.log

The first thing I tend to look at for this is what are the settings defined in the rsreportserver.config.  In the customer’s case, we see the following.

<MemorySafetyMargin>80</MemorySafetyMargin>
<MemoryThreshold>90</MemoryThreshold>
<WorkingSetMaximum>1000000</WorkingSetMaximum>
<WorkingSetMinimum>400000</WorkingSetMinimum>

This alone is a giant red flag.  WorkingSetMaximum looks really low.  I don’t necessarily care about WorkingSetMinimum.  And the defaults for MemorySafetyMargin and MemoryThreshold are fine.

This is the definition of WorkingSetMaximum from the MSDN article referenced at the top of this blog.

WorkingSetMaximum

Specifies a memory threshold after which no new memory allocations requests are granted to report server applications.

By default, the report server sets WorkingSetMaximum to the amount of available memory on the computer. This value is detected when the service starts.

This setting does not appear in the RSReportServer.config file unless you add it manually. If you want the report server to use less memory, you can modify the RSReportServer.config file and add the element and value. Valid values range from 0 to maximum integer. This value is expressed in kilobytes.

When the value for WorkingSetMaximum is reached, the report server does not accept new requests. Requests that are currently in progress are allowed to complete. New requests are accepted only when memory use falls below the value specified through WorkingSetMaximum.

If existing requests continue to consume additional memory after the WorkingSetMaximum value has been reached, all report server application domains will be recycled. For more information, see Application Domains for Report Server Applications .

If this value is specified in kilobytes (KB), then the WorkingSetMaximum defined above is only about 1GB of memory.  That definitely won’t be enough.  We won’t get very far with only 1GB of maximum memory.  From the Reporting Services Trace Log, we can see what the customer’s total physical memory was.

servicecontroller!DefaultDomain!9e80!01/26/2015-13:48:48:: i INFO: Total Physical memory: 137372422144 <--137GB!!!

I’m guessing that the thought was that WorkingSetMaximum was defined in megabytes (MB) instead of kilobytes (KB).  So, if we really wanted 100GB for the WorkingSetMaximum, we would need to add two more 0’s.

Verbose Logging

Verbose Logging can also help you understand the situation a little better, especially if the out of memory condition doesn’t happen right away.  From a memory perspective, I only really care about the appdomainmanager category.  We can set that up for verbose by setting the following within the ReportingServicesService.exe.config file.

<system.diagnostics>
<switches>
<add name="DefaultTraceSwitch" value="4" />
</switches>
</system.diagnostics>
<RStrace>
<add name="FileName" value="ReportServerService_" />
<add name="FileSizeLimitMb" value="32" />
<add name="KeepFilesForDays" value="14" />
<add name="Prefix" value="appdomain, tid, time" />
<add name="TraceListeners" value="file" />
<add name="TraceFileMode" value="unique" />
<add name="Components" value="all:3;appdomainmanager:4" />
</RStrace>

Here is what the output will look like.  I included some INFO messages to show the flow of what happened.

library!DefaultDomain!33e4!01/29/2015-05:44:31:: i INFO: Initializing MemorySafetyMargin to '80' percent as specified in Configuration file.
library!DefaultDomain!33e4!01/29/2015-05:44:31:: i INFO: Initializing MemoryThreshold to '90' percent as specified in Configuration file.
library!DefaultDomain!33e4!01/29/2015-05:44:31:: i INFO: Initializing WorkingSetMaximum to '1000000' kilobytes as specified in Configuration file.
library!DefaultDomain!33e4!01/29/2015-05:44:31:: i INFO: Initializing WorkingSetMinimum to '400000' kilobytes as specified in Configuration file.

servicecontroller!DefaultDomain!3034!01/29/2015-05:44:35:: i INFO: Total Physical memory: 34289934336

library!ReportServer_0-2!17a0!01/29/2015-05:45:46:: i INFO: RenderForNewSession('/MemoryPressure/MemoryHogContoso')
library!ReportServer_0-2!31f8!01/29/2015-05:45:46:: i INFO: RenderForNewSession('/MemoryPressure/MemoryHogContoso') <-- This is the report request received before we started seeing the shrink requests.

appdomainmanager!DefaultDomain!18e4!01/29/2015-05:45:50:: v VERBOSE: Received NotifyMemoryPressure(pressureLevel=MediumPressure, kBytesToFree=33960)
appdomainmanager!WindowsService_0!18e4!01/29/2015-05:45:50:: v VERBOSE: Memory Statistics: 0 items, 0KB Audited, 0KB Freeable, 924384KB Private Bytes
appdomainmanager!WindowsService_0!18e4!01/29/2015-05:45:50:: v VERBOSE: Spent 3ms enumerating MAP items and 2ms dispatching notifications.

appdomainmanager!DefaultDomain!18e4!01/29/2015-05:45:51:: v VERBOSE: Appdomain (ReportServer) attempted to free 23016 KB.

appdomainmanager!DefaultDomain!18e4!01/29/2015-05:45:52:: v VERBOSE: Received NotifyMemoryPressure(pressureLevel=HighPressure, kBytesToFree=121216)

appdomainmanager!WindowsService_0!18e4!01/29/2015-05:45:52:: v VERBOSE: Skipping shrink request for appdomain (WindowsService_0) because no memory consuming requests are registered.
appdomainmanager!ReportServer_0-2!18e4!01/29/2015-05:45:52:: v VERBOSE: Skipping shrink request for appdomain (ReportServer_MSSQLSERVER_0-2-130670054877986461) because not enough time has passed since last shrink request.

Failed allocate pages: FAIL_PAGE_ALLOCATION 1

Assuming the issue was not due to a low value setting on the WorkingSetMaximum value, I would look to see what report executed before we started seeing the shrink requests and look at how that may be consuming a lot of memory.  Depending on your server, it may be a result of several report requests working together and we would need to see what we can do to stagger them or pull back the amount of data they are consuming. 

If it is due to the number of users hitting the Report Server, you may want to look at going into a scale-out configuration to spread the load. Also, if you are running subscriptions, you could look at offloading those to a separate server from on demand report requests.

Performance Monitor

Performance Monitor (perfmon) can be used to see the consumption as well.  Unfortunately, it won’t really do a lot to help you pinpoint the problem.  It will just help you confirm that you do have a problem.

SNAGHTML601db6

SNAGHTML60c783

The counters I used were the following.

Process : Private Bytes (ReportingServicesService)
Process : Virtual Bytes (ReportingServicesService)
Process : Working Set (ReportingServicesService)
Process : Working Set – Private (ReportingServicesService)
ReportServer : Service : Memory Pressure State
ReportServer : Service : Memory Shrink Amount
ReportServer : Service : Memory Shrink Notifications/sec

Unfortunately, the ReportServer : Service counters did not appear to pick anything up.

To wrap up, there isn’t a magic bullet when it comes to memory issues. We need to investigate what the Report Server is doing and what is running. It could be because of a Report, a 3rd Part Extension, custom code segments.  Try to narrow it down to a specific report and go from there.  Also, make sure Reporting Services is playing nice with other services on the machine, if there are any.  You can use WorkingSetMaximum to do that.

 

Adam W. Saxton | Microsoft Business Intelligence Server Escalation Services
Twitter    YouTube