Troubleshooting Subscriptions: Part II, Using the Reporting Services Trace Log File

Hey There – About a year ago I posted this blog about troubleshooting subscriptions. Since then I’ve received quite a few questions and come across more subscription problems in our production Reporting Services server that identified a few gaps in my previous posting, so I thought I would elaborate more on the subject here.

To recap a few key points from before:

When you create a subscription several things are added to the RS server:

- A row is placed in the Subscriptions table…

- A row is placed in the Schedule and ReportSchedule tables…

- A SQL Server Agent job is created…

When the subscription runs several things happen:

- The SQL Server Agent job fires and puts a row in the Event table…

- The RS server service has a limited number of threads (2 per CPU) that poll the Event table…

- When it finds an event, it puts a row in the Notifications table and starts processing

That blog content is a good start at isolating the issue to one of the key sources of problems, but since then I’ve found that many of the subscription problems I deal with fall into the “starts processing… ” point above. I often find my investigation of these problems go well beyond my “this can usually be tested by running the report manually” recommendation, so I’ll attempt to give this area better justice here by sharing some of my experience with the trace log file.

The Trace Log: More Than You Ever Needed or Wanted to Know About Your RS Server

The trace log file is created in a folder under your SQL Server installation path. In my machine this happens to be:

C:\Program Files\Microsoft SQL Server\MSRS10_50.MSSQLSERVER\Reporting Services\LogFiles

for a default install of SQL Server 2008 R2. Previous versions of SQL Server will be named differently, but your’s should be in a similar location. A quick search of the SQL Server books online will give you some configuration options that allow you to control how big the files get, how long they are kept, and the type of information and activities that are logged to the file and so on. For a detault RS installation, these files will be limited to a max size of 32mb and “rollover” to a new file if that size is exceeded, and old files will be deleted after 14 days. The file will also be closed and recreated each time the RS server is recycled, which by default will happen every day at midnight. The name of the trace log file will be similar to "ReportServerService__02_15_2010_12_56_37.log", and the “02_15_2010_12_56_37” portion will indicate the date/time the file was created in the “MM_DD_YYYY_HH_mm_SS” format. If you happen to be researching a problem that just ocurred, you can just view the newest file, otherwise you will have to find the file that contains the range of information including the entries you are looking for, based on the timestamps in the file names and the entries in the file.

A common response when people ask for help with a report problem is, “What did you see in the trace log file?”. However, it can be a bit daunting the first time you open the file, and it can take some practice to get good at reading it effectively. The first problem is the sheer bulk of content that is logged in the multi-mb file, and the next problem is the cryptic nature of the multi-mb content.

I’d like to help you make sense of this information to diagnose and fix subscription and other Reporting Services problems.

This file contains a multitude of information about practically every action, operation, process etc that occurs in the RS server. Some of them are for internal processing, and some are related to the processing of reports, subscriptions or other user initiated or defined operations. For example:

library!WindowsService_9!12c0!02/15/2010-02:00:16:: e ERROR: Throwing Microsoft.ReportingServices.Diagnostics.Utilities.LogonFailedException: Logon attempt for user 'username' failed., Microsoft.ReportingServices.Diagnostics.Utilities.LogonFailedException: Log on failed. Ensure the user name and password are correct. ---> System.ComponentModel.Win32Exception: Logon failure: unknown user name or bad password

There are several aspects of this entry that are worth noting. First of all, each "entry" is prefixed by a set of fields similar to:

processing!WindowsService_9!12c0!02/15/2010-02:00:16::

Which can be parsed down to:

"Processing": The library within the RS server

"WindowsService_9":The AppDomain within the library

"12c0": The thread within the Appdomain

"02/15/2010-02:00:16": The timestamp when the information was logged

Secondly, most entries contain the strings “I Info”, “w Warn”, or “e Error”; the “e Error” entries are usually the most interesting and valuable, yet the other entries around the error entries can also contain interesting information. The remaining parts of the entry line describe what occurred in the server.

And finally, as you will see below, there is sometimes a set of callstack frames following (what I call) the regular entry lines, and they typically have the form:

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)

at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)

These entries refer to the code inside the Reporting Services server and its referenced libraries. They will not typically be of much value to you in researching report or subscription errors, but if you do happen to experience other types of RS server problems (or yes, even server bugs we may have unfortunately missed in testing), including the callstack with an explanation of the problem will be very helpful for us or CSS to understand more about the problem.

Locating Subscription Entries in the Trace Log File

Step 1: Get the Schedule Id

A good general approach to researching problems in the trace log file is to determine when a problem occurred, find the entries in the log file around that time, and just start poking around. This can be enough if you have a quiet RS server with very little activity, but an active server can have hundreds of entries within a few minutes, so “poking around” may not be too successful.

The next step would be to get a more precise date/time value for when a problem occurred, this can be obtained from the TimeStart or TimeEnd fields in the ExecutionLog table, or the LastRunTime field of the Subscription table in the catalog database. But again, even with this information, there may still be hundreds of entries in the trace log file at key times (say, 8am) when there is a lot of other successful and unsuccessful subscription or report activity in the server at the same time as the failed subscription.

The key point in isolating subscriptions further goes back to what happens “When a subscription runs”. As I said before, when a subscription runs, the status in the subscription table is updated, but I should have added “and entries are placed in the trace log file with the schedule id.” In other words, the key link between a subscription execution and the entries in the trace log file is the schedule id for the subscription. The schedule id must be obtained from the Subscription table, it is not visible via the UI tools, and the query below can help, assuming you understand the legal stuff:

Disclaimer: we do not officially support querying all of the Reporting Services tables mentioned below, the columns and information in them may change in the future.

select

'SubnDesc' = s.Description,

'SubnOwner' = us.UserName,

'LastStatus' = s.LastStatus,

'LastRun' = s.LastRunTime,

'ReportPath' = c.Path,

'ReportModifiedBy' = uc.UserName,

'ScheduleId' = rs.ScheduleId, -- This is what you need

'SubscriptionId' = s.SubscriptionID

from ReportServer.dbo.Subscriptions s

join ReportServer.dbo.Catalog c on c.ItemID = s.Report_OID

join ReportServer.dbo.ReportSchedule rs on rs.SubscriptionID = s.SubscriptionID

join ReportServer.dbo.Users uc on uc.UserID = c.ModifiedByID

join ReportServer.dbo.Users us on us.UserID = s.OwnerId

Step 2: Find That Schedule ID in the Trace Log File

Once you have the schedule id for the subscription, you can look for the value in the trace file to find the entries for that subscription. The good news is that the schedule id is a guid; the only entries you find in the file with that string will be the entries for this subscription.

àSide Note: subscriptions can use either a custom schedule created for that subscription, or they can use a shared schedule. I only use custom schedules, and the examples here assume a custom schedule.

The not-so good news is that if the subscription fired several times within the timeframe of the contents in the trace log file, you will have to find the entry nearest to the one that failed based on the timestamp of the entry.

In my local server, I created a subscription and waited for it to fire. The rows in the Subscription table using the query above told me that my schedule id was “4459bc37-97d5-46c4-bd9d-d1e97e4e1cac”, and the first entry with this guid was:

library!WindowsService_0!1354!02/15/2010-14:51:03:: i INFO: Schedule 4459bc37-97d5-46c4-bd9d-d1e97e4e1cac executed at 02/15/2010 14:51:03.

àSide Note: I’m using the latest and greatest version of Reporting Services in SQL 2008 R2, and my dev partner who helps me with this stuff improved the log file content for subscriptions in this release. Your log file content may be somewhat different, but the approach is the same.

Step 3: Look at the other entries with the same thread in the same time proximity

Based on the entry above with the schedule id, you can see that the thread is “1354”, and searching on that thread, you will see another set of entries similar to:

schedule!WindowsService_0!1354!02/15/2010-14:51:03:: Creating Time based subscription notification for subscription: 41344f16-88e5-43b2-9e6b-1996e210eb6b

library!WindowsService_0!1354!02/15/2010-14:51:03:: i INFO: Schedule 4459bc37-97d5-46c4-bd9d-d1e97e4e1cac execution completed at 02/15/2010 14:51:03.

notification!WindowsService_0!1354!02/15/2010-14:51:03:: i INFO: Handling notification of subscription 41344f16-88e5-43b2-9e6b-1996e210eb6b to report Reportname via delivery provider Report Server FileShare

library!WindowsService_0!1354!02/15/2010-14:51:04:: i INFO: RenderForNewSession('/pathname/reportname’)

You may notice that the subscription id guid is also in the entries here, they can also be used, but in previous versions of RS we only logged the schedule id, so I'm focusing on it here instead.

In this example, the subscription was successful, yeah, big deal, but it is a good start in learning how to find the information when they are not successful. I also used a quiet local report server, so all my entries were literally next to each other in the file making them easy to identify. But again, in our production server, there are a lot of subscriptions firing at common times, so their entries may be spread more broadly throughout the file. Also, be careful to watch the timestamps of the entries for each thread, if the same thread happened to process another subscription later, you need to make sure you don’t associate your current subscription with the entries for the next.

Common Types of Subscription Failures

Now, lets have more fun and look at a couple common types of subscription errors and their trace log file entries.

Invalid logins

In today’s world of defense against hackers (those guys really take the fun out of developing softwareL) the sorry truth is that SQL passwords have to expire. This is the most common cause of recurring problems I see, and they often happen when people forget to update their passwords in all the necessary data sources. In my example here, I changed the password for the data source of the report used by my subscription to be invalid. The error displayed in the subscription table and in Report Manager was simply “An error has occurred during report processing.” which doesn’t tell much, but the entries recorded in the trace log file included a much more useful error:

processing!WindowsService_0!fd8!02/15/2010-20:52:06:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: , Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: Cannot impersonate user for data source 'datasourcename'. ---> Microsoft.ReportingServices.Diagnostics.Utilities.LogonFailedException: Log on failed. ---> System.ComponentModel.Win32Exception: Logon failure: unknown user name or bad password

Report Query Errors

Another common theme I see when users say “my subscription is not working” is, report queries that were successful when the report is created may not always stay that way; they may later return typical TSQL errors indicating deadlocks, query timeouts, divide by zero, invalid casting and so on. In my example, the report was based on a call to a stored procedure, so I changed the procedure name in the report dataset to an invalid name, and when the subscription fired Report Manager still displayed the error “An error has occurred during report processing.”, but the trace file displayed this entry:

subscription!WindowsService_0!fd8!02/15/2010-21:09:08:: i INFO: Microsoft.ReportingServices.Diagnostics.Utilities.RSException: An error has occurred during report processing. ---> Microsoft.ReportingServices.ReportProcessing.ProcessingAbortedException: An error has occurred during report processing. ---> Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: Query execution failed for dataset 'datasetname'. ---> System.Data.SqlClient.SqlException: Could not find stored procedure 'databasename.dbo.badprocname'.

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)

at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)

Data Driven Subscription Errors

Both of the examples above were the result of common report definition problems, and as I alluded to in my other posting, these types of problems could have been found by just running the report locally using a report designer where the errors are returned to the client. This is how I often research problems with the reports I own, because a) I don’t have to hassle with the trace log file, and b) I commonly have to fix the problem using the report designer anyways.

But another recent case I dealt with was related to a data driven subscription. The IT admin of the server knows just enough about Reporting Services to be dangerous, and their response to me (and a large alias of RS people who watch the server) was “Why is the subscription failing when the report runs fine manually? Subscriptions are not working, please fix!”. To make matters worse, the report and subscrition permssions were locked up tight and I couldn't view it in Report Manager. This is a case where you cannot just run the report to see the problem, you have to look in the trace log file. For this example I created a valid data driven subscription, and the entries in the log file were similar to:

library!WindowsService_0!fd8!02/15/2010-21:19:03:: i INFO: Schedule 2fa7246d-c112-496f-969b-680e279413d3 executed at 02/15/2010 21:19:03.

schedule!WindowsService_0!fd8!02/15/2010-21:19:03:: Creating Time based subscription notification for subscription: 3c963022-70b7-4172-926a-f9a045a2d38f

library!WindowsService_0!fd8!02/15/2010-21:19:03:: i INFO: Schedule 2fa7246d-c112-496f-969b-680e279413d3 execution completed at 02/15/2010 21:19:03.

As you may notice, the log file entries for the data driven subscription were in a somewhat different format from regular subscriptions.

To simulate the error, I dropped the table that was used in my data driven subscription query, and the trace log entry was:

library!WindowsService_0!18e4!02/15/2010-21:38:03:: e ERROR: Error processing data driven subscription: System.Data.SqlClient.SqlException: Invalid object name 'dbname.dbo.testtable'.

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)

at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)

Ok, the subscription status I could have queried for this particular error was also “Error: Invalid object name 'dbname.dbo.testtable'.”, but I’m on such a roll with the trace log file I thought I would include it anyways.

File Delivery Errors

As I was creating the subscriptions to demonstrate the log file information for this blog, I have been using file share delivery because my local box is not configured with SMTP to deliver emails, and in the process I mistakenly used the wrong password for the credentials to access the file share in my subscription. The error displayed in Report Manager was “Done: 1 processed of 1 total; 1 errors.”, but the entry in the trace log file was:

subscription!WindowsService_0!fd8!02/15/2010-21:34:10:: i INFO: Error writing file xx.xml to path \\myserver\myshare

Using the Trace Log File to Investigate Other Types of Errors

The errors mentioned above are just a sampling of the types of problems that can affect subscriptions, many are the same as or very similar to the types of errors that affect live report execution outside of subscriptions. I won’t take blog space to cover more here, but there are many other types of problems that can be diagnosed through the trace log file. They can also be investigated using many of the same techniques I mention here, the only difference is that with problems with subscriptions, you start by searching for the schedule id. But for reports, the entries are in a form similar to:

library!WindowsService_0!fd8!02/15/2010-21:34:03:: i INFO: RenderForNewSession('/pathname/reportname')

Problems with data sources are similar to:

processing!WindowsService_0!18e4!02/15/2010-20:50:05:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: , Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: Cannot impersonate user for data source 'datasourcename'. ---> Microsoft.ReportingServices.Diagnostics.Utilities.LogonFailedException: Log on failed. ---> System.ComponentModel.Win32Exception: Logon failure: unknown user name or bad password

The only difference is what you search on in the file.

And finally, many of the techniques I mention here refer to finding a starting location in the file and searching down for the entries with the same thread. Sometimes, the real problem happened before the entry you found, so if you don’t find what you need, you might want to try searching “up” through the previous entries for the thread.

Corellating the Trace Log Files With Other Data

I've mentioned the ExecutionLog table (and its related views) in my other blogs, and my esteemed peer Robert Bruckner covers it thoroughly in his blog here. This information is available in the report server catalog database and it also has a wealth of valuable information about the report executions and other activities in your report server. But when errors are shown there, the only information you have is the status of the report execution, ie "rsProcessingAborted", so you can use the steps above to take the report name and the execution time to search for the real problem in the trace log file.

The opposite is also true, if you happen to be strolling through the log file and come across error entries like those I listed above, you can also go to the ExecutionLog table to see:

- What was executing around the time that the trace log entry was recorded

- Who was executing it and what parameters they used

- Whether it was a subscription or an interactive (live) request

- Was it executed live from a data source or did it come from the cache

and so on. Note that subscription executions themselves are not in the ExecutionLog, only the reports that are run by the subscription are logged there. So if the subscription problem happens before the step where the report is executed, as in the data driven subscription example above, you won't find it in the ExecutionLog. The good news is that you can also cross reference the entries in the trace log file with the Subscriptions table based on the Schedule Id to learn a lot more about the subscription problem.

So Why Can't I QUERY the @@#$% Trace Log File???:(

If you've read my intro blog entry, I'm a database guy. The ever present thought in my mind when I'm searching and scrolling and searching and scrolling through log files is the same question, which I asked the first person who would listen to me about ten seconds after I first opened a trace log file. "Hey, we could write reports against it, make subscriptions on those reports for monitoring, aggregate data over time with cubes and dashboards and this and that and all kinds of other cool database stuff". So if you have these same thoughts, I've got your back.

When I calmed down and turned my hearing back on, the key fact is that we really have to weigh costs of this sort of thing. There is a huuuuge amount of data in the trace log files, and taking the time to log them in a SQL table during report server processing will cause a pretty big hit on the resources of the RS server and its catalog database, not to mention the costs of cleaning up old entries to prevent the catalog database files from filling up the disk and killing the whole server, and a few other issues we probably wouldn't realize and have to address until after it is all in place. We could trim the amount of data sent to the database table, but much of the content really is valuable, so what should we give up?

I did talk to an IS firend about using their features to load the file into the database, he said "Standard format? Check, no problem. Occasional callstacks?...", and that plan fell apart. Finally, I did create a TSQL parser and a snappy database design to process and store it for queries, and my more coding-oriented peers eventually shamed me into doing it with C#. Its pretty cool and useful, but not quite bulletproof enough for prime time yet. I may decide to post this solution at some point in the future, assuming I finish writing performance reviews in time and make space in my deliverables for the next cool release of RS.

Happy Reporting! I hope this helps, please let me know.

Thanks, Dean

As usual, "This blog is provided 'AS IS' with no warranties, and confers no rights."