Changes introduced in CU#7 may impact Analysis Services backup and sync operations

Cumulative Update #7 for SQL Server includes a fix for Analysis Services backups to correct a problem with truncation of metadata files as described in KB Article 950968. The fix involves changes to the locking levels used for backups, thereby avoiding the truncation issue which could have been encountered previously. Unfortunately, the increased locking levels come at a price.

Prior to this fix, a running backup was not treated as a blocking event for a processing commit, and thus would be allowed to run to completion, even if it meant that a processing operation could not commit for an extended period of time. After the fix a backup is treated as a blocker for a commit operation. Due to the way that lock compatibility operates in Analysis Services, once the commit operation is blocked, all incoming requests against objects in the database are also blocked. Analysis Services handles blocked commit requests via the ForceCommitTimeout server setting. As a result after CU #7 is applied a backup operation will be cancelled if a commit of a processing operation is blocked for more than 30 seconds (default ForceCommitTimeout value) by the backup.

If the backup is cancelled the following error message will be returned:

Server: The operation has been cancelled.

  (Microsoft.AnalysisServices)

 

For common situations this scenario can be avoided by scheduling database backups at a time that will not overlap with processing of objects in the database. However, for some scenarios, such as proactive caching enabled partitions that process frequently, the frequent proactive caching processing operations will prevent a backup from ever completing successfully. In this type of scenario you have several options available for generating a successful backup.

The first two options involve adjustments to the ForceCommitTimeout and/or CommitTimeout settings on the server.

One option is to increase the ForceCommitTimeout value so that it is long enough to allow the backup to complete. However, it is important to note that the blocked commit will also block any other operations on the database being backed up. This essentially means that once a processing operation attempts to commit on an object in the database being backed up, the database will be unavailable for the remaining duration of the backup.

A second option is to set the CommitTimeout to a value other than the server default of 0 (infinite), while also increasing the ForceCommitTimeout such that it is greater than the CommitTimeout. With this approach you are essentially setting the maximum amount of time that you are willing to wait for a backup or query to complete before cancelling the commit (CommitTimeout) for a process operation and allowing the backup or long running query to continue uninterrupted. If you choose this approach, the commit of the processing operation will wait for up to the amount of time specified in CommitTimeout to elapse, possibly blocking access to the database for the specified time, and then rollback the processing operation if it was unable to commit within the allotted time. Proactive Caching can automatically handle these processing failures and will reattempt processing at the next specified interval. If you have other processing operations which may be impacted by this approach you will need to make sure that they have sufficiently robust retry logic to handle failures caused by long running queries.

Changes to these settings do not require server restarts to take place, so they can be changed prior to the backup and then reset back to the default values after the backup completes.

Another option is to temporarily disable proactive caching for the proactive caching enabled partitions in your database, making the backup, and then re-enabling proactive caching. Unfortunately, there is no built in AMO command to disable proactive caching for an object. To disable proactive caching you can either issue alter XMLA commands for the partitions to change the proactive caching settings, or alter the datasource for the proactive caching enabled partitions so it points to a non-existent RDBMS server. The drawback to this approach is that the backup will contain objects which have been altered to settings other than those desired for actual production operations.

There may be other options, but these are the ones tested and discovered so far.

The change in locking levels and behavioral changes described above also apply to sync operations. In the case of sync operations the same potential remedies apply.

 

Wayne Robertson

SQL Server Escalation Engineer