I was on-site with a customer last week performing a Team Foundation Server Health Check (TFSHC). While I was there, I noticed that their SQL Server Analysis Services 2008 R2 SP1 instance had been crashing. So, I did what any good PFE would do: I grabbed a copy of the crash dumps (SQLDmpr0001.mdmp) and analysed them. You can find these in the \OLAP\Log folder under your instance directory, which is normally something like C:\Program Files\Microsoft SQL Server\MSSQL.2\OLAP\Log.
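If you want to check quickly whether an instance has left any crash dumps behind, a small script can list them for you. The following is a minimal sketch in Python, not a supported tool. The function name `find_crash_dumps` is hypothetical, and it assumes the dumps carry the .mdmp extension, as in the example above.

```python
from datetime import datetime
from pathlib import Path


def find_crash_dumps(log_dir):
    """Return (path, modified_time) pairs for .mdmp files in log_dir, newest first.

    Hypothetical helper: it only looks at the file extension and does not
    inspect the contents of the dump.
    """
    folder = Path(log_dir)
    if not folder.is_dir():
        return []  # Folder is missing or is not a directory: nothing to report.

    dumps = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".mdmp"
    ]
    # Newest dump first, so the most recent crash is at the top of the list.
    dumps.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p, datetime.fromtimestamp(p.stat().st_mtime)) for p in dumps]
```

To use it, call `find_crash_dumps` with the OLAP\Log path for your own instance (pass it as a raw string so the backslashes are preserved) and print each pair. An empty list means either there are no dumps or the folder was not found.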
Here’s the stack trace for the crash:
This stack trace was enough to help me find the bug in the internal Microsoft support systems.
The root cause of this bug is a version mismatch: the old cube object is calling GetInfo based on new info. It was at this point that I realised I’d seen this same issue before. The SSAS bug was raised by a member of the TFS test team after they found the issue on the internal dogfood servers.
Now, before we move on – let’s get one thing clear: SQL, Analysis Services and TFS should never crash – any time they do, that’s a bug that needs to be addressed.
Although this issue can occur with any version of TFS (2005-2012) and SQL2008 or SQL2008R2, this particular bug is the reason why the TFS Installation Guide for TFS2012 has steps to “Configure Analysis Services to Recover on Failure”.
Now that all those versions of SQL Analysis Services have been patched, there’s no reason to change the SSAS service to automatically restart on failure. In fact, if you do restart automatically on failure AND you don’t have adequate monitoring in place, you will mask subsequent failures.
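If you are not sure whether a server has already been configured to restart SSAS on failure, you can inspect the service recovery settings with `sc qfailure`. Here is a rough sketch in Python that runs that command and picks out any restart actions. The helper names are hypothetical, the parsing assumes the English-language output layout of `sc qfailure`, and the service name in the usage note assumes a default instance.

```python
import subprocess


def restart_actions(qfailure_output):
    """Return the failure-action lines that start with RESTART.

    Assumes the text layout printed by `sc qfailure`, where the first action
    follows the 'FAILURE_ACTIONS' label and later actions are on their own lines.
    """
    actions = []
    in_actions = False
    for line in qfailure_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("FAILURE_ACTIONS"):
            in_actions = True
            # Keep only the text after the label's colon, if there is any.
            stripped = stripped.split(":", 1)[-1].strip()
        if in_actions and stripped.upper().startswith("RESTART"):
            actions.append(stripped)
    return actions


def query_failure_actions(service_name):
    """Run `sc qfailure <service_name>` (Windows only) and return its output."""
    result = subprocess.run(
        ["sc", "qfailure", service_name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a default instance, something like `restart_actions(query_failure_actions("MSSQLServerOLAPService"))` should return an empty list when no automatic restart is configured. Named instances use a different service name, so check the Services console for yours.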
The conclusion here is very simple. Keep your systems patched and up to date, and you will avoid many of the issues which have already been found and fixed.