In June I introduced you to a new Best Practices Analyzer for SQL Server, SQL Server 2008 R2 BPA:
I’ve seen some mixed reaction to this tool. But when I verbally talk to some about the type of knowledge CSS has put into the rules, they see a greater value. Therefore, I thought I would put together a series of blog posts with some examples of this knowledge.
The first is something I call “Missing Updates”. While Microsoft Update does provide a great service to our customers, one thing it does not do is proactively provide advice to customers on updates that might affect their operation of our product based on the knowledge of CSS customer experiences.
Consider the following scenario. You upgrade to SQL Server 2008 and now a query that ran fine in SQL Server 2005 fails with the following error:
Msg 605, Level 21, State 3, Line 1
Attempt to fetch logical page (1:225) in database 2 failed. It belongs to allocation unit 281474980315136 not to 504403158513025024.
Yikes. Corruption in tempdb? You are not sure exactly the problem is and never run CHECKDB in tempdb (don’t blame you I wouldn’t either). You decide to go ahead and run CHECKDB in tempdb but alas no errors found.
You decide to do a Bing search for “605 and tempdb” and this article is one of the top hits:
This looks promising. You see this problem is fixed in Cumulative Update 3 for SQL Server 2008 RTM and realize you do not have this build installed. Great. You found the answer and didn’t have to ask anyone for help. So you download this build, setup a time to install it during the evening, and the next morning you will be the hero as no one will complain anymore about these new 605 errors.
But when you come in the next morning you get all types of complaints about “this new 605 error is still happening”. Sigh. You thought it was resolved and had an article that explains the problem but the fix from Microsoft didn’t work. You place a call into Microsoft CSS to help you with the problem. The first question you get after describing the situation is “Did you turn on the trace flag?”. What trace flag? I never saw anything about a trace flag? You go back to this article and read the fine print.
So you still may be a hero after all but that was not fun to have to call Microsoft to learn you need a trace flag to enable this fix. You find out the trace flag was needed because the fix involved the Query Optimizer (QP) and it has become a standard practice for SQL Server to enable QP fixes only by a trace flag.
While you now have the solution, it sure would have been nice if:
- I could have run something on my SQL Server 2008 server after upgrading to “let me know about important updates from CSS”
- I could have run something when the original fix didn’t seem to work to “check if I did it right”
If you look carefully down towards the end of this KB article, you will find the answer to these wishes:
The SQL Server 2008 R2 BPA tool has a rule to help detect if you are running a version of SQL Server where this fix is not applied. But the rule just doesn’t check the version. It also ensures the trace flag has been applied.
Let’s say that on this server you run SQL Server 2008 R2 BPA after installing CU3 for SQL Server 2008 RTM but had not applied the trace flag. You might find a Warning from the tool like this:
Notice the comments under the Issue:
[ TF 4135 and 4199 not enabled. We also detected occurrences of Event ID 605 or 824 which might be related to the problem…. ]
Now look at the Resolution:
Resolution: Enable trace flag 4135 or 4199
Trace flag 4199 is a more “general” trace flag to enable multiple QP fixes so either one will work (more on that in later).
So BPA is giving you explicit instructions to enable the trace flag. If you didn’t have the actual version applied it would also tell you that If you select “More Information” you will be directed to the article you first discovered on the web, 960770, which ties this all back together. As a bonus, it also looks for any occurrences of 605 or 824 (824 can also be a symptom) involving tempdb in the event log. This just gives you a stronger tie that you are actually hitting this problem.
Behind the scenes is a fairly complex rule for checks of versions and trace flags. Why? Because for example trace flag 4199 is only applicable in certain versions (this trace flag was not available until a specific build of SQL Server 2008 and SQL Server 2008 R2) and we have to account for the proper versions of this fix for both SQL Server 2008 and SQL Server 2008 R2.
How did CSS come up with this a recommended fix for BPA? We actually polled our engineers internally and looked at customer case experiences. We found out that some customers didn’t know about the fix and even for those who did they almost always forgot the trace flag.
This is one example of how BPA can be used to help you proactively address common problems that come up during the maintenance and operation of SQL Server 2008 and SQL Server 2008 R2.
Our next case involves the mystery of the missing installer cache file. Stay tuned.