SharePoint Foundation 2010 and SharePoint Server 2010 introduced a pretty cool new feature: built-in health rules. These rules monitor for commonly known issues and alert you if your SharePoint farm is experiencing issues that either are causing problems now or may cause problems in the near future. For more information on these health rules, please see the following links:
I received an email from a customer today asking about the SharePoint Server 2010 health rule titled "Drives are running out of free space". They were working on a build document and had never thought about disk sizing for their application servers and web front ends (WFEs). This health rule was telling them that some of their servers were running low on free space.
To see the alerts that have fired, navigate to your Central Administration site, click on Monitoring, and then click 'Review problems and solutions'. (Alternatively, from the main Central Admin page you can click on the "View these issues" hyperlink.)
Either route takes you to the issue report page. From here you can see a list of all of the health rules that have fired and are showing alerts. In my lab server I have forced the disk space rule to fire by filling up my primary drive*.
Clicking on the rule in question will take you to the details of the alert.
There is a lot of information in the detail report above. A couple of things to point out are the Severity and the Explanation. The Severity marks this alert as an Error. Some health rules have different levels of reporting; the DiskSpace rule currently has two levels, Warning and Error. The Explanation field includes information such as the server and the specific logical disk that is being reported as a violation.
It's important to understand exactly what this rule is looking for in order to make the most of the information presented. The two severities are defined as follows:
- Error - The logical disk has less than two times the amount of total physical RAM on the server
Text of Error:
Available drive space is less than twice the value of physical memory. This is dangerous because it does not provide enough room for a full memory dump with continued operation. It also could cause problems with the Virtual Memory swap file:
- Warning - The logical disk has between two and five times the amount of total physical RAM on the server
Text of Warning:
Available drive space is less than five times the value of physical memory. This is dangerous because it does not provide enough room for a full memory dump with continued operation. It also could cause problems with the Virtual Memory swap file:
The other thing to understand about this rule is that it runs its analysis on EVERY logical disk on the server that is online and defined as a fixed disk (no network drives). That means, as was the case for this customer, that even data drives such as a tools drive are checked and reported if they do not have enough free space. It also means that on a server with 64GB of memory, this rule will fire a warning whenever any disk on the server has less than 320GB of free space (5 x 64GB)!
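To make the two thresholds concrete, here is a minimal Python sketch of the comparison as described above. It is not the rule's actual implementation; the function name `classify_free_space` and the strict "less than" comparisons are assumptions based on the wording of the Error and Warning text.

```python
from typing import Optional

GB = 1024 ** 3  # bytes in one gigabyte (binary)


def classify_free_space(free_bytes: int, ram_bytes: int) -> Optional[str]:
    """Return 'Error', 'Warning', or None using the thresholds described in the post.

    Assumption: both limits are strict "less than" comparisons,
    following the wording of the alert text.
    """
    if free_bytes < 2 * ram_bytes:
        return "Error"
    if free_bytes < 5 * ram_bytes:
        return "Warning"
    return None


if __name__ == "__main__":
    ram = 64 * GB  # the 64GB server from the example above
    for free_gb in (100, 200, 400):
        print(f"{free_gb}GB free -> {classify_free_space(free_gb * GB, ram)}")
```

With 64GB of RAM the Error limit works out to 128GB and the Warning limit to 320GB, so the three sample disks come back as Error, Warning, and no alert respectively.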
The recommendations that I provided to this customer were as follows:
- Understand what the rule uses as criteria and ignore the alerts that are unimportant. I don't necessarily like this option as that trains administrators to ignore alerts.
- Disable this alert and create a custom health rule that analyzes the desired disks (maybe a configurable option?)
- Disable this alert and use SCOM (Microsoft System Center Operations Manager) to monitor the desired disks.
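The idea behind the second recommendation, checking only the disks you care about, can be sketched in a few lines. This is a rough Python illustration of the logic only, not a SharePoint health rule (a real custom rule would be written in .NET against the SharePoint health analysis API). The names `check_selected_drives`, `classify`, and the `get_free` hook are hypothetical, and the amount of RAM is passed in rather than detected.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional

GB = 1024 ** 3


def classify(free_bytes: int, ram_bytes: int) -> Optional[str]:
    """Same 2x / 5x RAM thresholds the built-in rule is described as using."""
    if free_bytes < 2 * ram_bytes:
        return "Error"
    if free_bytes < 5 * ram_bytes:
        return "Warning"
    return None


def check_selected_drives(
    drives: Iterable[str],
    ram_bytes: int,
    get_free: Optional[Callable[[str], int]] = None,
) -> Dict[str, str]:
    """Check only the configured drives and return {drive: severity} for violations."""
    if get_free is None:
        # Default: ask the operating system for free bytes on each path.
        get_free = lambda path: shutil.disk_usage(path).free
    findings = {}
    for drive in drives:
        severity = classify(get_free(drive), ram_bytes)
        if severity is not None:
            findings[drive] = severity
    return findings


if __name__ == "__main__":
    # Made-up free-space figures so the example runs anywhere.
    sample_free = {"C:\\": 40 * GB, "D:\\": 500 * GB, "T:\\": 10 * GB}
    # Only C: and D: are configured; the small tools drive T: is left out on purpose.
    print(check_selected_drives(["C:\\", "D:\\"], 16 * GB, get_free=sample_free.__getitem__))
```

Leaving `get_free` unset makes the function query the real disks through `shutil.disk_usage`. The list of drives is the "configurable option" mentioned above: anything not in the list is simply never evaluated, so it cannot raise an alert.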
Creating custom health rules -
* I used the built-in tool fsutil.exe to create a large file that filled my disk so that the health rule would fire. http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/fsutil.mspx?mfr=true
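If you want to reproduce this in a lab, the only arithmetic involved is working out how large the filler file needs to be. The Python sketch below is one way to do that; the helper names, the 1GB margin, and the file path are assumptions for illustration and are not the values I used. It builds an `fsutil file createnew` command line but does not run it.

```python
GB = 1024 ** 3


def filler_size_bytes(free_bytes: int, ram_bytes: int, margin_bytes: int = GB) -> int:
    """Bytes to consume so free space ends up margin_bytes below the 2x RAM Error limit."""
    target_free = max(0, 2 * ram_bytes - margin_bytes)
    return max(0, free_bytes - target_free)


def fsutil_command(path: str, size_bytes: int) -> str:
    """Build (not run) the fsutil command that creates a file of the given length."""
    return f'fsutil file createnew "{path}" {size_bytes}'


if __name__ == "__main__":
    # Hypothetical lab server: 100GB free on the drive, 8GB of RAM.
    size = filler_size_bytes(free_bytes=100 * GB, ram_bytes=8 * GB)
    print(fsutil_command(r"C:\filler.bin", size))
```

Delete the filler file once you have seen the alert; the health rule clears on a later run after free space is back above the limits.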