It’s not unnatural to assume an IIS process hang when web client browsers begin reporting either “Page cannot be displayed . . . cannot find server or dns error” (IE6) or “Internet Explorer cannot display the webpage. . .” (IE7). But when an IISRESET fails to provide even temporary relief, it is time to begin investigating possibilities outside of IIS. When it is clear that only a reboot of the server brings temporary respite, troubleshooting should consider the possibility of an NPP (non-paged pool) memory leak in the kernel. Checking the httperr log for evidence of “connections_refused” should verify whether or not NPP is depleted. Poolmon and Perfmon can be used to confirm the NPP leak and determine root cause. Root cause often points to an outdated third-party driver which, when updated or uninstalled, solves the NPP leak.
The best clue is found in the little-known httperr log. While there are many possible causes for the “Page cannot be displayed” error, there is only one root cause which makes the http.sys driver begin refusing client connections: a depletion of non-paged pool memory, an NPP leak. The http.sys driver was new with Windows 2003, is a kernel mode driver, and, at the risk of splitting hairs, is technically not part of IIS 6.0. This distinction is important in troubleshooting. When http.sys refuses to hand connections to IIS, a “Connection_refused” or “Connections_refused” will be logged in the httperr log (C:\WINDOWS\system32\Logfiles\HTTPERR) rather than the IIS logs. Also, if adplus.vbs or debugdiag is used to make a memory dump of IIS processes at this time, dump analysis will show no problems (and probably no connections either) in the IIS processes.
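As a rough sketch of that check, the following Python snippet counts refused-connection entries per day in httperr log lines. The sample lines are hypothetical and invented for illustration, and the script assumes only that each entry is a space-separated line beginning with the date and that header lines start with “#”. It is not a Microsoft tool, just one way to scan the log.

```python
import re
from collections import Counter

# Matches both spellings mentioned in the text, case-insensitively.
REFUSED = re.compile(r"connections?_refused", re.IGNORECASE)

def count_refusals_by_date(lines):
    """Count refused-connection entries per date (first field of each entry)."""
    counts = Counter()
    for line in lines:
        # Skip header lines and entries that are not refusals.
        if line.startswith("#") or not REFUSED.search(line):
            continue
        counts[line.split(" ", 1)[0]] += 1
    return counts

# Hypothetical sample entries, invented for illustration only.
sample = [
    "#Version: 1.0",
    "2008-03-04 10:15:02 - - - - - - - - - 3_Connections_Refused -",
    "2008-03-04 10:15:07 - - - - - - - - - 1_Connections_Refused -",
    "2008-03-04 10:16:00 10.0.0.5 1234 10.0.0.1 80 HTTP/1.1 GET / - 1 Timer_ConnectionIdle DefaultAppPool",
]
print(dict(count_refusals_by_date(sample)))
```

On a real server the same function could be fed the lines of each file in the HTTPERR folder named above; any non-empty result is the signal the rest of this post is about.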
KB 820729 offers only one possible explanation for a “Connections_refused” error: “The kernel NonPagedPool memory has dropped below 20MB and http.sys has stopped receiving new connections.” I wouldn’t get too set on the 20MB threshold as there seem to be exceptions. Sometimes connections get refused when it appears that more than 20MB of NPP is left. So the threshold may be closer to 20% of the total NPP memory than to a fixed 20MB; that is, perhaps http.sys begins to refuse client connections once NPP is 80% depleted. But whatever the official answer is there, suffice it to say that if you see “connections_refused” in the httperr log, you can safely assume this is an NPP leak. With its default settings, http.sys is the ‘canary in the coal mine’ in an NPP leak scenario. On an Exchange server, for instance, the OWA website becomes unavailable to clients long before SMTP stops delivering mail. But give the leak enough time and mail will stop flowing. Allow the NPP leak to go further still and the administrator probably won’t be able to connect to the server via RDP.
So once evidence of http.sys refusing client connections has been found in the httperr log, what comes next? Some decisions have to be made.
If immediate relief is needed, perhaps the server needs to be rebooted and troubleshooting can begin after the reboot with the Perfmon and Poolmon utilities. Poolmon and Perfmon are the best way to find which driver is causing the leak. (I’d say more about this but the Microsoft Product Support Platforms Performance team usually handles this for me.) But if root cause analysis is needed more urgently than relief of the symptoms, the reboot needs to be avoided. Perfmon and Poolmon can be used to track the leak further. No IIS websites will be reachable of course, but other services may continue to function a little longer.
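To give a feel for what tracking the leak can look like, here is a minimal Python sketch that estimates a leak rate from a handful of Perfmon readings of the Memory\Pool Nonpaged Bytes counter. The readings are hypothetical, and the least-squares fit is simply one reasonable way to summarize a trend, not something Perfmon or Poolmon does for you.

```python
def leak_rate_bytes_per_hour(samples):
    """Least-squares slope of non-paged pool bytes over time.

    samples: list of (seconds_since_start, pool_nonpaged_bytes) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (b - mean_b) for t, b in samples)
    # Slope is in bytes per second; convert to bytes per hour.
    return cov / var_t * 3600

MB = 1024 * 1024
# Hypothetical readings taken every 30 minutes.
readings = [(0, 60 * MB), (1800, 62 * MB), (3600, 64 * MB), (5400, 66 * MB)]
rate = leak_rate_bytes_per_hour(readings)
print(f"{rate / MB:.1f} MB/hour")
```

A steadily positive rate that never falls back is what distinguishes a leak from ordinary load; Poolmon is still needed to see which pool tag, and therefore which driver, is growing.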
If the server is on SP2 for Windows 2003, it may be a very good idea to try disabling TCP Chimney. This can be done without a reboot per KB 945977. Disable TCP Chimney, run an IISRESET, wait a few seconds, and test the websites. If this ‘trick’ works, perhaps your NIC drivers need to be updated, after which TCP Chimney can be enabled again.
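The sequence above can be sketched as a small Python wrapper. The netsh command is the one I understand KB 945977 to give for Windows Server 2003 SP2, so confirm it against the KB before running it; the wrapper itself and its names (STEPS, run_steps) are purely illustrative. By default it only prints the commands.

```python
import subprocess

# TCP Chimney command as given (to my understanding) in KB 945977,
# followed by the IISRESET recommended in the text.
STEPS = [
    ["netsh", "int", "ip", "set", "chimney", "DISABLED"],
    ["iisreset"],
]

def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the sequence if a command fails.
            subprocess.run(argv, check=True)

run_steps(STEPS)  # dry run: only prints the commands
```

Calling run_steps(STEPS, dry_run=False) from an administrative prompt on the server would actually execute them; testing the websites afterwards remains a manual step.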
Open Task Manager, checkmark “Show processes from all users,” expand the View menu, choose Select Columns, and add checkmarks beside Non-paged Pool, User Name, and PID. This may show you whether any particular processes seem gluttonous with NPP memory. Such information may or may not provide a good clue. Regardless, it will help you know where your NPP is allocated.
Since the usual cause of an NPP leak is an outdated driver, try stopping various services one at a time (such as antivirus services or backup software services), running IISRESET, waiting a few seconds, and testing the websites after each one. This method might give you a clue as to what is leaking. If, for example, you stop antivirus services and the amount of available NPP increases and remains stable, perhaps the ultimate solution is simply to contact the antivirus vendor and ask for recommended updates to the product installed on the server. It may be that simple.
You also might consider whether or not you want to enable aggressive memory usage in the registry (EnableAggressiveMemoryUsage). This allows http.sys to keep accepting client connections and handing them off to IIS even when available NPP falls well below 20MB. This can give you some added relief but doesn’t solve the NPP leak.
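For illustration, this Python sketch generates the text of a .reg file that sets the value. The key path is where I understand KB 934878 and KB 820129 to place the http.sys parameters, so treat it as an assumption and verify it against those articles before importing anything; the helper name reg_file_text is made up for this example.

```python
# Assumed location of the http.sys parameters; verify against KB 820129.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\HTTP\Parameters"

def reg_file_text(value_name, dword):
    """Return .reg file text that sets one DWORD value under KEY."""
    return (
        "Windows Registry Editor Version 5.00\r\n\r\n"
        f"[{KEY}]\r\n"
        f'"{value_name}"=dword:{dword:08x}\r\n'
    )

print(reg_file_text("EnableAggressiveMemoryUsage", 1))
```

As far as I know, http.sys reads these settings when the HTTP service starts, so the change is unlikely to take effect until that service and the IIS services that depend on it are restarted.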
If you have the /3GB switch set, you might want to consider whether it is truly needed or not. Removing it may give you more NPP. But it also may cause other problems.
References and Resources
933844 Error message when you try to view a Web page that is hosted on IIS 6.0: “Page cannot be displayed”
934878 Users receive a “The page cannot be displayed” error message, and “Connections_refused” entries are logged in the Httperr.log file on a server that is running Windows Server 2003, Exchange 2003, and IIS 6.0
945977 Some problems occur after installing Windows Server 2003 SP2 (http://support.microsoft.com/?id=945977)
912376 How to monitor and troubleshoot the use of paged pool memory in Exchange Server 2003 or in Exchange 2000 Server
177415 How to Use Memory Pool Monitor (Poolmon.exe) to Troubleshoot Kernel Mode Memory Leaks
248345 How to Create a Log Using System Monitor in Windows 2000
244139 Windows Feature Allows a Memory.dmp File to Be Generated with Keyboard
Event ID 2019 or 2020 or “Insufficient System Resources” error returned when logging on
You experience difficulties when you use an Outlook client computer to connect to a front-end server that is running Exchange Server 2003
820129 Http.sys registry settings for IIS
820729 Error logging in HTTP API
Memory Management – Understanding Pool Resources
Memory Management – Demystifying /3GB