Understanding Pool Consumption and Event ID: 2020 or 2019


 

Hi!  My name is Tate.  I’m an Escalation Engineer on the Microsoft Critical Problem Resolution Platforms Team.  I wanted to share one of the most common errors we troubleshoot here on the CPR team, its root cause being pool consumption, and the methods by which we can remedy it quickly!

 

This issue is commonly misdiagnosed; however, 90% of the time it is actually quite possible to determine the resolution quickly without any serious effort at all!

 

 

First, what do these events really mean?

 

Event ID 2020
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2020
Description:
The server was unable to allocate from the system paged pool because the pool was empty.

 

Event ID 2019
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2019
Description:
The server was unable to allocate from the system NonPaged pool because the pool was empty.

 

 

This is our friend the Server Service reporting that when it was trying to satisfy a request, it was not able to find enough free memory of the respective type of pool.  2020 indicates Paged Pool and 2019, NonPaged Pool.  This doesn’t mean that the Server Service (srv.sys) is broken or the root cause of the problem; more often it is simply the first component to see the resource problem and report it to the Event Log.  Thus, there could be (and usually are) a few more symptoms of pool exhaustion on the system, such as hangs, out of resource errors reported by drivers or applications, or all of the above!

 

 

What is Pool?

 

First, Pool is not the amount of RAM on the system; it is a segment of the virtual address space that Windows reserves at boot.  These pools are finite because address space itself is finite.  Because 32-bit (x86) machines can address 2^32 bytes == 4GB, Windows uses (by default) 2GB for applications and 2GB for kernel.  That 2GB of kernel space must also hold other things, such as Page Table Entries (PTEs), so the maximum amount of Paged Pool on 32-bit (x86) is ~460MB, which puts our realistic limits per processor architecture in perspective.  As this implies, 64-bit (x64 & ia64) machines have less of a problem here due to their larger address space, but there are still limits and thus no free lunch.
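The arithmetic behind those figures is easy to check.  Here is a minimal Python sketch that uses only the numbers quoted above; the ~460MB value is the approximate x86 Paged Pool maximum from this post, not something the code derives.

```python
# Address-space arithmetic for a 32-bit (x86) machine, using the figures above.
GIB = 1024 ** 3
MIB = 1024 ** 2

total_address_space = 2 ** 32                    # bytes addressable with 32 bits
user_space = 2 * GIB                             # default share for applications
kernel_space = total_address_space - user_space  # what is left for the kernel

paged_pool_max = 460 * MIB                       # approximate maximum quoted in this post

print(total_address_space // GIB, "GB of address space in total")
print(kernel_space // GIB, "GB of it for the kernel")
print(f"Paged Pool can be at most about {100 * paged_pool_max / kernel_space:.0f}% of kernel space")
```

The point of the last line is that even the best case for Paged Pool is a modest slice of kernel address space, no matter how much RAM is installed.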

 

*For more about determining current pool limits see the common question post “Why am I out of Paged Pool at ~200MB…” at the end of this post.

 

*For more info about pools:  About Memory Management > Memory Pools

*This has changed a bit for Vista, see Dynamic Kernel Address space

 

 

What are these pools used for?

 

These pools are used by the kernel directly, by the kernel indirectly when it supports various structures on behalf of application requests (CreateFile for example), and by drivers installed on the system, whose memory allocations are made via the kernel pool allocation functions.

 

Literally, NonPaged means that this memory, once allocated, will not be paged to disk and is thus resident at all times, which is an important feature for drivers.  Paged, conversely, can be, well… paged out to disk.  In the end though, all this memory is allocated through a common set of functions, the most common being ExAllocatePoolWithTag.

 

 

Ok, so what is using it/abusing it? (our goal right!?)

 

Now that we know that the culprit is Windows or a component shipping with Windows, a driver, or an application requesting lots of things that the kernel has to create on its behalf, how can we find out which?

 

There are really four basic methods that are typically used (listed in order of increasing difficulty):

 

1.)    Find By Handle Count

 

Handle Count?  Yes, considering that we know that an application can request something of the OS that it must then in turn create and provide a reference to…this is typically represented by a handle, and thus charged to the process’ total handle count!

 

The quickest way by far if the machine is not completely hung is to check this via Task Manager.  Ctrl+Shift+Esc…Processes Tab…View…Select Columns…Handle Count.  Sort on Handles column now and check to see if there is a significantly large one there (this information is also obtainable via Perfmon.exe, Process Explorer, Handle.exe, etc.).

 

What’s large?  Well, typically we should raise an eyebrow at anything over 5,000 or so.  Now that’s not to say that over this amount is inherently bad, just know that there is no free lunch and that a handle to something usually means that on the other end there is a corresponding object stored in NonPaged or Paged Pool which takes up memory.
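As a rough illustration of that rule of thumb, here is a small Python sketch that flags processes above the 5,000 mark.  The process names and counts are hypothetical sample data, as if copied from Task Manager’s Handles column; collecting real counts is left out, and the threshold is the eyebrow-raising figure from above, not a hard limit.

```python
from typing import Dict, List, Tuple

HANDLE_THRESHOLD = 5000  # rule-of-thumb figure from the text, not a hard limit


def suspicious_processes(handle_counts: Dict[str, int],
                         threshold: int = HANDLE_THRESHOLD) -> List[Tuple[str, int]]:
    """Return (process, handles) pairs above the threshold, largest first."""
    flagged = [(name, count) for name, count in handle_counts.items() if count > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample data for illustration only.
sample = {"mybadapp.exe": 100000, "lsass.exe": 1200, "store.exe": 13600, "explorer.exe": 800}

for name, count in suspicious_processes(sample):
    print(f"{name}: {count} handles")
```

A flagged process is only a candidate; whether it is really the problem depends on whether its count keeps growing and whether pool is actually running short.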

 

So for example let’s say we have a process that has 100,000 handles, mybadapp.exe.  What do we do next?

 

Well, if it’s a service we could stop it (which releases the handles), or if it’s an application running interactively, try to shut it down, and then look to see how much total Kernel Memory (Paged or NonPaged, depending on which one we are short of) we get back.  If we were at 400MB of Paged Pool (look at Performance Tab…Kernel Memory…Paged) and after stopping mybadapp.exe with its 100,000 handles we are now at a reasonable 100MB, well, there’s our bad guy.  The next step would be following up with the owner, or further investigating what type of handles are being consumed (with Process Explorer from Sysinternals or the Windows debugger, for example).

 

Tip: 

For essential yet legacy applications, for which there is no hope of replacement or support, we may consider setting up a performance monitor alert on the handle count when it hits a couple thousand or so (Performance Object: Process, Counter: Handle Count) and taking action to restart the bad service.  This is a less than elegant solution for sure, but it could keep the one rotten apple from spoiling the bunch by hanging/crashing the machine!
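The logic of such an alert is simple enough to sketch in Python.  This is only an illustration of the idea, not how Performance Monitor implements alerts: the readings are hypothetical Handle Count samples, and restart_service is a placeholder callback standing in for whatever action restarts the bad service.

```python
from typing import Callable, Iterable, Optional


def watch_handle_count(samples: Iterable[int],
                       threshold: int,
                       restart_service: Callable[[], None]) -> Optional[int]:
    """Call restart_service() once, at the first sample at or above threshold.

    Returns the index of that sample, or None if the threshold was never reached.
    """
    for index, handles in enumerate(samples):
        if handles >= threshold:
            restart_service()
            return index
    return None


restarts = []
readings = [1800, 1950, 2100, 2600]  # hypothetical periodic Handle Count readings
hit = watch_handle_count(readings, threshold=2000,
                         restart_service=lambda: restarts.append("restarted"))
print(hit, restarts)
```

Picking the threshold is the judgment call: it should sit well above the application’s normal working level but low enough that the restart happens before pool runs out.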

 

2.)    By Pooltag (as read by poolmon.exe)

 

Okay, so no handle count gone wild? No problem.

 

For Windows 2003 and later machines, a feature is enabled by default that allows tracking of the pool consumer via something called a pooltag.  For previous OSes we will need to use a utility such as gflags.exe to Enable Pool Tagging (which requires a reboot unfortunately).  A pooltag is usually just a 3-4 character string, or more technically “a character literal of up to four characters delimited by single quotation marks”, that the caller of the kernel API to allocate the pool provides as its 3rd parameter.  (see ExAllocatePoolWithTag)
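Under the covers the tag is just a 32-bit value whose four bytes are the characters.  The Python sketch below shows that mapping for a four-character tag, assuming a little-endian machine such as x86; this byte order is also why driver source commonly writes the literal reversed (for example 'erhT') so that it reads forward in memory and in the tools.  The helper names are made up for this illustration, and shorter tags are left out to keep it simple.

```python
def tag_to_ulong(tag: str) -> int:
    """Return the 32-bit value whose little-endian bytes spell a four-character tag."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("this sketch only handles four-character tags")
    return int.from_bytes(raw, "little")


def ulong_to_tag(value: int) -> str:
    """Recover the four characters from the 32-bit value, in memory order."""
    return value.to_bytes(4, "little").decode("ascii")


value = tag_to_ulong("Thre")
print(hex(value))           # 0x65726854
print(ulong_to_tag(value))  # Thre
```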

 

The tool that we use to get the information about what pooltag is using the most is poolmon.exe.  Launch this from a cmd prompt, hit B to sort by bytes descending and P to sort the list by the type (Paged, NonPaged, or Both) and we have a live view into what’s going on in the system.  Look specifically at the Tag Name and its respective Byte Total column for the guilty party!  Get Poolmon.exe Here  or More info about poolmon.exe usage. 

 

The cool thing is that we have most of the OS utilized pooltags already documented in pooltag.txt, so we can check whether there is a match for one of the Windows components.  So if we see MmSt as the top tag, for instance, consuming far and away the largest amount, we can look at pooltag.txt and know that it’s the memory manager.  Using that tag in a search engine query, we might also find the more popular KB304101, which may resolve the issue!

 

We will find pooltag.txt in the …\Debugging Tools for Windows\triage folder when the debugging tools are installed.
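Looking a tag up by hand works fine, but the file is also easy to index.  Here is a minimal Python sketch, assuming each entry has the shape “tag - owner - description” (with either a hyphen or a dash as the separator).  Only the Thre line comes from this post; the Xyz1 entry and mydriver.sys are invented for the example, and the real file should be read from wherever the debugging tools are installed.

```python
import re
from typing import Dict, Iterable, Tuple

# Assumed line shape: "<tag> - <owner> - <description>", hyphen or en dash as separator.
_ENTRY = re.compile(r"^\s*(\S{1,4})\s+[-\u2013]\s+(\S+)\s+[-\u2013]\s+(.*\S)\s*$")


def load_pooltags(lines: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Map each tag to (owner, description); lines that do not fit the shape are skipped."""
    tags = {}
    for line in lines:
        match = _ENTRY.match(line)
        if match:
            tag, owner, description = match.groups()
            tags[tag] = (owner, description)
    return tags


sample_lines = [
    "rem hypothetical excerpt in the style of pooltag.txt",
    "Thre - nt!ps        - Thread objects",
    "Xyz1 - mydriver.sys - Made-up third-party entry",
]

tags = load_pooltags(sample_lines)
print(tags.get("Thre"))
print(tags.get("Nope", "not listed, so probably a third-party driver"))
```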

 

Oh no, what if it’s not in the list? No problem…

 

We might be able to find its owner by using one of the following techniques:

 

• For 32-bit versions of Windows, use poolmon /c to create a local tag file that lists each tag value assigned by drivers on the local machine (%SystemRoot%\System32\Drivers\*.sys). The default name of this file is Localtag.txt.

 

• (Really all versions) For Windows 2000 and Windows NT 4.0, use Search to find files that contain a specific pool tag, as described in KB298102, How to Find Pool Tags That Are Used By Third-Party Drivers.

From:  http://www.microsoft.com/whdc/driver/tips/PoolMem.mspx
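The search technique boils down to looking for the tag’s characters inside each driver binary.  A minimal Python sketch of that idea follows; the function name and the Xyz1 tag are made up for the example, and the demo runs against two fake files in a temporary folder rather than the real Drivers directory.  A hit only means the characters appear somewhere in the file, so treat it as a lead rather than proof.

```python
import os
import tempfile
from typing import List


def drivers_containing_tag(directory: str, tag: str) -> List[str]:
    """Return names of .sys files in directory whose bytes contain the tag."""
    needle = tag.encode("ascii")
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.lower().endswith(".sys") and os.path.isfile(path):
            with open(path, "rb") as handle:
                if needle in handle.read():
                    matches.append(name)
    return matches


# Self-contained demo: two fake "drivers" in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "good.sys"), "wb") as f:
        f.write(b"\x00\x01nothing to see\x00")
    with open(os.path.join(demo_dir, "leaky.sys"), "wb") as f:
        f.write(b"\x90\x90Xyz1\x00\x00")
    found = drivers_containing_tag(demo_dir, "Xyz1")

print(found)
```

On a real machine the directory to pass in would be the drivers folder mentioned above (%SystemRoot%\System32\Drivers).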

 

 

 

3.)    Using Driver Verifier

 

Using driver verifier is a more advanced approach to this problem.  Driver Verifier provides a whole suite of options targeted mainly at the driver developer to run what amounts to quality control checks before shipping their driver.

 

However, should pooltag identification be a problem, there is a facility here in Pool Tracking that does the heavy lifting in that it will do the matching of Pool consumer directly to driver!

 

Be careful however, the only option we will likely want to check is Pool Tracking as the other settings are potentially costly enough that if our installed driver set is not perfect on the machine we could get into an un-bootable situation with constant bluescreens notifying that xyz driver is doing abc bad thing and some follow up suggestions.

 

In summary, Driver Verifier is a powerful tool at our disposal but use with care only after the easier methods do not resolve our pool problems.

 

4.)    Via Debug (live and postmortem)

 

As mentioned earlier, the API being used here to allocate this pool memory is usually ExAllocatePoolWithTag.  If we have a kernel debugger set up we can set a break point here to brute force debug who our caller is….but that’s not usually how we do it, can you say, “extended downtime?”  There are other creative live debug methods which are a bit more advanced that we may post later…

 

Usually, debugging this problem involves a post mortem memory.dmp taken from a machine that has logged Event ID 2020 or Event ID 2019, is hung or no longer responsive to client requests, or often both.  We can gather this dump via the Ctrl+Scroll Lock method (see KB244139), even while the machine is “hung” and seemingly unresponsive to the keyboard or Ctrl+Alt+Del!

 

When loading the memory.dmp via windbg.exe or kd.exe we can quickly get a feel for the state of the machine with the following commands.

 

Debugger output Example 1.1  (the !vm command)

 

2: kd> !vm 
*** Virtual Memory Usage ***
  Physical Memory:   262012   ( 1048048 Kb)
  Page File: \??\C:\pagefile.sys
     Current:   1054720Kb Free Space:    706752Kb
     Minimum:   1054720Kb Maximum:      1054720Kb
  Page File: \??\E:\pagefile.sys
     Current:   2490368Kb Free Space:   2137172Kb
     Minimum:   2490368Kb Maximum:      2560000Kb
  Available Pages:    63440   (  253760 Kb)
  ResAvail Pages:    194301   (  777204 Kb)
  Modified Pages:       761   (    3044 Kb)
  NonPaged Pool Usage: 52461   (  209844 Kb)<<NOTE!  Value is near NonPaged Max
  NonPaged Pool Max:   54278   (  217112 Kb)
  ********** Excessive NonPaged Pool Usage *****

 

Note how the NonPaged Pool Usage value is near the NonPaged Pool Max value.  This tells us that we are basically out of NonPaged Pool.
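To put a number on “near”, we can pull the two page counts out of the !vm text and compare them.  The Python sketch below does that for the two lines shown in Example 1.1; the helper name is made up, and the 4 KB page size is an assumption that matches x86 and agrees with the Kb figures in the output.

```python
import re

VM_OUTPUT = """\
  NonPaged Pool Usage: 52461   (  209844 Kb)
  NonPaged Pool Max:   54278   (  217112 Kb)
"""

PAGE_KB = 4  # assumed x86 page size; 52461 pages * 4 KB matches the 209844 Kb shown


def pool_pages(text: str, label: str) -> int:
    """Return the page count that follows a label such as 'NonPaged Pool Usage:'."""
    match = re.search(re.escape(label) + r"\s*(\d+)", text)
    if match is None:
        raise ValueError(f"{label!r} not found")
    return int(match.group(1))


usage = pool_pages(VM_OUTPUT, "NonPaged Pool Usage:")
maximum = pool_pages(VM_OUTPUT, "NonPaged Pool Max:")

print(usage * PAGE_KB, "Kb in use out of", maximum * PAGE_KB, "Kb")
print(f"{100 * usage / maximum:.1f}% of NonPaged Pool Max")
```

For this dump the ratio comes out a little under 97%, which is why the debugger flags excessive usage.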

 

Here we can use the !poolused command to give the same information that poolmon.exe would have but in the dump….

 

Debugger output Example 1.2  (!poolused 2)

 

Note the 2 value passed to !poolused orders pool consumers by NonPaged

 

2: kd> !poolused 2
   Sorting by NonPaged Pool Consumed
  Pool Used:
            NonPaged            Paged
Tag    Allocs     Used    Allocs     Used
Thre   120145 76892800         0        0
File   187113 29946176         0        0
AfdE    89683 25828704         0        0
TCPT    41888 18765824         0        0
AfdC    90964 17465088         0        0 
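If the list is long, the rows can be ranked outside the debugger too.  Here is a minimal Python sketch that parses rows of the shape shown above (a tag followed by four numbers) and reports the top NonPaged consumer along with its average allocation size.  The parser is an assumption about this particular layout, and it would skip tags that contain spaces.

```python
from typing import List, NamedTuple


class PoolRow(NamedTuple):
    tag: str
    nonpaged_allocs: int
    nonpaged_bytes: int
    paged_allocs: int
    paged_bytes: int


def parse_poolused(text: str) -> List[PoolRow]:
    """Parse '!poolused'-style rows; anything that is not tag + four integers is skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 5 and all(part.isdigit() for part in parts[1:]):
            rows.append(PoolRow(parts[0], *map(int, parts[1:])))
    return rows


POOLUSED = """\
Tag    Allocs     Used    Allocs     Used
Thre   120145 76892800         0        0
File   187113 29946176         0        0
AfdE    89683 25828704         0        0
TCPT    41888 18765824         0        0
AfdC    90964 17465088         0        0
"""

rows = sorted(parse_poolused(POOLUSED), key=lambda row: row.nonpaged_bytes, reverse=True)
top = rows[0]
print(top.tag, "uses", top.nonpaged_bytes, "bytes,",
      top.nonpaged_bytes // top.nonpaged_allocs, "bytes per allocation")
```

A very large allocation count with a small, fixed size per allocation, as here, usually points at many instances of one kind of object rather than a few big buffers.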

 

We now see the “Thre” tag at the top of the list, the largest consumer of NonPaged Pool, let’s go look it up in pooltag.txt….

 

Thre – nt!ps        – Thread objects

 

Note, the nt before the ! means that this is NT or the kernel’s tag for Thread objects.

So from our earlier discussion, if we have a bunch of thread objects, we probably have an application on the system with a ton of handles and/or a ton of threads, so it should be easy to find!

 

Via the debugger we can find this out easily with the !process 0 0 command, which here shows a TableSize (Handle Count) of over 90,000 for mybadapp.exe!

 

Debugger output Example 1.3  (the !process command)

 

Note the two zeros after !process separated by a space gives a list of all running processes on the system.

 

 

PROCESS 884e6520  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0124
DirBase: 110f6000  ObjectTable: 88584448  TableSize: 90472
Image: mybadapp.exe

 

We can dig further here into looking at the threads…

 

Debugger output Example 1.4  (the !process command continued)

 

0: kd> !PROCESS 884e6520 4
PROCESS 884e6520  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0124
DirBase: 110f6000  ObjectTable: 88584448  TableSize: 90472.
Image: mybadapp.exe
        THREAD 884d8560  Cid 1a0.19c  Teb: 7ffde000  Win32Thread: a208f648 WAIT
        THREAD 88447560  Cid 1a0.1b0  Teb: 7ffdd000  Win32Thread: 00000000 WAIT
        THREAD 88396560  Cid 1a0.1b4  Teb: 7ffdc000  Win32Thread: 00000000 WAIT
        THREAD 88361560  Cid 1a0.1bc  Teb: 7ffda000  Win32Thread: 00000000 WAIT
        THREAD 88335560  Cid 1a0.1c0  Teb: 7ffd9000  Win32Thread: 00000000 WAIT
        THREAD 88340560  Cid 1a0.1c4  Teb: 7ffd8000  Win32Thread: 00000000 WAIT
 And the list goes on…

 

We can examine the thread via !thread 88340560 from here and so on…

 

So in this rudimentary example the offender is clear in mybadapp.exe in its abundance of threads and one could dig further to determine what type of thread or functions are being executed and follow up with the owner of this executable for more detail, or take a look at the code if the application is yours!
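When the !process listing runs to thousands of lines, a few lines of script can do the counting for us.  The Python sketch below is one way to summarize saved debugger text of the shape shown in Examples 1.3 and 1.4; the summarize function is made up for this post, and it only understands the PROCESS, TableSize, Image and THREAD lines that appear above.

```python
import re
from typing import Dict, List

PROCESS_OUTPUT = """\
PROCESS 884e6520  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0124
DirBase: 110f6000  ObjectTable: 88584448  TableSize: 90472.
Image: mybadapp.exe
        THREAD 884d8560  Cid 1a0.19c  Teb: 7ffde000  Win32Thread: a208f648 WAIT
        THREAD 88447560  Cid 1a0.1b0  Teb: 7ffdd000  Win32Thread: 00000000 WAIT
        THREAD 88396560  Cid 1a0.1b4  Teb: 7ffdc000  Win32Thread: 00000000 WAIT
"""


def summarize(text: str) -> List[Dict[str, object]]:
    """Return one record per PROCESS block: address, image, table size, thread count."""
    processes: List[Dict[str, object]] = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("PROCESS "):
            processes.append({"address": stripped.split()[1], "image": None,
                              "table_size": None, "threads": 0})
            continue
        if not processes:
            continue  # ignore anything before the first PROCESS line
        current = processes[-1]
        image = re.match(r"Image:\s*(\S.*)", stripped)
        size = re.search(r"TableSize:\s*(\d+)", stripped)
        if image:
            current["image"] = image.group(1)
        elif size:
            current["table_size"] = int(size.group(1))
        elif stripped.startswith("THREAD "):
            current["threads"] += 1
    return processes


for record in sorted(summarize(PROCESS_OUTPUT), key=lambda r: r["table_size"] or 0, reverse=True):
    print(record["image"], record["table_size"], "handles,", record["threads"], "threads listed")
```

Records are kept per PROCESS address rather than per image name, so several instances of the same executable stay separate.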

 

 

 

Common Question:

 

Why am I out of Paged Pool at ~200MB when we say that the limit is around 460MB?

 

This is because the memory manager decided at boot, given the current amount of RAM on the system and other memory manager settings such as /3GB, etc., that our max is X amount rather than the absolute maximum.  There are two ways to see the maximums on a system.

 

1.)   Process Explorer using its Task Management.  View…System Information…Kernel Memory section.

 

Note that we have to specify a valid path to dbghelp.dll and Symbols path via Options…Configure Symbols.

 

For example:

 

      Dbghelp.dll path:

c:\<path to debugging tools for windows>\dbghelp.dll

 

Symbols path:

SRV*C:\websymbols*http://msdl.microsoft.com/download/symbols

 

2.)   The debugger (live or via a memory.dmp by doing a !vm)

 

*NonPaged pool size is not configurable other than the /3GB boot.ini switch which lowers NonPaged Pool’s maximum.

128MB with the /3GB switch, 256MB without

 

Conversely, Paged Pool size is often able to be raised to around its maximum manually via the PagedPoolSize registry setting which we can find for example in KB304101.

 

 

So what is this Pool Paged Bytes counter I see in Perfmon for the Process Object?

 

This counter reflects allocations that are charged to a process via ExAllocatePoolWithQuotaTag.  Typically, we will see ExAllocatePoolWithTag used instead, and thus this counter is less effective…but hey…don’t pass up free information in Perfmon, so be on the lookout for this easy win.

 

 

Additional Resources:

 

 “Who’s Using the Pool?” from Driver Fundamentals > Tips: What Every Driver Writer Needs to Know

http://www.microsoft.com/whdc/driver/tips/PoolMem.mspx

 

Poolmon Remarks:  http://technet2.microsoft.com/WindowsServer/en/library/85b0ba3b-936e-49f0-b1f2-8c8cb4637b0f1033.mspx

 

 

 

 

 I hope you have enjoyed this post and hopefully it will get you going in the right direction next time you see one of these events or hit a pool consumption issue!

 

-Tate

 

Comments (41)

  1. Richard says:

    Hi Tate.

    What a great blog entry. Nice detailed and good explanation on what you are doing and why.

    2-3 years ago, I started doing memory dump analysis, but only after attending Mark and David’s 5 day session on troubleshooting Windows was I able to do a good memory dump analysis. This blog entry is at the same level (for me, being a sysadmin and not a dev guy).

    This entry is a sure keeper, and I just subscribed to the blog.

    Hoping to see more entries with the same level of information.

    rg

    Richard

  2. shubhankar says:

    Great explanation. Superb!

    I have done memory dump analysis for customers just on the basis of debugging help file but this entry really helps in understanding pool consumption.

    Thanks

  3. SpodBoy says:

    Thanks for this. A great explanation that helped troubleshoot my issue in next to no time.

    Thanks

    SpodBoy

  4. Greg Kreis says:

    Thanks to your excellent article, with the paged and non-paged memory columns enabled on Taskman, I found that WZCSLDR2.exe was eating up 105k of non-paged memory and growing by 4-6k every second!  No other app was anywhere near that much in non-paged memory usage.  I researched it and many said it is related to the Dlink wireless card. I decided to disable it from startup and the wireless card still works and no more non-paged hogging!  Whew….  THANKS!!!

  5. Santosh says:

    It was of great help.

    Cheers Mate!

  6. Hi again! This is Tate from the CPR team and I’m going to show you how to debug a Server Service hang

  7. Mfartura says:

    Hi Tate, great post!  Thank you.

    Just one thing I would like to verify:  As far as I know Windows XP doesn’t have the pool tagging option enabled by default.  You would need to run the gflags, enable it and boot the system in order to have to be able to use a tool like poolmon.  Windows 2003 will do by default though.  

    Am I wrong?

  8. ntdebug says:

    Yes, Windows XP will require you enable pooltagging with gflags before using tools like poolmon.

    Thank you,  Jeff-

  9. Mike says:

    Great post!

    Thanks a lot. I experienced unavailability of the PDC. After reading this, I discovered TrendMicro NtListener using over 113k handles.

    Thanks for great help!

  10. Joerg Fischer says:

    With the help of your clear explanation and hints about pool consumption and Event ID 2019, I found statusclient.exe from the HP Toolbox, with more than 500,000 handles, to be the cause of our server trouble after migration from Windows adv. server 2000 to Windows enterprise server 2003 R2. Every 3-5 days a restart was necessary because the server stopped responding and logged ID 2019 continuously.

    Thanks for this helpful contribution !

  11. Martin Necas says:

    Is there any chance to determine which process(es) are allocating memory (and how much) through “MmAllocateContiguousMemory”?

    I have a high consumption of nonpaged memory pool, and the only thing I see in PoolMon is the MmCm tag with 40 MB in size.

    I am running almost out of nonpaged memory, because I am running with /3GB (Exchange).

    Thanks

    [Great question. I will likely follow up later with the debugging method that you can use, but for now, most of the recent issues we have seen on this revolve around the new Scalable Networking Pack release (KB912222), which most customers are getting as a result of installing Windows 2003 SP2. KB936594 is the article for turning off the new functionality that is (most likely) consuming the contiguous memory and for avoiding any potential incompatibility with your NIC drivers. Three major components are usually the primary consumers of this memory, by the way: Video Card Driver, Storage Driver, and Network Stack. For the Video, you could also switch to a standard VGA driver, which will not only get back the memory but more System PTEs as well. For the storage stack, all you can likely do is get on the latest recommended driver version from your storage vendor; the memory consumed here is usually the lowest of the three. Again, most of these cases we have seen have been because of the SNP and the need to use KB936594 to disable the functionality, especially on resource constrained /3GB machines. -Tate]
  12. Martin Necas says:

    Tate, thank you for the quick reply and explanation!

    I am looking forward to seeing the debugging method you mentioned.

    I tried to apply the patch from KB936594 to the other server (IBM x3655), where the MmCm was at 28 MB. After applying and restarting, the MmCm tag is at the same level, 28 MB in size :-(

    Just to make it all clear, I am using the latest possible BIOSes, firmware, drivers, … on both servers …

    [Sorry Martin, should have specified that it is the registry changes in that article that are most relevant and not the actual binaries mentioned therein…to disable the SNP features essentially is the goal]

     

  13. Hello, my name is Jeff Dailey, I’m an Escalation Engineer for the Global Escalation Services Platforms

  14. Hello, my name is Jeff Dailey, I’m an Escalation Engineer for the Global Escalation Services Platforms

  15. Martin Necas says:

    Jeff, I’ve read your article about !htrace, however I think I’m not in the “handle issue”. When I take a look at the processes and show the “handles” column, I see: store.exe – ~ 13600 handles, followed by System – ~ 11 120 handles. All others are below 4000. Should I search for the cause here or not?

    [So a handle count over 10,000 is only an issue if the count continues to grow or you are currently getting 2019 or 2020s. If your machine is running ok with no resource related errors, you can most likely ignore an elevated handle count. If however you are seeing low on pool conditions, you should investigate these processes first. Thank you Jeff- ]
  16. Preshan Naidoo says:

    Thanks for a great article. Help like this is rare, and the explanations given are even better. Well done for making me understand.

  17. Andy K says:

    Umm, help.  Spent way too many hours on this problem.  Have a ton more knowledge on the subject but no solution.

    Basically, Ddk continues to grow until the nonpaged pool drops below 20 meg and IIS starts to refuse connections.  It used to take 3 weeks for this to happen.  But after upgrading Trend Micro to 8.0 in an effort to stop the leak, the process of losing memory has accelerated.

    Sygate, Trend Micro as well as other system drivers use Ddk.  I’m at a loss I can find no webpage that will give me my smoking gun saying this version of this will cause a memory leak.  

    Anybody have any thoughts?

     

    [Yes, this problem is unfortunately all too common (use of the Ddk tag instead of some meaningful one to associate with the driver in question). In this case Driver Verifier should be able to help, as it will track pool usage by driver, not tag; see the “Using Driver Verifier” section in the post for this. -Tate]
  18. Andy K. says:

    Thanks for the advice.  Driver Verifier was able to prove Trend Micro was the cause of the leak on our dev server.  OfficeScan 8.0 has a known memory loss and has a patch.  We were able to plug that leak.  

    Unfortunately the production server has a different memory leak (it had the patched version of OfficeScan on it).  That leak is more occasional and of larger amounts of memory than the dev server leak.  The dev server leak was every couple of seconds and of only 4k at a time.  The production leak occurs at random times (occasionally), at about 2 megs a time.

    Very reluctant to turn on driver verifier on the production server.  On the test server we could not monitor(just pool tracking) the sygate drivers as the server would not boot if driver verifier was monitoring them.  Had to boot to safe mode and clear the settings.

    I think we are going to try to eliminate possibilities by shutting off certain services and see if the problem goes away.  Let me know if you have any other suggestions.

    Thanks, Andy

  19. Karan says:

    An amazing blog entry. It is very helpful and helped me a lot. Keep up the good work!!

  20. Matto says:

    [Moderator’s note: Our policy is to avoid naming specific 3rd party applications that behave badly.  So the following comment has been edited.] 

     

    Superb! Used the handle count; method 1 and result was an immediate smoking gun for me pointing at a process called abcdefg.exe which is currently using 98000 odd handles and climbing consistently at 8 per minute. Thank You.

  21. Source: The SoftGrid Team Blog I have been on site with a few customers lately and a common issue I have

  22. Ian says:

    Great and real easy for a novice like me to understand.

  23. Vijay says:

    Nice to know about Non paged Pool issues,

    Good article,

    Thanks,

    Vijay

  24. skp says:

    Is there are maximum handle count?

    [The practical limit is the respective Pool which will be consumed by the objects in question referenced by said handles and if not that then the Pool consumed by the handle table itself. Usually a few thousand handles borders on the abnormal considering this practical limit especially on x86 machines.]
  25. Floyd says:

    [Moderator’s note: Our policy is to avoid naming specific 3rd party applications that behave badly.  So the following comment has been edited.]

    Brilliant. Found my problem was with [app] from [company]. Just keeps on eating away at the pool. 115 thousand handles in 2 days… [The company] has no cure, so I turned it off.

  26. Rehan says:

    Hi Tate, great article. I was wondering if you had any advice in troubleshooting MmSt paged pool utilization. I’ve got a server on 2003 sp1 with MmSt using over 65mb in paged pool, with the total hitting 325MB. I’ve implemented the ms registry change in increasing to the maximum paged pool size with trimming at 60% which seems to be buying me time for now.

    However what I would really like to know is which processes are behind the utilization. From what I’ve been able to find out is that MmSt (beyond the brief definition in the pooltag.txt file) holds the system primitives which are responsible for tracking memory mapped files in the system. (the handle count on this machine doesn’t seem to indicate the issue – highest totals are 4000 file handles and 7000 event handles in the system as a whole)

    I can find the memory mapped files for each process on the machine via process explorer for example, but I’m not sure if the MmSt usage related to the size of the memory mapped files or their number.. or both?

    Any advice would be greatly appreciated so I can get back to the appropriate software vendor.

  27. !analyze -v says:

    "This document is a translation of the http://blogs.msdn.com/ntdebugging blog, and the original material may change without notice. This material comes with no legal warranty and

  28. Werner Schröter says:

    one of the best articles ever!

  29. Moloy says:

    Indeed, an excellent post. Continue sharing the knowledge :)

  30. Greg Harrington says:

    Any thoughts on how to diagnose the MmCm tag utilizing 80+MB when the SNP features are disabled? Would driver verifier or debug be useful or a waste of time?  Is there any way to pinpoint the root cause?

    [Hi Greg – to discover root cause with the debugger you can use the PoolHitTag method mentioned at http://msdn.microsoft.com/en-us/library/cc267847.aspx and watch for allocations of the same size as you observe populating pool with your !poolfind MmCm when the machine has consumed the 80MB. However, a typically far easier and faster method since this allocation usually happens at boot is to disable the NIC and Storage adapters on the machine to determine if they are indeed allocating this memory at boot and if so reenable only the minimum number of adapters afterwards.]
  31. william huber says:

    absolutely phenomenal.  This resolved the issue on my forest root after it stopped servicing requests.  I restarted the service that was responsible for the executable that had way too many handles and it cleared up.

    Thanks!

  32. Dlink and Linksys are really facing off to see who can get the longest distance routers lately, it’s funny how far they will go

  33. Don K says:

    As most of the folks who read this have said, it’s an excellent walk through of what I consider to be very complex issue with memory leaks and hangs.

    Much appreciated Tate!

    Thanks

  34. Ryan Fast says:

    Very helpful! Had a 2003 SBS that was crashing after a few weeks. Used poolmon to watch it for a while, and Thre was slowly (but steadily) growing. Used task manager to find process with highest handle count (Belkin UPS software). Removed the offending software, and poolmon does not show Thre hogging page space.

  35. Chris says:

    Wow. We must be long lost brothers Ryan. I just used this article to find the same culprit on an SBS 2003 server – BullyDog Plus 3.2.19.1108. Great article!

  36. Yul says:

    Amazing! Really helped me to solve the issue I was after over six months. I tried many different solutions including all KBs also listed in this thread – didn’t fix it. I keep getting events 2020 in about 24 hours of continuous work until I had to reboot it every day, was very annoying. Yesterday I found this forum. After reading this blog, I looked at the handle counter in the Task Manager, and what I saw was shocking near 960000 handles on ‘System’ process. When I drilled it down in the Process Explorer using Handle Viewer, it turned out to be a Creative WMD USB Driver KSAud.sys for my X-Fi Surround 5.1 external Sound Blaster device that generated about 24 handles per sec. I googled it, a few people had the same exact problem. Creative knew about it, too. So I updated a driver from their side, and solved the problem in about 10 minutes. Thanks a lot!!!

  37. Dominique says:

    Hi, I have a server that stops working with Event ID 2019. In poolmon I can see the Thre tag growing all the time:

    Tag   Type  Allocs  Frees  Diff    Bytes      Per Alloc  Mapped_Driver
    Thre  Nonp  378580  24692  353888  220826112  624        [nt!ps – Thread objects]

    My question: can I use windbg to find what process creates these threads? (Sorry for my bad English.) Thanks, Dominique

    [The growth of this tag means that additional thread objects are being created.  Normally, one of the processes will have a close to matching number of handles allocated to these thread objects, and as such you may find out who is either creating these threads and not terminating them or is leaking a handle to the threads.  Try “!process 0 0”, observe the handle counts listed, and see if a process has a particularly large one (i.e. over several thousand…if that is how many thread objects you see leaking).  Then you can use its EPROCESS address in a “!handle 0 F <EPROCESS> Thread” to confirm they are indeed handles to threads.  If the threads are in the process, the job is easier.  You can simply “!process <EPROCESS> 7” to figure this out and observe the purpose of the threads, or !thread against one of the returned thread handles.  There is a case where there would be no handles to the threads, only a positive pointer count.  This is a less common and potentially more challenging debug; here you may want to do a !search to determine who might have a stale pointer to the thread, if any, and !pool to determine the likely owner of the allocated memory with the reference and thus perhaps the culprit.  There is obtrace (discussed in the debugging help file), but for a frequently allocated tag such as Thre or File this will only cause the machine to run out of NPP faster in trying to trace the allocations, so I wouldn’t recommend it.  Also, if you wanted to get call stacks for handles consumed and outstanding over time in a given process, and you can repro this issue, look at our past post about !htrace. -Tate]

  38. Joseph says:

    Came across this after experiencing event id: 2020's, your advice on handles still works like a charm. Found the real culprit, restarted the service and all our problems went away.

    Thanks for the easy-to-follow steps, too.

    Regards,

    Joseph

  39. Rich C. says:

    Tate,

    Thanks for all your good information. I ran poolmon.exe and sorted by paged; as you said, if MmSt is at the top there may be a problem.

    Is there a way to get poolmon to send data to a text file so I can ask you to look at it and make recommendations?

    [Rich,

    Unfortunately we are not able to provide one to one support through this blog site.   If you need a Microsoft engineer to help troubleshoot a problem then we recommend you open a support incident.  You can find more information on opening such an incident at http://support.microsoft.com/select/Default.aspx?target=assistance.

    You may find some of our other articles useful for diagnosing MmSt usage, such as http://blogs.msdn.com/b/ntdebugging/archive/2008/05/08/tracking-down-mmst-paged-pool-usage.aspx.

    -Craig]

  40. lajensen says:

    Thanks for this – perfect! Copied,saved, and being put to good use.

  41. sorry to bump this topic, i know it has been ages .. i have the same kind of issue on a virtual server any ideas how do we proceed.

    [The troubleshooting steps for this issue should be the same regardless of the platform being used.  In addition to this article our recent series on pool usage troubleshooting may be helpful, http://blogs.msdn.com/b/ntdebugging/archive/tags/pool+leak+series. In that series all examples were collected on a virtualized system using Hyper-V.]