Pool Fragmentation

Hello! My name is Stephen, an escalation engineer on the Microsoft Global Escalation Services Team. Today I'm going to share my experience of a pool fragmentation issue I came across recently. Let’s jump right in with the dump file.

This is the output of !vm:

*** Virtual Memory Usage ***

      Physical Memory:      917368 (   3669472 Kb)
      Page File: \??\C:\pagefile.sys
        Current:   4190208 Kb  Free Space:   4090220 Kb
        Minimum:   4190208 Kb  Maximum:      4190208 Kb
      Available Pages:      649161 (   2596644 Kb)
      ResAvail Pages:       860271 (   3441084 Kb)
      Locked IO Pages:         210 (       840 Kb)
      Free System PTEs:      14629 (     58516 Kb)
      Free NP PTEs:           4230 (     16920 Kb)
      Free Special NP:           0 (         0 Kb)
      Modified Pages:          791 (      3164 Kb)
      Modified PF Pages:       785 (      3140 Kb)
      NonPagedPool Usage:    25463 (    101852 Kb)
      NonPagedPool Max:      32647 (    130588 Kb)
      PagedPool 0 Usage:      8717 (     34868 Kb)
      PagedPool 1 Usage:      6113 (     24452 Kb)
      PagedPool 2 Usage:      6100 (     24400 Kb)
      PagedPool 3 Usage:      6033 (     24132 Kb)
      PagedPool 4 Usage:      6116 (     24464 Kb)
      PagedPool Usage:       33079 (    132316 Kb)
      PagedPool Maximum:     60416 (    241664 Kb)
      Session Commit:         1870 (      7480 Kb)
      Shared Commit:          5401 (     21604 Kb)
      Special Pool:              0 (         0 Kb)
      Shared Process:         8957 (     35828 Kb)
      PagedPool Commit:      33120 (    132480 Kb)
      Driver Commit:          1939 (      7756 Kb)
      Committed pages:      227031 (    908124 Kb)
      Commit limit:        1929623 (   7718492 Kb)

Using !poolused /t5 2, I dumped out the five highest users of nonpaged pool.

   Sorting by  NonPaged Pool Consumed
  Pool Used:
            NonPaged            Paged
 Tag    Allocs     Used    Allocs     Used
 MmCm     3187 17452976         0        0      Calls made to MmAllocateContiguousMemory , Binary: nt!mm
 NDpp     1125  4519648         0        0      packet pool , Binary: ndis.sys
 File    24911  3992376         0        0      File objects
 abcd        8  3305504         0        0      UNKNOWN pooltag 'abcd', please update pooltag.txt
 LSwi        1  2576384         0        0      initial work context
 TOTAL  239570 65912104    200276 66610504

The big difference between the nonpaged pool totals reported by !vm (about 101 MB) and !poolused (about 65 MB) tells us there is a pool fragmentation issue!
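To make that gap concrete, here is a minimal sketch of the arithmetic in Python. The two input figures are copied from the !vm and !poolused output above; the 4096-byte page size is assumed from the x86 system in this dump, and the helper name `unaccounted_bytes` is just an illustration. Not every unaccounted byte is necessarily lost to fragmentation, so treat the result as an upper bound.

```python
PAGE_SIZE = 4096  # bytes per page (assumed: x86, as in this dump)

# Figures copied from the debugger output above.
nonpaged_pool_pages = 25463       # "NonPagedPool Usage" pages from !vm
nonpaged_bytes_by_tag = 65912104  # NonPaged "Used" TOTAL from !poolused


def unaccounted_bytes(pages, tagged_bytes, page_size=PAGE_SIZE):
    """Bytes held in pool pages that no tagged allocation accounts for."""
    return pages * page_size - tagged_bytes


gap = unaccounted_bytes(nonpaged_pool_pages, nonpaged_bytes_by_tag)
share = gap / (nonpaged_pool_pages * PAGE_SIZE)
print(f"{gap} bytes unaccounted for ({share:.1%} of nonpaged pool pages)")
```

Roughly a third of the nonpaged pool pages are not backed by any live allocation, which is what prompted a closer look at individual pages.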

After some research, I found a lot of pool pages with the following allocation pattern:

3: kd> !pool fa808000
Pool page fa808000 region is Nonpaged pool
*fa808000 size:  a20 previous size:    0  (Free)      *MFE0
 fa808a20 size:   18 previous size:  a20  (Allocated)  ReEv
 fa808a38 size:  5c8 previous size:   18  (Free)       NtFs

3: kd> !pool fa550000
Pool page fa550000 region is Nonpaged pool
*fa550000 size:  860 previous size:    0  (Free)      *Io 
 fa550860 size:   18 previous size:  860  (Allocated)  MFE0
 fa550878 size:  788 previous size:   18  (Free)       Irp

3: kd> !pool f8feb000
Pool page f8feb000 region is Nonpaged pool
*f8feb000 size:  648 previous size:    0  (Free)      *Ntfr
 f8feb648 size:   18 previous size:  648  (Allocated)  ReEv
 f8feb660 size:  9a0 previous size:   18  (Free)       MFE0

The page at fa808000 has only one pool chunk in use, and its size is 0x18 = 24 bytes. The top and bottom portions of the page are free pool chunks that could be reallocated for any use. For this page, only 24 out of 4096 bytes are in use.

It is the same story on pages at fa550000, f8feb000, etc. So, the question is, how could this have happened and how do we avoid this in the future?
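The per-page numbers can be checked with a short script. This is only a sketch: it parses the !pool text for page fa808000 exactly as pasted above, and the regular expression and the `summarize` helper are my own assumptions about that layout, not part of the debugger.

```python
import re

# Entries copied from the "!pool fa808000" output above.
POOL_PAGE = """\
*fa808000 size:  a20 previous size:    0  (Free)      *MFE0
 fa808a20 size:   18 previous size:  a20  (Allocated)  ReEv
 fa808a38 size:  5c8 previous size:   18  (Free)       NtFs
"""

# Assumed line layout: address, hex size, hex previous size, state.
ENTRY = re.compile(
    r"[\s*]*[0-9a-f]+\s+size:\s*([0-9a-f]+)\s+previous size:\s*[0-9a-f]+"
    r"\s+\((Free|Allocated)\)"
)


def summarize(pool_text):
    """Return (bytes in use, largest free block, total bytes) for one page."""
    in_use = largest_free = total = 0
    for line in pool_text.splitlines():
        match = ENTRY.match(line)
        if match is None:
            continue
        size = int(match.group(1), 16)
        total += size
        if match.group(2) == "Allocated":
            in_use += size
        else:
            largest_free = max(largest_free, size)
    return in_use, largest_free, total


in_use, largest_free, total = summarize(POOL_PAGE)
print(f"in use: {in_use} of {total} bytes, largest free block: {largest_free:#x}")
```

The three blocks add up to exactly one 4096-byte page, and the largest free block (0xa20) is well below the 0xF18 block size discussed next.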

From the dump, I also found many MmCm pool allocations:

fe592000 size:  f18 previous size:    0  (Allocated) MmCm
fe593000 size:  f18 previous size:    0  (Allocated) MmCm
fe597000 size:  f18 previous size:    0  (Allocated) MmCm
fe5ac000 size:  f18 previous size:    0  (Allocated) MmCm
fe5ad000 size:  f18 previous size:    0  (Allocated) MmCm
fe5ae000 size:  f18 previous size:    0  (Allocated) MmCm
fe5af000 size:  f18 previous size:    0  (Allocated) MmCm
fe5b0000 size:  f18 previous size:    0  (Allocated) MmCm

This is most likely how the fragmentation happened:

1)  A driver requests a pool block of size 0xF18. Notice that each of the three pages displayed above has enough free space in total, but the free space inside each page is split in two, one block at the top and one at the bottom. Neither the top nor the bottom block is big enough for a pool request of size 0xF18.

2)  So the OS creates a new pool page, gives the top portion to the driver, and marks the bottom portion as free pool.

3)  Now there is a request for a small pool allocation. The OS might take the new pool page’s bottom portion to satisfy the request.

4)  Now the driver frees the MmCm pool block. The bottom portion is still in use, so the whole page cannot be freed. As time goes on, it is very possible that some part of the freed portion will be reallocated for some other use.

5)  Now there is another request for a pool block of size 0xF18. The previously freed block may no longer be usable because there might be other pool allocations inside it. So the OS might create another new page.

6)  If this sequence happens repeatedly, it has the potential to contribute to pool fragmentation like that evident in this memory dump.
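The steps above can be sketched as a toy simulation. This is not how the Windows pool allocator is implemented; it is a simplified model with assumed rules: blocks are carved first-fit within a single page, and the two small allocations are deliberately placed in the page the large block just used, mirroring the "might" in steps 3 and 4. All names (`carve`, `release`, `simulate`) are hypothetical.

```python
PAGE, BIG, SMALL = 0x1000, 0xF18, 0x18


def new_page():
    """A page starts as one free block; blocks are [size, in_use]."""
    return [[PAGE, False]]


def carve(page, size):
    """First-fit inside one page; return the new block, or None if nothing fits."""
    for i, (blk_size, used) in enumerate(page):
        if not used and blk_size >= size:
            block = [size, True]
            rest = [[blk_size - size, False]] if blk_size > size else []
            page[i:i + 1] = [block] + rest
            return block
    return None


def release(page, block):
    """Mark a block free and merge neighbouring free blocks."""
    block[1] = False
    merged = []
    for blk in page:
        if merged and not blk[1] and not merged[-1][1]:
            merged[-1][0] += blk[0]
        else:
            merged.append(blk)
    page[:] = merged


def simulate(cycles):
    pages = []
    for _ in range(cycles):
        # Steps 1, 2 and 5: reuse a page only if one contiguous free block fits.
        for page in pages:
            big = carve(page, BIG)
            if big is not None:
                break
        else:
            page = new_page()
            pages.append(page)
            big = carve(page, BIG)
        carve(page, SMALL)   # step 3: small allocation takes the bottom portion
        release(page, big)   # step 4: the driver frees the large block
        carve(page, SMALL)   # step 4, later: another small allocation lands in the freed space
    return pages


pages = simulate(5)
used = [sum(size for size, in_use in page if in_use) for page in pages]
print(len(pages), "pages,", used, "bytes in use per page")
```

Under these assumptions every cycle strands a new page with only two 0x18-byte allocations in it, because the largest free block left behind (0xF00) is just short of 0xF18.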

Ways to avoid this issue: instead of requesting an allocation of size 0xF18, the driver should request an entire page. There will be a small wasted portion in the page, but that is the trade-off to avoid this type of fragmentation issue. By the way, MSDN suggests that drivers keep contiguous memory (the MmCm allocations) for the long term rather than repeatedly allocating and freeing it. In a live debug, you will see this driver continually allocating and freeing MmCm.
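To put a number on that trade-off, here is a small sketch of the rounding arithmetic. It treats 0xF18 as the request size, as the text above does, and assumes a 4096-byte page; `round_up_to_page` is an illustrative helper, not a Windows API.

```python
PAGE_SIZE = 0x1000  # assumed 4096-byte page


def round_up_to_page(nbytes, page_size=PAGE_SIZE):
    """Round a request up to the next multiple of the page size."""
    return (nbytes + page_size - 1) // page_size * page_size


request = 0xF18
padded = round_up_to_page(request)
wasted = padded - request
print(f"request {request:#x} -> {padded:#x}, wasting {wasted} bytes per allocation")
```

The cost is a couple of hundred bytes per allocation, in exchange for never leaving a small leftover block in the page that another allocation can pin.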

Comments (3)

  1. DebugMachine says:

    I usually use xpoolmap for this kind of situation.

  2. Minsung Kim says:

    In this sentence: “For this page, 24 out of 4096 bytes are in use.”

    I want to know the meaning of ‘4096’.

    ps. thanks for your kind article…^^!

    [ Great question. 4096 is the value defined for the page size, which you might see in some references as 0x1000 (the hex equivalent of 4096). -Ron ]

  3. Syed says:

    Awesome explanation. Can't pool fragmentation be avoided by defragmenting it? I know defragmentation is not possible with nonpaged pool; however, can some logic be implemented to defragment paged pool?
