Manually debugging native memory leaks

In my previous posts, I explained how you can use the DebugDiag tool to capture high memory dumps with leak tracking enabled, and how to use its built-in memory analysis scripts to get a report of memory usage. In this post, I discuss how you can do the same thing manually using Debugging Tools for Windows (WinDbg). Again, I have tried to keep the approach generic while walking through an example; it won't apply to each and every situation.

So I have a memory dump which is about 500 MB in size and was captured when web applications started throwing out-of-memory errors. The first thing to find out is where most of the memory is. I discussed this a bit in one of my earlier blog posts.

    1: 0:000> !address -summary
    2:  
    3: -------------------- Usage SUMMARY --------------------------
    4:     TotSize (      KB)   Pct(Tots) Pct(Busy)   Usage
    5:     1806000 (   24600) : 01.17%    01.19%    : RegionUsageIsVAD
    6:     14f3000 (   21452) : 01.02%    00.00%    : RegionUsageFree
    7:     23e9000 (   36772) : 01.75%    01.77%    : RegionUsageImage
    8:     2200000 (   34816) : 01.66%    01.68%    : RegionUsageStack
    9:       88000 (     544) : 00.03%    00.03%    : RegionUsageTeb
   10:    78c82000 ( 1978888) : 94.36%    95.34%    : RegionUsageHeap
   11:           0 (       0) : 00.00%    00.00%    : RegionUsagePageHeap
   12:        1000 (       4) : 00.00%    00.00%    : RegionUsagePeb
   13:        1000 (       4) : 00.00%    00.00%    : RegionUsageProcessParametrs
   14:        2000 (       8) : 00.00%    00.00%    : RegionUsageEnvironmentBlock
   15:        Tot: 7fff0000 (2097088 KB) Busy: 7eafd000 (2075636 KB)
   16:  
   17: Largest free region: Base 57818000 - Size 00068000 (416 KB)

From this output we can see that 94.36% of the entire virtual address space falls under RegionUsageHeap, i.e. heap memory. We can also see its size: 1,978,888 KB, or roughly 1.88 GB! Remember I indicated a few moments back that our dump file itself is only about 500 MB in size. So this figure most likely represents reserved address space, whereas the dump file contains only the committed bytes plus some other process information. We can also see that the largest contiguous free region is just 416 KB, which explains why this process ran into out-of-memory errors: there is simply no large contiguous free block left to satisfy allocation requests.
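To make the reserved-versus-committed distinction concrete, here is a minimal C++ sketch (my illustration, not from the original dump or application) using the VirtualAlloc family. It reserves a large range of address space but commits only a small part of it; only the committed part is backed by storage and would show up as data in a dump.

    // reserve_vs_commit.cpp - illustrative sketch only.
    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        // Reserve 64 MB of address space. This consumes virtual address range
        // (which is what !address -summary accounts for) but no storage yet.
        SIZE_T reserveSize = 64 * 1024 * 1024;
        char* base = (char*)VirtualAlloc(NULL, reserveSize, MEM_RESERVE, PAGE_NOACCESS);
        if (base == NULL)
            return 1;

        // Commit only the first 1 MB. Committed pages are backed by RAM/pagefile
        // and are what actually ends up in the dump file.
        SIZE_T commitSize = 1 * 1024 * 1024;
        if (VirtualAlloc(base, commitSize, MEM_COMMIT, PAGE_READWRITE) == NULL)
        {
            VirtualFree(base, 0, MEM_RELEASE);
            return 1;
        }
        base[0] = 1;  // only committed pages can be touched

        printf("reserved %Iu bytes at %p, committed %Iu bytes\n",
               reserveSize, base, commitSize);

        VirtualFree(base, 0, MEM_RELEASE);  // release the whole reservation
        return 0;
    }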

A process has at least one heap: the default process heap, which the operating system creates for you when the process starts. This heap is used for allocating memory if no other heaps are created and used. Components loaded into the process can create their own heaps as well; the C Runtime heap is one example. Many of you will remember it as MSVCRT.dll, our C Runtime library.
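As a small illustration of the "one default heap plus private heaps" model, here is a C++ sketch (not from the original application) using the Win32 heap APIs. The private heap created here is exactly the kind of heap that shows up as an extra entry in the !heap output further below.

    // heaps.cpp - illustrative sketch only.
    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        // Every process gets a default heap from the operating system.
        HANDLE defaultHeap = GetProcessHeap();
        void* a = HeapAlloc(defaultHeap, HEAP_ZERO_MEMORY, 256);

        // A component (the C Runtime, for example) can create its own heap;
        // these are the additional heaps listed by the !heap command.
        HANDLE privateHeap = HeapCreate(0, 0x10000 /* initial size */, 0 /* growable */);
        void* b = HeapAlloc(privateHeap, HEAP_ZERO_MEMORY, 1024);

        printf("default heap %p, private heap %p, blocks %p %p\n",
               defaultHeap, privateHeap, a, b);

        HeapFree(defaultHeap, 0, a);
        HeapFree(privateHeap, 0, b);
        HeapDestroy(privateHeap);  // never destroy the default process heap
        return 0;
    }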

OK, so how many heaps are there, and which heap has the most allocations? The trick I usually use is to look at all the heaps and check how many segments each one has. I think the maximum number of segments a heap can have is 64. Segments are contiguous blocks of memory that hold smaller memory ranges of various sizes; these ranges are handed out to applications when they request memory. If a segment does not have enough memory to satisfy an allocation request, a new segment is created. The more segments a heap has, the more likely it is to be our problem heap. Recommended reading. To view the segments, you can use the built-in WinDbg extension command !heap.

From this example: !heap 0

 115:   02990000 <------------- Heap Handle
    Segment at 02990000 to 029d0000 (00030000 bytes committed)
    Segment at 0bc10000 to 0bd10000 (00037000 bytes committed)
    Segment at 0e350000 to 0e550000 (00007000 bytes committed)
    Segment at 15fe0000 to 163e0000 (00002000 bytes committed)
    Segment at 59530000 to 59d30000 (00001000 bytes committed)
    .
    .
    .
    Segment at 5e980000 to 5e997000 (00001000 bytes committed)
    Segment at 60040000 to 60057000 (00001000 bytes committed)
    Segment at 611e0000 to 611f7000 (00001000 bytes committed)


117:   02a10000 <------------- Heap Handle
    Segment at 02a10000 to 02a50000 (00040000 bytes committed)
    Segment at 0fc90000 to 0fd90000 (000b7000 bytes committed)
    Segment at 17640000 to 17840000 (0000e000 bytes committed)
    Segment at 21ba0000 to 21fa0000 (00001000 bytes committed)
    Segment at 58530000 to 58d30000 (00001000 bytes committed)
    Segment at 5e9c0000 to 5f9c0000 (00001000 bytes committed)
    .
    .
    .
    Segment at 7fe70000 to 7ff23000 (00001000 bytes committed)
    Segment at 23de0000 to 23e3a000 (00001000 bytes committed)
    Segment at 52770000 to 527ca000 (00001000 bytes committed)
    Segment at 52900000 to 5295b000 (00001000 bytes committed)
    Segment at 584c0000 to 5851b000 (00001000 bytes committed)
    Segment at 5a270000 to 5a2cb000 (00001000 bytes committed)

I have truncated the above output for brevity, but essentially there were many segments. An easier way to see how many segments a heap has is to use the !heap command again with the -s switch (for statistics) followed by the heap handle. Thus: !heap -s 02a10000

Take a look at lines #12 and #13 in the following output.

    1: 0:000> !heap -s 02a10000
    2: Walking the heap 02a10000 .......................................................................
    3:  0: Heap 02a10000
    4:    Flags          00001003 - HEAP_NO_SERIALIZE HEAP_GROWABLE 
    5:    Reserved memory in segments              184708 (k)
    6:    Commited memory in segments              18014398506966656 (k)
    7:    Virtual bytes (correction for large UCR) 1252 (k)
    8:    Free space                               254 (k) (45 blocks)
    9:    External fragmentation          0% (45 free blocks)
   10:    Virtual address fragmentation   201004% (77 uncommited ranges)
   11:    Virtual blocks  0 - total 0 KBytes
   12:    Lock contention 2989
   13:    Segments        64
   14:    896 hash table for the free list
   15:        Commits 0
   16:        Decommitts 0
   17:  
   18:                     Default heap   Front heap       Unused bytes
   19:    Range (bytes)     Busy  Free    Busy   Free     Total  Average 
   20: ------------------------------------------------------------------ 
   21:      0 -   1024       64    142      0      0          0      0
   22:   1024 -   2048      279     24      0      0       2280      8
   23:   2048 -   3072       21      3      0      0        176      8
   24:   3072 -   4096        2      3      0      0         16      8
   25:   4096 -   5120       69      6      0      0        560      8
   26:   5120 -   6144        6      0      0      0         48      8
   27:   6144 -   7168       35      3      0      0        280      8
   28:   7168 -   8192        0      2      0      0          0      0
   29:   8192 -   9216        0      1      0      0          0      0
   30:   9216 -  10240        2      1      0      0         16      8
   31:  12288 -  13312        2      0      0      0         16      8
   32:  13312 -  14336        0      1      0      0          0      0
   33:  19456 -  20480        2      0      0      0         16      8
   34:  24576 -  25600        2      0      0      0         16      8
   35:  36864 -  37888        0      1      0      0          0      0
   36: ------------------------------------------------------------------ 
   37:   Total              484    187      0      0       3424      7

From the above output, you can also see the memory ranges and their utilization. We can also obtain the worst offender byte size and the worst offender count by using the -stat parameter of the !heap command. Here's the output.

    1: 0:000> !heap -stat -h 02a10000
    2:  heap @ 02a10000
    3: group-by: TOTSIZE max-display: 20
    4:     size     #blocks     total     ( %) (percent of total busy bytes)
    5:     1008 45 - 45228  (26.78) <----------- Worst offender Bytes (WOB)
    6:     19f8 21 - 358f8  (20.75)
    7:     418 c6 - 32a90  (19.62)
    8:     5e8 21 - c2e8  (4.72)
    9:     6008 2 - c010  (4.65)
   10:     5a0 21 - b9a0  (4.49)
   11:     4c08 2 - 9810  (3.68)
   12:     1440 6 - 7980  (2.94)
   13:     808 d - 6868  (2.53)
   14:     3008 2 - 6010  (2.33)
   15:     2708 2 - 4e10  (1.89)
   16:     1808 2 - 3010  (1.16)
   17:     a18 4 - 2860  (0.98)
   18:     6c0 5 - 21c0  (0.82)
   19:     408 6 - 1830  (0.59)
   20:     c08 2 - 1810  (0.58)
   21:     ac0 2 - 1580  (0.52)
   22:     a90 2 - 1520  (0.51)
   23:     778 2 - ef0  (0.36)
   24:     450 1 - 450  (0.10)

All values in the above output are in hex, except the percent column. So from the above output (line #5) we can say that the worst offender bytes [WOB: the allocation size that is using the most bytes in the heap] is 0x1008 bytes [4,104 bytes, or roughly 4 KB]; there are 0x45 [69] such blocks, adding up to a total of 0x45228 bytes [283,176 bytes, or roughly 276 KB].

Similarly, you can group by block count if you want to find the worst offender count [WOC: the allocation size that has the most duplicate blocks in the heap] and how many of those duplicates there are, by using the -grp switch.

    1: 0:000> !heap -stat -h 02a10000 -grp B
    2:  heap @ 02a10000
    3: group-by: BLOCKCOUNT max-display: 20
    4:     size     #blocks    total     ( %) (percent of totalblocks)
    5:     418 c6 - 32a90  (47.26)  <----------- Worst offender count (WOC)
    6:     1008 45 - 45228  (16.47)
    7:     19f8 21 - 358f8  (7.88)
    8:     5e8 21 - c2e8  (7.88)
    9:     5a0 21 - b9a0  (7.88)
   10:     808 d - 6868  (3.10)
   11:     1440 6 - 7980  (1.43)
   12:     408 6 - 1830  (1.43)
   13:     6c0 5 - 21c0  (1.19)
   14:     a18 4 - 2860  (0.95)
   15:     6008 2 - c010  (0.48)
   16:     4c08 2 - 9810  (0.48)
   17:     3008 2 - 6010  (0.48)
   18:     2708 2 - 4e10  (0.48)
   19:     1808 2 - 3010  (0.48)
   20:     c08 2 - 1810  (0.48)
   21:     ac0 2 - 1580  (0.48)
   22:     a90 2 - 1520  (0.48)
   23:     778 2 - ef0  (0.48)
   24:     450 1 - 450  (0.24)

Thus, in this case, the most duplicates are of allocation size 0x418 bytes [1,048 bytes] and there are 0xc6 [198] of them, for a total of 0x32a90 bytes [207,504 bytes]. You could also dump the allocations in the 1 KB to 4 KB range and then dump out the contents of any block using the address value in the UserPtr column. To do that, execute: dc <address value in UserPtr column>

Warning: This command can generate a huge amount of output, as it dumps allocations in the specified range from all heaps.

    1: !heap -flt r 418 1008
    2:     _HEAP @ 2a10000
    3:       HEAP_ENTRY Size Prev Flags    UserPtr UserSize - state
    4:         02a14a58 0084 00f0  [01]   02a14a60    00418 - (busy)
    5:         02a15000 0084 0084  [01]   02a15008    00418 - (busy)
    6:         02a20330 0144 0102  [01]   02a20338    00a18 - (busy)
    7:         02a25a40 0102 0144  [01]   02a25a48    00808 - (busy)
    8:         02a265f0 00be 0102  [01]   02a265f8    005e8 - (busy)
    9:         02a2d1b0 00b5 00be  [01]   02a2d1b8    005a0 - (busy)
   10:         5e9c0040 0084 00b6  [00]   5e9c0048    00418 - (free)
   11:         7c1d0040 0084 0084  [00]   7c1d0048    00418 - (free)
   12:     .
   13:     .
   14:  
   15: dc 02a14a60

So our story so far…

  • There are a couple of heaps that have lots of segments.
  • A lot of memory is reserved (in the MB range), while the committed blocks are few and small (in the KB range).
  • The allocations appear to be in the range of 1 KB to 4 KB.

Next questions: what are these heaps, and who is allocating from them?

If you want to see the stack back trace for an allocation, you can dump out the page heap information for a given address [UserPtr]; the stack back trace is only displayed when it is available. If I remember correctly, it is available when page heap is enabled for the process (for example, via the gflags utility).

    1: 0:000> !heap -p -a 7c1d0048
    2:     address 7c1d0048 found in
    3:     _HEAP @ 2a10000
    4:       HEAP_ENTRY Size Prev Flags    UserPtr UserSize - state
    5:         7c1d0040 0084 0000  [00]   7c1d0048    00418 - (free)
    6:         Trace: 0025
    7:         7c96d6dc ntdll!RtlDebugAllocateHeap+0x000000e1
    8:         7c949d18 ntdll!RtlAllocateHeapSlowly+0x00000044
    9:         7c91b298 ntdll!RtlAllocateHeap+0x00000e64
   10:         102c103e MSVCR90D!_heap_alloc_base+0x0000005e
   11:         102cfd76 MSVCR90D!_heap_alloc_dbg_impl+0x000001f6
   12:         102cfb2f MSVCR90D!_nh_malloc_dbg_impl+0x0000001f
   13:         102cfadc MSVCR90D!_nh_malloc_dbg+0x0000002c
   14:         102db25b MSVCR90D!malloc+0x0000001b
   15:         102bd691 MSVCR90D!operator new+0x00000011
   16:         102bd71f MSVCR90D!operator new[]+0x0000000f
   17:         4113d8 MyModule1!AllocateMemory+0x00000028
   18:         41145c MyModule1!wmain+0x0000002c
   19:         411a08 MyModule1!__tmainCRTStartup+0x000001a8
   20:         41184f MyModule1!wmainCRTStartup+0x0000000f
   21:         7c816fd7 kernel32!BaseProcessStart+0x00000023

The above output is just an example, but it gives you an idea of how this technique can help track down the source of leaks in your application.
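For what it's worth, the frame names in that trace (MyModule1!AllocateMemory called from wmain, going through operator new[]) suggest code along the following lines. This is a hypothetical reconstruction for illustration only, not the actual source of MyModule1.

    // MyModule1.cpp - hypothetical reconstruction, not the real module.
    #include <windows.h>

    void AllocateMemory()
    {
        // Roughly 1 KB per call; with CRT/heap overhead this is in the same
        // ballpark as the 0x418-byte busy blocks seen earlier in !heap -flt.
        char* buffer = new char[1024];
        buffer[0] = 0;
        // Missing: delete[] buffer;  -- every call leaks one block.
    }

    int wmain(int argc, wchar_t** argv)
    {
        for (int i = 0; i < 100000; ++i)
        {
            AllocateMemory();  // the heap keeps growing, one block at a time
            Sleep(1);
        }
        return 0;
    }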

When memory at a given address is de-allocated, the heap manager checks how many contiguous bytes are free around that address. After that check is complete, the heap manager can do one of two things (see the sketch after this list):

  • Keep the contiguous memory block committed.
  • De-commit the contiguous memory block and mark it as reserved only.
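
Here is a minimal C++ sketch (my illustration, not from the post's application) of what the second option looks like at the virtual memory level: the backing storage is given back, but the address range stays reserved. The heap manager performs the equivalent of this internally for large free blocks.

    // decommit.cpp - illustrative sketch only.
    #include <windows.h>

    int main()
    {
        SIZE_T size = 1024 * 1024;

        // Reserve and commit 1 MB in one call.
        char* p = (char*)VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
        if (p == NULL)
            return 1;
        p[0] = 42;  // usable while committed

        // Option 2 above: give the backing storage back but keep the address
        // range reserved. Touching p now would raise an access violation.
        VirtualFree(p, size, MEM_DECOMMIT);

        // The range can be re-committed later without losing the reservation.
        VirtualAlloc(p, size, MEM_COMMIT, PAGE_READWRITE);
        p[0] = 42;

        VirtualFree(p, 0, MEM_RELEASE);  // finally release the reservation itself
        return 0;
    }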

There is a registry value that controls the de-commit behavior:

  • Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager
  • Value name: HeapDeCommitFreeBlockThreshold
  • Value type: REG_DWORD

For the sake of completing this blog post: adjusting this registry value was the resolution in my example. The resolution could be something else in your case, depending on the circumstances under which the problem occurs. Once a software developer has enough information about the pattern and source of the memory consumption, they will be able to recommend changes or make a suitable fix to resolve the issue.
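In practice you would normally set this value with regedit, but for completeness here is a hedged C++ sketch of doing it programmatically. The key path and value name are the ones listed above; the threshold number used here is only an example, writing under HKLM requires administrative rights, and a restart of the affected process or the system may be needed before the setting takes effect.

    // set_threshold.cpp - illustrative sketch only; link with Advapi32.lib.
    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        HKEY hKey = NULL;
        LONG rc = RegOpenKeyExW(HKEY_LOCAL_MACHINE,
                                L"SYSTEM\\CurrentControlSet\\Control\\Session Manager",
                                0, KEY_SET_VALUE, &hKey);
        if (rc != ERROR_SUCCESS)
        {
            printf("RegOpenKeyExW failed: %ld\n", rc);
            return 1;
        }

        // Example threshold only -- the right number depends on your workload.
        DWORD threshold = 0x00040000;
        rc = RegSetValueExW(hKey, L"HeapDeCommitFreeBlockThreshold", 0, REG_DWORD,
                            (const BYTE*)&threshold, sizeof(threshold));
        if (rc != ERROR_SUCCESS)
            printf("RegSetValueExW failed: %ld\n", rc);

        RegCloseKey(hKey);
        return 0;
    }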

In this blog post, I made an attempt to show how you can track down native memory leaks manually, as opposed to using the DebugDiag scripts discussed in my earlier blog post. Again, this doesn't apply to every situation, as there are umpteen possible causes of a leak. Hopefully this post is a good starting point and a useful future reference.