Memory Management in Native Code

Memory management is a core task in native code. Careless use of dynamic memory can cause the following problems:

1. Heap fragmentation: it carries a performance penalty because it breaks data locality
2. Memory leaks: a program correctness problem and a serious defect for long-running software

Here I summarize some tips related to these two issues: Mem Optimization & Mem Correctness

Part I - Mem Optimization

General Principles
1. Keep the mem layout cohesive (i.e. improve data locality)
2. Avoid frequent alloc/free (i.e. batch mem ops: prefer a few bulk operations over many small ones)

How to implement them?
- Redesign your data structures so that they live in large contiguous blocks
- Put unrelated data structures in different regions
- Use a memory pool to manage mem
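As a rough illustration of the memory pool idea, here is a minimal sketch of a fixed-size block pool. The class name FixedPool and its interface are hypothetical (they do not come from any library), and the sketch assumes single-threaded use. It makes one bulk allocation up front and then hands out blocks from a free list, so related objects sit next to each other and the general-purpose heap is not touched per object.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size block pool: one bulk allocation up front,
// then O(1) allocate/release through a free list threaded through
// the unused blocks. Single-threaded sketch, not a production allocator.
class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(block_size_ * block_count),
          free_list_(nullptr),
          free_count_(block_count) {
        // Push blocks from last to first so the first allocation
        // returns the block at the lowest address.
        for (std::size_t i = block_count; i > 0; --i) {
            free_list_ = new (&storage_[(i - 1) * block_size_]) Node{free_list_};
        }
    }

    // Returns a block, or nullptr when the pool is exhausted.
    void* allocate() {
        if (free_list_ == nullptr) return nullptr;
        Node* n = free_list_;
        free_list_ = n->next;
        --free_count_;
        return n;
    }

    // Returns a block to the pool; p must come from allocate() on this pool.
    void release(void* p) {
        assert(p != nullptr);
        free_list_ = new (p) Node{free_list_};
        ++free_count_;
    }

    std::size_t free_count() const { return free_count_; }

private:
    struct Node { Node* next; };

    // Round the block size up so every block can hold a Node and
    // starts at an address suitable for any fundamental type.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + a - 1) / a * a;
    }

    std::size_t block_size_;
    std::vector<unsigned char> storage_;  // the single bulk allocation
    Node* free_list_;
    std::size_t free_count_;
};
```

A released block is reused by the next allocate() call, which is what keeps the hot data in one region.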

Part II - Mem Correctness

One of the challenges of native code development is avoiding memory leaks. It is easy to forget to release a memory block/object that was allocated explicitly, and hard to guarantee that every one of them gets released.

Types of Mem Leaks
1. Constant Leak: allocated mem is never released
2. Casual Leak: allocated mem is not released under some conditions
3. One-Time Leak: allocated mem is not released, but that line of code only gets executed once (for example, mem allocated in the ctor of a singleton object)
4. Implicit Leak: mem blocks are held too long (released too late in the application life cycle). This kind of leak happens even in GC-enabled languages such as Java/.Net, for example when unused objects are still reachable through the Root Set in GC

To deal with the Mem Leak problem, you have two choices:
- Avoid it
- Detect & Fix it

Sec. 1 - How to Avoid Mem Leak?

1. Adopt Resource Acquisition Is Initialization (RAII) Mechanism in C++

std::auto_ptr is a good choice if RAII is semantically right for your problem. If you want to ensure that your object/mem gets released whenever control leaves some scope (for example, with multiple exit paths or potential exceptions), RAII can solve your problem.

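But std::auto_ptr transfers ownership when it is copied, so it can't be put in STL containers safely. Note also that it was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement, and it can be returned from functions and stored in containers.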
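To make the RAII idea concrete, here is a minimal hand-written sketch. ScopedBuffer, live_count() and process() are hypothetical names used only for illustration. The buffer is acquired in the ctor and released in the dtor, so it is freed on every exit path, including the exception path, without a single explicit delete in the calling code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical RAII wrapper: acquires a buffer in the ctor and
// releases it in the dtor, whichever way the scope is left.
class ScopedBuffer {
public:
    explicit ScopedBuffer(std::size_t n) : data_(new char[n]) { ++live_count(); }
    ~ScopedBuffer() {
        delete[] data_;
        --live_count();
    }

    // Exactly one owner: copying is disabled.
    ScopedBuffer(const ScopedBuffer&) = delete;
    ScopedBuffer& operator=(const ScopedBuffer&) = delete;

    char* get() const { return data_; }

    // Number of buffers currently alive (exists only for the demonstration).
    static int& live_count() {
        static int n = 0;
        return n;
    }

private:
    char* data_;
};

// Three exit paths: early return, exception, normal return.
// The buffer is released on all of them.
int process(int mode) {
    ScopedBuffer buf(64);
    assert(buf.get() != nullptr);
    buf.get()[0] = 'x';
    if (mode == 0) return -1;                            // early return
    if (mode == 1) throw std::runtime_error("failure");  // exception
    return 0;                                            // normal return
}
```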

2. Stack Based Allocation

_alloca() allocates mem from the stack rather than the heap. The mem returned is released automatically when the calling function returns.

But there is a risk of stack overflow, since the stack is far smaller than the heap.
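A small sketch of stack based allocation with a size cap follows. The STACK_ALLOC macro, the kMaxStackBytes limit and the is_palindrome() function are assumptions made for this example: the macro maps to _alloca() on MSVC and to the non-standard alloca() elsewhere, and requests above the (arbitrarily chosen) limit fall back to the heap to stay clear of stack overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>
#ifdef _MSC_VER
#include <malloc.h>  // _alloca
#define STACK_ALLOC(n) _alloca(n)
#else
#include <alloca.h>  // alloca: non-standard, but widely available
#define STACK_ALLOC(n) alloca(n)
#endif

// Arbitrary cap for this sketch; larger requests go to the heap.
const std::size_t kMaxStackBytes = 1024;

// Builds a reversed copy of `text` in scratch memory and compares it
// with the original. Small inputs use the stack, large ones the heap.
bool is_palindrome(const char* text) {
    assert(text != nullptr);
    const std::size_t len = std::strlen(text);
    if (len == 0) return true;

    std::vector<char> heap_fallback;
    char* scratch;
    if (len <= kMaxStackBytes) {
        // Released automatically when is_palindrome() returns.
        scratch = static_cast<char*>(STACK_ALLOC(len));
    } else {
        heap_fallback.resize(len);
        scratch = heap_fallback.data();
    }

    for (std::size_t i = 0; i < len; ++i) {
        scratch[i] = text[len - 1 - i];
    }
    return std::memcmp(scratch, text, len) == 0;
}
```

Never return or store a pointer obtained this way: it is dangling as soon as the function returns.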

3. Reference Counting (aka shared/smart pointers)

Use some data structure to track how many owners reference the mem block or object. When the reference count drops to zero, the mem/object is released.

std::tr1::shared_ptr and boost::shared_ptr are both based on the Reference Counting and RAII concepts. They solve the problems of std::auto_ptr: they can be put in containers, passed as parameters, returned from functions etc.

But if your objects have cyclic references, this mechanism doesn't work. The fundamental problem is that the semantic of "useless" should be defined as "Can't Be Reached", not "No One References".
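The cyclic reference problem can be shown in a few lines. The sketch below uses std::shared_ptr (the standardized form of the tr1/boost pointers) and its companion std::weak_ptr, which the text above does not mention; the node types and function names are hypothetical. With owning links in both directions the counts never reach zero; making the back link non-owning is one common way out.

```cpp
#include <cassert>
#include <memory>

// Two nodes that own each other: a reference cycle.
struct StrongNode {
    std::shared_ptr<StrongNode> peer;  // owning link in both directions
    static int& alive() { static int n = 0; return n; }
    StrongNode() { ++alive(); }
    ~StrongNode() { --alive(); }
};

// Same shape, but the back link does not own its target.
struct WeakNode {
    std::shared_ptr<WeakNode> next;  // owning link
    std::weak_ptr<WeakNode> prev;    // non-owning back link
    static int& alive() { static int n = 0; return n; }
    WeakNode() { ++alive(); }
    ~WeakNode() { --alive(); }
};

// Returns how many StrongNode objects survive after all outside
// owners are gone.
int strong_cycle_survivors() {
    std::weak_ptr<StrongNode> observer;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->peer = b;
        b->peer = a;
        observer = a;
    }
    // Nobody outside the cycle owns a or b, yet both are still alive.
    const int survivors = StrongNode::alive();
    // Break the cycle by hand so that this sketch itself does not leak.
    if (auto a = observer.lock()) a->peer.reset();
    return survivors;
}

// Same scenario with a weak back link: everything is released.
int weak_cycle_survivors() {
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->next = b;
        b->prev = a;
    }
    return WeakNode::alive();
}
```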

Another drawback of smart pointer style reference counting is that it can't handle pointers that must be put into a union. In C++03, a union can't contain any member that has a user-defined or non-trivial default ctor/dtor/copy ctor etc. (C++11 relaxed this restriction, but the union then has to manage the member's lifetime by hand.)

4. Garbage Collection (yes, gc for C/C++)

Many GCs are built on the Mark & Sweep algorithm. The idea is that the GC keeps a list of pointers to all heap-allocated objects and a Root Set list of object pointers. When garbage collection is triggered, it traverses from the root set to find all reachable objects and marks them. The unmarked objects are garbage that can be deleted. GC for C/C++ is a huge topic; [7] is a very good reference doc.
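The mark and sweep steps described above can be sketched with a toy heap. This is only an illustration of the algorithm, not how the collector in [7] works; Object, ToyHeap and their members are hypothetical names, and the objects declare their outgoing references explicitly instead of being scanned conservatively.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy object: carries a mark bit and its outgoing references.
struct Object {
    bool marked = false;
    std::vector<Object*> refs;
};

// Toy heap: owns every object it creates and knows the root set.
class ToyHeap {
public:
    ToyHeap() = default;
    ToyHeap(const ToyHeap&) = delete;
    ToyHeap& operator=(const ToyHeap&) = delete;
    ~ToyHeap() {
        for (Object* o : objects_) delete o;
    }

    Object* create() {
        Object* o = new Object;
        objects_.push_back(o);
        return o;
    }

    void add_root(Object* o) {
        assert(o != nullptr);
        roots_.push_back(o);
    }

    // Runs one collection and returns the number of objects freed.
    std::size_t collect() {
        // Mark phase: traverse everything reachable from the root set.
        std::vector<Object*> work(roots_);
        while (!work.empty()) {
            Object* o = work.back();
            work.pop_back();
            if (o->marked) continue;
            o->marked = true;
            for (Object* r : o->refs) work.push_back(r);
        }
        // Sweep phase: delete unmarked objects, reset marks on the rest.
        std::size_t freed = 0;
        std::vector<Object*> survivors;
        for (Object* o : objects_) {
            if (o->marked) {
                o->marked = false;
                survivors.push_back(o);
            } else {
                delete o;
                ++freed;
            }
        }
        objects_.swap(survivors);
        return freed;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::vector<Object*> objects_;  // all heap-allocated objects
    std::vector<Object*> roots_;    // the root set
};
```

Because liveness is decided by reachability, two unreachable objects that reference each other are collected, which is exactly the case reference counting cannot handle.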

A general-purpose GC for C/C++ is difficult, but for your specific application requirements it may not be that challenging. In my own GC implementation experience, the most difficult part is defining your object ownership policy.

GC is great, but it still can't handle "semantic garbage objects". That is to say, if you hold references to objects that you will in fact never use again, GC won't collect the mem and other resources owned by these objects.

Essentially, Memory Management is all about Consistency of Ownership.
- Each mem block should have an owner
- Each mem block should have only 1 owner
- Only the owner of the mem block is responsible for its life cycle

So, the most important design principle of C/C++ memory management is to think carefully about the ownership of each object/mem block: who is responsible for releasing it, and when.
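One way to write the single-owner rule down in code is std::unique_ptr (C++11, not covered in the text above). In the sketch below, Texture, make_texture() and TextureCache are hypothetical names. Ownership is visible in the signatures: the factory hands ownership to the caller, store() takes it over, and find() only lends a non-owning pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Texture {
    int id;
    explicit Texture(int i) : id(i) {}
};

// Factory: the caller becomes the one and only owner.
std::unique_ptr<Texture> make_texture(int id) {
    return std::unique_ptr<Texture>(new Texture(id));
}

// The cache owns every texture stored in it and releases them
// in its (implicit) dtor.
class TextureCache {
public:
    // Taking the unique_ptr by value means: ownership moves in here.
    void store(std::unique_ptr<Texture> t) {
        assert(t != nullptr);
        items_.push_back(std::move(t));
    }

    // Borrowing access: the caller must not delete the result.
    const Texture* find(int id) const {
        for (const auto& t : items_) {
            if (t->id == id) return t.get();
        }
        return nullptr;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::unique_ptr<Texture>> items_;
};
```

After `cache.store(std::move(t))` the local pointer `t` is null, so there is never a moment with two owners.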

Sec. 2 - How to Detect Mem Leak

1. Use Debug Version C RunTime library

1.1 Use _CrtDumpMemoryLeaks()

Step 1. include the following directives in each cpp source file, in this order
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

Step 2. call _CrtDumpMemoryLeaks() at the line where you want to check memory leaks.

This method has a drawback: mem objects that are released after the _CrtDumpMemoryLeaks() call are reported as leaked. (This happens when mem is released in a global object's dtor.) It's a false positive.

1.2 Use _CrtSetDbgFlag()

Add the following code at the entry point of your application

int nFlag = _CrtSetDbgFlag( _CRTDBG_REPORT_FLAG );
nFlag |= _CRTDBG_LEAK_CHECK_DF;
_CrtSetDbgFlag( nFlag );

This method doesn't have the drawback of 1.1, but you have no control over when the leak check is performed: it runs automatically at program exit.

1.3 Use _CrtMemState

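_CrtMemState cms1, cms2, cms3;
_CrtMemCheckpoint(&cms1);   /* snapshot of the heap before */

/* code to check */

_CrtMemCheckpoint(&cms2);   /* snapshot of the heap after */
/* cms3 receives the difference; the call returns TRUE if the states differ */
if (_CrtMemDifference(&cms3, &cms1, &cms2))
{
    _CrtMemDumpStatistics(&cms3);
}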

This code dumps heap statistics about the changes that happened in the "code to check".
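The CRT routines above exist only in the MSVC debug runtime. To show the checkpoint/difference idea in isolation, here is a portable sketch built on hypothetical wrappers (tracked_alloc, tracked_free, checkpoint, difference); none of these names belong to the CRT, and the sketch only sees allocations that go through the wrappers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Live allocation totals at one point in time.
struct HeapSnapshot {
    long blocks;
    long bytes;
};

// Header stored in front of each block so the size is known on free.
// The max_align_t member keeps the user block suitably aligned.
union Header {
    std::size_t size;
    std::max_align_t align;
};

static long g_live_blocks = 0;
static long g_live_bytes = 0;

void* tracked_alloc(std::size_t n) {
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + n));
    if (h == nullptr) return nullptr;
    h->size = n;
    ++g_live_blocks;
    g_live_bytes += static_cast<long>(n);
    return h + 1;  // user memory starts right after the header
}

void tracked_free(void* p) {
    if (p == nullptr) return;
    Header* h = static_cast<Header*>(p) - 1;
    --g_live_blocks;
    g_live_bytes -= static_cast<long>(h->size);
    std::free(h);
}

// Counterpart of taking a checkpoint: record the current totals.
HeapSnapshot checkpoint() {
    HeapSnapshot s = {g_live_blocks, g_live_bytes};
    return s;
}

// Stores (after - before) in *out and returns true if the two
// snapshots differ, i.e. something allocated in between is still live.
bool difference(HeapSnapshot* out, const HeapSnapshot& before,
                const HeapSnapshot& after) {
    assert(out != nullptr);
    out->blocks = after.blocks - before.blocks;
    out->bytes = after.bytes - before.bytes;
    return out->blocks != 0 || out->bytes != 0;
}
```

Wrapping a suspicious call between two checkpoint() calls and testing difference() gives the same kind of before/after check in a unit test.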

_CrtSetReportMode() can be used to control where this diagnostic information is output.

_crtBreakAlloc / {,,msvcrtd.dll}_crtBreakAlloc / _CrtSetBreakAlloc() can be used to break into the debugger at a specific allocation number.

For more info on the CRT mem debug routines, please see references [1] and [2].

2. Monitor Process Working Set

The Win32 API GetProcessMemoryInfo() can query the process working set size. You can use this API to check whether the working set size has changed after calling some suspicious functions.

It can't tell you where the mem leak happens, but it is a good way to write unit tests that track mem leak problems.

3. Use Professional Tools

BoundsChecker
IBM Purify
LeakTracer(Linux)
Windows Leaks Detector
Valgrind(Linux)
MemWatch
Insure++
Visual Leak Detector
User Mode Dump Heap

Part III - Other Tips

1. use "#define SAFE_DELETE(ptr) do { delete (ptr); (ptr) = NULL; } while (0)" to avoid double-deleting the same object (deleting a null pointer is a no-op, so no NULL check is needed).
2. remember to delete the objects pointed to by the elements of a container that holds pointers.
3. pair new/delete, new[]/delete[] and malloc/free correctly.
4. use "new (std::nothrow)" to get a null pointer instead of a std::bad_alloc exception in low mem situations, and check the result.
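Tips 2 and 4 can be sketched together as follows. Widget, try_make_widget() and destroy_all() are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <new>
#include <vector>

struct Widget {
    static int& alive() { static int n = 0; return n; }
    Widget() { ++alive(); }
    ~Widget() { --alive(); }
};

// Tip 4: new (std::nothrow) returns a null pointer on allocation
// failure instead of throwing std::bad_alloc, so the caller has to
// check the result.
Widget* try_make_widget() {
    return new (std::nothrow) Widget;
}

// Tip 2: a container of raw pointers does not delete what its
// elements point to. Delete each object first, then clear.
void destroy_all(std::vector<Widget*>& items) {
    for (Widget* w : items) {
        delete w;  // deleting a null pointer is a no-op
    }
    items.clear();
}
```

A typical call site checks the pointer before storing it, for example `if (Widget* w = try_make_widget()) items.push_back(w);`.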

NOTE: (Lessons learned from investigating this topic)

When solving hard problems, Be Sure To:
1. Use well-known idioms and well-understood mechanisms
2. Keep things as simple as possible

Memory Management:
1. It's another subsystem/component of your whole system
2. Design this component with care
3. Avoid using new/delete directly

The techniques discussed here apply not only to memory blocks, but also to any type of "resource" that needs explicit requesting/releasing.

[Reference]

Mem Leak
1. Mem Leak Detection https://www.ddj.com/cpp/204800654
2. Microsoft CRT Debug Routines https://msdn.microsoft.com/en-us/library/1666sb98(VS.71).aspx
3. Microsoft CRT Debug Tech https://msdn.microsoft.com/en-us/library/zh712wwf.aspx
4. Mem Debugger List https://en.wikipedia.org/wiki/Memory_debugger
5. Purify from IBM - Use Purify for C code
6. Mem Leak in Java/.Net https://www.agiledeveloper.com/articles/MemoryLeak092002.pdf

Garbage Collection
7. C/C++ GC from HP https://www.hpl.hp.com/personal/Hans_Boehm/gc/
8. GC for C/C++ https://blog.codingnow.com/2008/06/gc_for_c.html
9. How .Net GC works, GC in OO Language, Auto GC

Understanding Mem Management
10. C++ Mem Management https://www.slideshare.net/reachanil/c-memory-management
11. Inside Mem Management https://www.ibm.com/developerworks/linux/library/l-memory/
12. C++ Memory Management: From Fear to Triumph (Part 1, Part 2, Part 3)
13. https://www.cantrip.org/wave12.html
14. Mem Optimization https://www.codingnow.com/2008/memory_management.ppt
15. Mem Mgmt 4 Sys Coder https://www.enderunix.org/simsek/articles/memory.pdf

Misc
16. C++ smart pointers https://www.onlamp.com/lpt/a/6559
17. Mem Leak Definition
18. Is Mem Leak Ever OK?