Allocating and freeing memory across module boundaries


I'm sure it's been drilled into your head by now that you have to free memory with the same allocator that allocated it. LocalAlloc matches LocalFree, GlobalAlloc matches GlobalFree, new[] matches delete[]. But this rule goes deeper.

If you have a function that allocates and returns some data, the caller must know how to free that memory. You have a variety of ways of accomplishing this. One is to state explicitly how the memory should be freed. For example, the FormatMessage documentation explicitly states that you should use the LocalFree function to free the buffer that is allocated if you pass the FORMAT_MESSAGE_ALLOCATE_BUFFER flag. All BSTRs must be freed with SysFreeString. And all memory returned across COM interface boundaries must be allocated and freed with the COM task allocator.

Note, however, that if you decide that a block of memory should be freed with the C runtime, such as with free, or with the C++ runtime via delete or delete[], you have a new problem: Which runtime?

If you choose to link with the static runtime library, then your module has its own private copy of the C/C++ runtime. When your module calls new or malloc, the memory can only be freed by your module calling delete or free. If another module calls delete or free, that will use the C/C++ runtime of that other module which is not the same as yours. Indeed, even if you choose to link with the DLL version of the C/C++ runtime library, you still have to agree which version of the C/C++ runtime to use. If your DLL uses MSVCRT20.DLL to allocate memory, then anybody who wants to free that memory must also use MSVCRT20.DLL.

If you're paying close attention, you might spot a looming problem. Requiring all your clients to use a particular version of the C/C++ runtime might seem reasonable if you control all of the clients and are willing to recompile all of them each time the compiler changes. But in real life, people often don't want to take that risk. "If it ain't broke, don't fix it." Switching to a new compiler risks exposing a subtle bug, say, forgetting to declare a variable as volatile or inadvertently relying on temporaries having a particular lifetime.

In practice, you may wish to convert only part of your program to a new compiler while leaving old modules alone. (For example, you may want to take advantage of new language features such as templates, which are available only in the new compiler.) But if you do that, then you lose the ability to free memory that was allocated by the old DLL, since that DLL expects you to use MSVCRT20.DLL, whereas the new compiler uses MSVCR71.DLL.

The solution to this requires planning ahead. One option is to use a fixed external allocator such as LocalAlloc or CoTaskMemAlloc. These are allocators that are universally available and don't depend on which version of the compiler you're using.

Another option is to wrap your preferred allocator inside exported functions that manage the allocation. This is the mechanism used by the NetApi family of functions. For example, the NetGroupEnum function allocates memory and returns it through the bufptr parameter. When the caller is finished with the memory, it frees it with the NetApiBufferFree function. In this manner, the memory allocation method is isolated from the caller. Internally, the NetApi functions might be using LocalAlloc or HeapAlloc or possibly even new and delete. It doesn't matter, as long as NetApiBufferFree frees the memory with the same allocator that NetGroupEnum used to allocate the memory in the first place.
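The wrapper technique can be sketched in a few lines of portable C++. The names below (`MyLib_DuplicateString`, `MyLib_FreeBuffer`) are hypothetical stand-ins for an exported pair like NetGroupEnum/NetApiBufferFree; the only point is that the allocation and the matching free live in the same module:

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical exported allocation function: hands a buffer to the caller.
// Internally it happens to use this module's malloc, but callers never know.
extern "C" char* MyLib_DuplicateString(const char* s) {
    char* copy = static_cast<char*>(std::malloc(std::strlen(s) + 1));
    if (copy) std::strcpy(copy, s);
    return copy;
}

// Hypothetical exported free function: because it lives in the same module,
// it is guaranteed to call the same runtime's free that did the allocation.
extern "C" void MyLib_FreeBuffer(void* p) {
    std::free(p);
}
```

A caller always releases the buffer with `MyLib_FreeBuffer(p)` rather than calling `free(p)` directly, so the module remains free to switch to `LocalAlloc`, a private heap, or anything else without breaking its clients.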

Although I personally prefer using a fixed external allocator, many people find it more convenient to use the wrapper technique. That way, they can use their favorite allocator throughout their module. Either way works. The point is that when memory leaves your DLL, the code you gave the memory to must know how to free it, even if it's using a different compiler from the one that was used to build your DLL.

Comments (55)
  1. Anonymous says:

    This is the main problem when building plugins for 3ds Max – you must use the same version of Visual Studio or rely on almost-working wrappers to alloc/free memory.

  2. Anonymous says:

    I suppose I should point out how ac’s example of using FILE from the C standard library has the same problem as using malloc/free. If I write a program with MSVC and link to a module compiled with Borland C, there’s no guarantee that the linked module’s idea of a FILE will be the same as mine.

    The solution to this is to export fprintf() and fclose() functions, or simply pass around OS file handles.

  3. Btw, the strongest reason for using the wrapper technique is that the wrapper technique allows for finer grained control over the memory allocations.  For instance you might decide that due to heap fragmentation issues, you need to drop in a fixed size allocator at some point in the future.  

    Or you might decide you need to use a private heap (like the low fragmentation heap).  If you use a wrapper, you can change these behaviors without breaking clients.

  4. Anonymous says:

    To Gabe:

    Well I thought we were talking about how to develop a set of our own functions, which are to be used in our applications or applications of somebody else, and make sure that they are "consistent" to the rest of the world. I gave "fopen, fclose" just as the example of the method.

    So if you make such functions, in "fopen, fclose" style, you have to export them if others are supposed to use them. Of course, if you let everybody in your project compile his own instance of the body of your functions, you’ll have more things to worry about.

    But as you said, once they are exported, then our fclose equivalent is always the one that matches our fopen.

  5. Ryan Bemrose says:

    I think this whole series is leading up to a rule that I put in place for my own projects long ago.

    *Memory should be deallocated as closely in scope as possible to the place that it was allocated.*  This means the same function where you can get away with it.  If not, then same class, or at the very broadest, same module.

    This neatly avoids all of the issues that Raymond has brought up about selection of compiler, API, and instance, and also guarantees that when the allocator suffers a maintenance change, the deallocator can easily be changed simultaneously.
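    In C++ terms, this rule is commonly enforced with RAII, which pins the deallocation to the very scope that performed the allocation. A minimal sketch (the function is hypothetical):

```cpp
#include <memory>

// Hypothetical function illustrating "free as close to the alloc as possible":
// the buffer is allocated and released within one function, so no other
// class, module, or compiler version ever has to know how to free it.
double sum_squares(int n) {
    std::unique_ptr<double[]> buf(new double[n]);  // allocated here...
    double total = 0;
    for (int i = 0; i < n; ++i) {
        buf[i] = static_cast<double>(i) * i;
        total += buf[i];
    }
    return total;
}  // ...and automatically freed here, by the same module's delete[]
```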

  6. benkaras says:

    Another exception to the rule about using CoTaskMemFree across COM boundaries is that pointers embedded inside [in,out] parameters must be allocated using MIDL_user_allocate/MIDL_user_free in the server.  See http://msdn.microsoft.com/library/default.asp?url=/library/en-us/rpc/rpc/embedded_out_only_reference_pointers.asp for details.

  7. Anonymous says:

    This is one of the reasons why the Boost shared_ptr always carries its own deallocator around. No matter which module allocated the shared_ptr, no matter which module destructs the last shared_ptr, you’re guaranteed that the correct deallocation function is called, even across module boundaries.

    Of course this comes with an overhead, but I believe it is worth it.
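    A minimal sketch of that mechanism (hypothetical names; the counter exists only to make the behavior visible): the deleter supplied at construction is stored inside the shared_ptr, so whichever module releases the last reference calls back into the allocating module's free.

```cpp
#include <cstdlib>
#include <memory>

static int g_frees = 0;  // instrumentation for this sketch only

std::shared_ptr<int> make_value(int v) {
    int* raw = static_cast<int*>(std::malloc(sizeof(int)));
    if (!raw) return nullptr;
    *raw = v;
    // The lambda is stored in the shared_ptr's control block; it binds the
    // eventual deallocation to *this* module's std::free, no matter where
    // the last reference is eventually dropped.
    return std::shared_ptr<int>(raw, [](int* p) { std::free(p); ++g_frees; });
}
```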

  8. Anonymous says:

    Mm…

    Or you can use Java.

  9. BryanK says:

    > Switching to a new compiler risks exposing a subtle bug, say, forgetting to declare a variable as volatile or inadvertently relying on temporaries having a particular lifetime.

    That is true.  However, there is another pitfall: switching to another compiler also (at times) introduces bugs that didn’t exist in the source, due to bugs in the compiler.  Some versions of Red Hat-patched gcc come to mind (broken inline assembly in obscure cases), as do some recent versions of VC++ (broken loop optimization for abnormal but legal loops is what I *think* it was).

    This whole concept is another of those things that make it hard to port some Unix/Linux libraries to Windows.  For instance, libpcap has a couple functions that allocate and return memory.  The libpcap docs basically say "free() this memory".  When libpcap was ported to Windows (WinPcap), they used the C library malloc(), so that callers could continue to use free().  But since there’s only one C library on Linux, and there are many on Windows, they ran into problems when people tried to free the memory (these people were linking to different C libraries than the WinPcap DLL linked to).

    I don’t know whether they’ve fixed that or not, or how they fixed it if they have.  It would be nice to have Only One C Library on Windows, but I don’t know how possible that even is.  (Given how much software is distributed as binaries, probably not very…)

  10. Anonymous says:

    For me, the C standard library f* functions are a classic example of how complex things can be kept simple. Sadly, a lot of programmers didn’t learn from this, which is why so many supposedly "cooler" things got popularized instead.

    So, you have to work with a pointer to something (here, file, since in the original Unix everything was supposed to be a file):

    FILE* f;

    You get a pointer to the file by opening it:

    f = fopen( "somefilename", "a+" );

    You don’t care who allocated it, or if the library even took a preallocated object. You just have it, and can use it.

    First, error handling: was the "construction" successful? Instead of horrible but "oh so modern" trying and catching, you just check for success:

    if ( !f ) {

      return;

    }

    Then you can use it…

    fprintf( f, "%.2f\n", 2 * r * pi );

    And finally "release it":

    fclose( f );

    What is used to delete f? Maybe the same memory behind f can be reused for the next opened file. You don’t care, fclose knows what to do.

    "Those who do not learn from the past are doomed to use much worse than the past solutions". In 2006 we have to learn people to avoid the hype and look to something made in 1973, to prevent them shooting themself (and many affected by their programming) too much in the feet.

  11. Mike Dimmick says:

    BryanK: One CRT to rule them all? Microsoft tried it. From Visual C++ 4.2 through to 6.0, the same DLL name was used: MSVCRT.DLL. The result? DLL hell. Some applications couldn’t cope with changes in the new DLL, and some older installers erroneously installed an old version over a new version, breaking new applications.

    Windows 2000 put MSVCRT.DLL under Windows File Protection (although I think this is mostly because WordPad and Microsoft Management Console were written with MFC and hence used MSVCRT.DLL) and this is still true in Windows XP and Server 2003. Windows Vista still ships MSVCRT.DLL (with a shiny new version number) and now it cannot be written to by any process except for an OS update.

    Before someone brings up the versioning inherent in *nix dynamic library linking, let me point out that simply changing the name of the DLL has pretty much the same effect in Windows – which is of course what Microsoft have done. This still doesn’t solve DLL Hell problems between different minor versions of the same major-version DLL, on either OS. That’s what Win32 Side-By-Side (SXS) assemblies are for. All subsequent versions of the CRT (7.0, 7.1, 8.0) so far have included a manifest and been installed to the side-by-side folders for explicit binding on OSs which support it (XP, 2003 and later).

    I don’t think it’s possible for two side-by-side versions of the same DLL to be loaded into the same process – I believe the choice is governed by the EXE’s manifest. If it is possible that would even cause problems with assuming that you can use ‘delete’ from a module with the same name as the one you called ‘new’ from! Windows will allow you to load DLLs with the same name from different paths, of course.

  12. Anonymous says:

    “…all memory returned across COM interface boundaries must be allocated and freed with the COM task allocator.”

    Does this mean that SysAllocString() uses that allocator? I can’t find any specifics in MSDN.

    Obviously, we should put this out of our minds when actually *using* BSTRs.

    [Liberally insert the phrase “as a general rule, but there may be exceptions” into all of my posts. All memory returned across COM interfaces must be allocated and freed with the COM task allocator unless there are special rules that override this ground rule. -Raymond]
  13. Anonymous says:

    Why is it that these external allocators are more stable than malloc? Are the maintainers of those routines more meticulous than the vcrt maintainers? Also, what exactly does “external” mean here? Perhaps the answer to that question answers the rest.

  14. Anonymous says:

    Brian, think of it this way.

    You have an EXE statically linked to CRTL and a DLL statically linked to CRTL.  This means that the EXE and DLL not only don’t share the same malloc and free code, but they don’t share the same data structures used to support these routines.

    Let us assume that malloc and free are just wrappers around HeapAlloc where CRTL uses a private heap.  In that case, if you allocated memory in the DLL, one heap would be used.  If you tried to free that memory in the EXE, then it would try to free that memory to a DIFFERENT heap.  Thus bad things start to happen.
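    The situation can be illustrated with a toy model: two fixed-buffer arenas standing in for the EXE's and the DLL's private CRT heaps. (Hypothetical code; here the mismatch is merely detectable, whereas a real CRT heap would silently corrupt its bookkeeping.)

```cpp
#include <cstddef>

// Toy stand-in for a module-private heap: carves blocks out of its own pool.
struct ToyHeap {
    unsigned char pool[1024];
    std::size_t used = 0;

    void* alloc(std::size_t n) {
        if (used + n > sizeof(pool)) return nullptr;  // arena exhausted
        void* p = pool + used;
        used += n;
        return p;
    }

    // Would this heap even recognize the block? A block carved from another
    // ToyHeap's pool is a foreign pointer as far as this one is concerned.
    bool owns(const void* p) const {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        return b >= pool && b < pool + sizeof(pool);
    }
};
```

With `ToyHeap exeHeap, dllHeap;`, a block obtained from `dllHeap.alloc()` satisfies `dllHeap.owns(p)` but not `exeHeap.owns(p)`: handing it to the wrong heap's free is exactly the cross-module bug being described.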

  15. Anonymous says:

    External basically means an allocator that isn’t part of the programmer’s EXE or DLL.  For example, CoTaskMemAlloc is an external allocator that just lives in another DLL that will be shared between the EXE and all the DLLs.  So you avoid the whole multiple malloc/free issue.

  16. BryanK says:

    And then what happens when your Java code has to use JNI to call a native function, because Sun didn’t give you whatever you need to get at in the JRE?  How do you free memory that that function may have allocated?

    (In other words:  This is a problem in *any* language that allows the programmer to call into native OS DLLs.  Even languages with full GC.)

  17. Anonymous says:

    BryanK: (re: JNI allocation) "How do you free memory that that function may have allocated?"

    It’s actually less complicated in a Java/JNI case.  You are *forced* to also export a deallocation function, since it’s simply not possible to otherwise free the memory.  That is, there’s never any danger of the user calling the *wrong* deallocation function; the only possible danger is that the user won’t call any at all (which is easily avoided via dispose/finalize).

  18. Anonymous says:

    Uhh… A pretty dumb question: if all of those allocators run inside the same process, how do they manage not to hurt each other, so to speak? Who keeps track of what parts of the VM space are in use? How do they manage not to collide, and leave enough space for the others (if they need contiguous space)? How do they request/release memory from the OS?

  19. Anonymous says:

    At the end, all allocators get their memory from the OS via VirtualAlloc, and then divide up these larger pages to satisfy requests from the caller. No magic involved.

  20. Anonymous says:

    To go off on a slight tangent: I’ve always thought the *six* versions of libc offered by VC++ is a bit gratuitous (libc, libcd, libcmt, libcmtd, msvcrt, msvcrtd).  This seems like bad design.  Design is all about making decisions and accepting the relevant tradeoffs.  I think ‘msvcrt’ (multithreaded DLL) should have been the only option.  But instead, VC++ punts:  they refuse to make a decision, and thereby force the issue on the user as “options”.

    Four of the options — libc(mt)(d) — don’t even make sense.  Is an application ever really “single-threaded” on Windows?  You press Control-C, and Windows creates a thread.  Does “static linking” have any meaning on Windows?  On Unix, it means creating an executable which is *completely* self contained.  But on Windows, you will still depend upon kernel32.dll at a minimum… in other words, by using a static version of libc, you have avoided your “DLL hell”/versioning problem only for that *one* library: libc.

    I’ve seen this cause problems for users again and again.  Forget about binary compatibility and versioning issues… because of the 6 versions of libc, it is possible to get conflicts when *building everything from source*.  Poll: how many times have you downloaded & built a third party (static) library, and discovered at link time that it uses /ML(d)/MT(d)/MD(d) flags that are incompatible with the rest of your sub-projects?  I’ve seen people who don’t know any better use /nodefaultlib and /force to jam an EXE together in this scenario… in one case, someone managed to get a static copy of free() and a DLL-imported malloc() in the same EXE, resulting in a self-contained malloc/free conflict.  Of course, you routinely get conflicts when going across DLL boundaries, with no linker kludgery whatsoever.

    When programming on *nix, “Which flavor of libc should I use?” isn’t a decision I have to make, and I feel much better off.

    [Thus is the paradox of design. Give people no choice and they demand one. Give people a choice and they complain that you should have decided for them. -Raymond]
  21. Anonymous says:

    The problem with free and malloc having multiple versions isn’t an issue with just Windows, it is an issue with any operating system where free and malloc isn’t an external allocator, is versioned or isn’t the standard allocator.  

    DOS, CP/M, VMS, Windows, etc all have issues with memory allocation.  Even with VMS where the CRTL was a DLL, you have to be careful when allocating memory in a C module and then expecting a Fortran module to be able to free it.  Oh, and VAX Fortran doesn’t really support memory allocation or pointers directly. You have to trick it into handling pointers.

    Even the grand old Unix/Linux would have the same problem if you were using a non-C language that used memory allocation that either didn’t use malloc/free or augmented them in some way.  (The same way that most Windows mallocs/frees take gross allocations from the operating system and then subdivide them.)

    So basically, to say that Unix/Linux doesn’t have this problem shows a simplistic understanding of the issue.

  22. Anonymous says:

    The important thing Raymond left out of the post is why cross-module allocation/deallocation done wrongly leads to memory leaks, subtle bugs, or even crashes: as Tim mentioned earlier in one of his posts, there is a per-module heap, and malloc/free operate on the heap of the module from within which the calls were made.

  23. Anonymous says:

    Leif, I’m VERY glad you weren’t the one making the decisions then.

    You have basically the same options on *nix: linking against libc.a (static single-threaded), libc.so (dynamic single-threaded), libc_r.so (dynamic thread-safe).

    The other 3 versions are the debug versions that add extra error checking and such for helping the developer find and debug errors in their programs.

    As for Windows not having true static applications, do you really think that you can take a linux binary and run it on a PC without linux installed on it?  The application has to be run on an OS.  The kernel is part of that OS.

  24. Anonymous says:

    Maybe the same memory behind f can be reused for the next opened file. You don’t care, fclose knows what to do.

    And then because fopen() and fclose() aren’t mystical sentient spirits in heaven but instead are actually functions implemented in code that has to be stored somewhere, if you use fopen() in one DLL and return the result across a DLL boundary, you don’t know which fclose() to call, because if you call the wrong one, then it will not work correctly.

    > In 2006 we have to learn people to avoid the hype and look to something made in 1973, to prevent them shooting themself (and many affected by their programming) too much in the feet.

    However, I’m more worried about the rocket launcher you’re planning to use to blow off every leg in the building in a misguided and doomed attempt to save a single foot.

  25. Anonymous says:

    "Even the grand old Unix/Linux would have the same problem if you were using a NON C language that used memory allocation that either didn’t use malloc/free or augmented then in some way."

    Yep. You also see the problem in C language libraries that for whatever reason need to implement their own malloc(). It’s much less common and much less useful now that most system malloc implementations are much better, but it was once common and frequently necessary. It can also happen if an executable uses different versions of libstdc++ (same issue as the CRT problem in win32) with the `new` and `delete` operators.

    In general, though, it’s much less problematic than on win32. This means that most *nix apps and especially apps with plugin interfaces tend to ignore the issue completely, resulting in incredible frustration when moving to win32.

  26. Anonymous says:

    The problems with memory (de)allocation and DLL hell are much more prominent on Windows than on other OSes, can’t deny that. They do exist on other OSes, but usually aren’t much of a concern there. You can’t expect sloppy BASIC programmers on Windows to be attentive to low-level stuff.

  27. Anonymous says:

    Whoa… so you’re telling me that I can call FindMimeFromData from Java, using JNI (or whatever it’s called), and somebody will have magically created a deallocation function for me to use?

    Of course not.  The author of the JNI code is responsible for matching allocation and deallocation functions.  But the problem that is the topic of Raymond’s post ("Allocating and freeing memory across module boundaries") is not an issue, as the allocation and deallocation do not occur in different modules.  Coding in a language with GC doesn’t magically make the whole OS process garbage collected, and I don’t think that was anybody’s claim.  But it is true that some problems (like the one that is the subject of this post) would not occur.

  29. Anonymous says:

    C often is just glorified assembler, which is how the problem can exist. If Microsoft had decided to use C++-style name-mangling on their DLLs, their FILE* could have been a _MS_CRT_20_FILE*. Obviously you can’t pass that to Borland’s fclose(FILE*). The competition doesn’t need to cooperate for this to work. And it’s backwards-incompatible, too! (as intended)

    Even better, if VCx+1 was really compatible for these functions, you could keep the same names for these types and functions. Versioning can then be decided at function level. You can even have two fclose(FILE*) implementations in a single DLL, if needed – one for each FILE* flavour.

    IIRC, VC8 fixed the multiple-heap problem.

  30. Anonymous says:

    >> But on Windows, you will still depend upon kernel32.dll at a minimum… in other words, by using a static version of libc, you have avoided your "DLL hell"/versioning problem only for that *one* library: libc.

    Nonsense. You can easily code for the lowest common kernel32.dll you want to support and it’s easy to keep compatibility with Windows95 that way. And really there is no versioning to be worried about.

    But if the system you’re running your program on doesn’t have msvcr71.dll you’re screwed.

    If you want to build an application of a single exe file in C++ you have to use static linking.

    OH! and you don’t have to use the multithreaded libc version as long as your threads do not use any libc function. It’s lighter and faster than the mt one (nowadays the difference is probably insignificant, but it can have its uses).

  31. BryanK says:

    > BryanK: (re: JNI allocation) "How do you free memory that that function may have allocated?"

    > It’s actually less complicated in a Java/JNI case.  You are *forced* to also export a deallocation function, since it’s simply not possible to otherwise free the memory.

    Whoa… so you’re telling me that I can call FindMimeFromData from Java, using JNI (or whatever it’s called), and somebody will have magically created a deallocation function for me to use?  Even though FindMimeFromData uses its own internal allocator, and there *IS* no deallocation function, even if you’re calling it from C?  (See http://blogs.msdn.com/oldnewthing/archive/2006/09/07/744430.aspx for some of the details of that one.)

    > the only possible danger is that the user won’t call any at all (which is easily avoided via dispose/finalize).

    I’m not talking about writing a full Java object to wrap an API.  I’m talking about just the act of calling that API, from *any* code.  If there is no deallocator, you can’t free the memory, GC or no.  Or if you use the wrong deallocator in your wrapper, you’re not going to get the right result, GC or no.  Java is not a magic bullet.  (Neither is any other language, of course.)

    > [Thus is the paradox of design. Give people no choice and they demand one. Give people a choice and they complain that you should have decided for them. -Raymond]

    Now, I haven’t talked to a majority of Linux programmers.  But I’ve never heard *anyone* complaining that there’s no choice in Linux C libraries.  Everyone’s happy enough with glibc that they at least don’t try to distribute their own copy of it, to "preserve compatibility" or some such hogwash.

    Of course, it helps that glibc has versioned symbols, so if you asked for an old version of a function when your code was compiled, you’ll get it when your code runs.  So you have full backward compatibility by default.  And it helps that the kernel never breaks backward compatibility at the syscall level, so even if you did distribute your own glibc for some crazy reason (or you skip glibc and make system calls directly, which is more common), it’d work.

  32. BryanK says:

    And if the programmer that mismatched allocate and free is the same programmer that wrote JNI code instead, the situation is no better.  They’ll still use the wrong deallocator for the allocator they used, because they’ll still go through the same thought processes when they decide which deallocator they need.

    In other words, if your JNI code calls FindMimeFromData, then calls the C library free() on the resulting buffer, Java didn’t actually help.  Or if you call whichever winpcap function allocates the list, then try to free() that list with a different C library, Java still didn’t help.  You’re still using the wrong deallocator in both cases.

  33. Anonymous says:

    > you have to be careful when allocating memory in a C module and then expecting a Fortran module to be able to free it

    Don’t even get me started on Fortran ;-) I don’t think the mixed-language argument detracts from my point, though:  in VC++, if you build a bunch of DLLs — all written in the same language and built with the same version of the same compiler — with the default options, you get broken, non-intuitive behavior:  crash at runtime when passing heap blocks, STL strings, C++ exceptions, etc. across DLL boundaries.

    > You have basically the same options on *nix: linking against libc.a (static single-threaded), libc.so (dynamic single-threaded), libc_r.so (dynamic thread-safe).

    This is true, I admit.  (In fact, on Linux at least, there is no threaded/non-threaded choice: threads are handled in a sneaky way.)

    But in any case, the effect of having a choice on Unix is not as disastrous as it is on Windows.  Since Unix uses a global symbol namespace, only one malloc() implementation will prevail:  if I choose to link my main executable against libc.a, everyone in my process space will be forced to use libc.a’s malloc(), as there can be only one.  On Unix, it is not easy to get the kind of malloc/free conflict which Raymond describes.  On Windows, it is easy to get it by accident.

    Not that I think a single, shared global symbol namespace is a good idea, mind you.  It can be a disaster (esp. with respect to libstdc++ — just ask the Autopackage folks).  Windows DLLs are much closer to the "right way" to do dynamic linking.  But given the Windows DLL model, a single libc DLL should *at least* be the default, and in my opinion, the only option.

    > The other 3 versions are the debug versions

    If I were making the decisions, the debug version would be a runtime option.  It is trivial to swap-in a different DLL with a matching interface at runtime (although the NT loader’s "KnownDLLs" mechanism complicates this for a subset of system DLLs).  Or, you could do it like kernel32.dll does it:  the debugging features are dynamically activated when running under a debugger.

    > do you really think that you can take a linux binary and run it on a PC without linux installed on it?

    Yeah, that’s what I think.  You mean you can’t? :-)

    > Nonsense. You can easily code for the lowest common kernel32.dll you want to support and it’s easy to keep compatibility with Windows95 that way. And really there is no versioning to be worried about.

    I said you’d depend upon kernel32.dll *at a minimum*.  Any non-trivial Windows app depends upon lots of DLLs, not just kernel32.  On Unix, you have side-by-side static and non-static versions of most libraries.  Static linking *makes sense* on Unix.  As I said, on Unix, static linking enables you to create a self-contained executable — i.e., one that depends only on the kernel and has no dynamic dependencies.  On Windows, what sense does static linking make?  All of the Windows system libraries are DLL-only — and rightly so, because of the way the dynamic linker works (*).  By statically linking libc — just  *one* library — what problem is solved?  Great — I don’t have to worry about "DLL hell" for msvcrt… but I still have to worry about it for comctl32 and everything else. :-/  You’ve only solved one instance of a much more general problem.

    –Leif

    (*) Imagine the disaster if Microsoft decided to give developers the "option" of linking against a static kernel32.lib.  (Funny how no one ever complains about not having a static version of kernel32…)  "Free memory allocated using LocalAlloc() by calling LocalFree()… oh, and by the way, if you statically linked against kernel32, you must call LocalFree() from the same EXE/DLL from which you allocated the memory."  In order for static linking to make sense in general, your runtime linker must have a global namespace (like Unix).  On Windows, the sane approach is to force users to use DLLs — *especially* if the library in question controls a "global" resource — in this case, the heap.  I think the Win32 team realized this, but the VC++ team did not… or at least, they didn’t give it nearly enough weight.

  34. Anonymous says:

    >  Great — I don’t have to worry about "DLL hell" for msvcrt… but I still have to worry about it for comctl32 and everything else. :-/  You’ve only solved one instance of a much more general problem.

    Wrong. You don’t link statically to avoid DLL hell. You link statically to avoid yet another redistribution burden.

Let’s say you offer a demo of something on the internet. I can guess that more people will download your demo if:

1) it runs without installing

2) it doesn’t require another redistributable to be installed

3) it’s a single exe file.

You said comctl32. Why did you decide that the program has a GUI? It might be a small 3D game. Or a screen saver. Or a console app. Or a small puzzle game, "winmine" style. Or a GUI using the subset of COMCTL32 available in Win95. Or an installer. A self-extractor.

You find a feature useless and so you scream that you want it removed. Quite egocentric.

  35. Anonymous says:

    Raymond wrote:

>> many people find it more convenient to use the wrapper technique.

Ac’s fclose( f ) example illustrates a reason why: the function that deallocates memory can also do other stuff, in this case close the OS file handle. It’s a higher-level, more object-oriented approach.
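[A minimal sketch of this wrapper idea, using hypothetical names (Buffer, buffer_open, buffer_close): the module that allocates the object also exports the only routine that destroys it, so the caller never needs to know which allocator or CRT was used, and the destroy wrapper can release other resources, such as a file handle, at the same time.]

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// An object owning both heap memory and an OS-level resource.
struct Buffer {
    FILE* fp;          // scratch file owned by the object
    char* data;        // heap memory owned by the object
    std::size_t size;
};

// Allocates with this module's own allocator.
Buffer* buffer_open(std::size_t size) {
    Buffer* b = static_cast<Buffer*>(std::malloc(sizeof(Buffer)));
    if (!b) return nullptr;
    b->fp = std::tmpfile();   // automatically deleted when closed
    b->data = static_cast<char*>(std::malloc(size));
    b->size = size;
    return b;
}

// The matching wrapper: one call frees the memory *and* closes the
// handle, using the same module (and same CRT) that allocated them.
void buffer_close(Buffer* b) {
    if (!b) return;
    if (b->fp) std::fclose(b->fp);
    std::free(b->data);
    std::free(b);
}
```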

  36. Anonymous says:

    Am I the only one here who thinks that DLLs have been used in the wrong way?

Everyone has been talking about reusable code for quite some time, but what exactly is reusable? Applications are getting bigger and bigger, with more and more DLLs.

Instead of using one system DLL, we now have dozens of applications each using their own version of the same thing.

That leads to users having to depend on each application vendor to update their version of a DLL to get bugfixes, improved functionality or speed boosts.

On a side note, has anyone noticed how each and every application has its own copy of strlen(), strcpy(), malloc(), free()?

I mean, why is it so hard to use VirtualAlloc()?!? It pisses me off every time I see malloc(), because I know that the memory is not properly aligned for any SIMD operations.

    What I would do in next C runtime header update would be this:

    #define malloc(size) VirtualAlloc(NULL, (size), MEM_COMMIT, PAGE_READWRITE)

    #define free(mem) VirtualFree(mem, 0, MEM_RELEASE)

    That is what I am using all the time anyway.

  37. Anonymous says:

#define malloc(size) VirtualAlloc(NULL, (size), MEM_COMMIT, PAGE_READWRITE)

    Oh boy.  That’s the last time I click the “Comments” link.

    [Leif sadly walks away as the virtual address space is exhausted in 64K increments, and the available I.Q. shown in the Task Manager slowly drops…]

    [Good thing you stopped before you read the definition of “free”. -Raymond]
  38. Anonymous says:

    > possibly even new and free.

I hope no one takes the original article seriously in this example :)



As for libcs on Unix, there’s a uClibc floating around, and there *are* problems with glibc, specifically the transition from libc5 to libc6 (glibc), which actually resulted in lots of classic binary apps just not running anywhere.

In a certain way, the engineers would have been *better* off statically linking with libc5, because it would mean the apps would run on modern Linuxes (they don’t, and I believe there are some Linuxes which don’t even bother providing a way for the user to easily install a libc5 — yes, you can rebuild libc5 yourself, but an average end user cannot). And yes, I’m talking about a closed-source product which is no longer being rebuilt, but which is still better than certain other alternatives (certainly it’s a lot more stable).

  39. Anonymous says:

    Although I personally prefer using a fixed external allocator, many people find it more convenient to use the wrapper technique. That way, they can use their favorite allocator throughout their module. Either way works.

In the C++ world, even a wrapper can still mess things up. If the wrapper function that is supposed to free memory is inlined, it will call the memory routines (e.g. free or delete) of the caller’s module, not the routines in the DLL. So, if you’re developing such a wrapper in C++, you must not make it inline, explicitly or implicitly.
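[A sketch of the non-inline-wrapper advice, with hypothetical names (Widget, widget_create, widget_destroy). The destroying function is only *declared* in the header and *defined* in the module’s own source file, so the free() call is compiled into the module that allocated the memory, never into the caller.]

```cpp
#include <cassert>
#include <cstdlib>

// --- widget.h (what clients see) ---------------------------------
struct Widget { int value; };
Widget* widget_create(int value);  // allocates inside the module
void widget_destroy(Widget* w);    // declaration only: never inline!

// --- widget.cpp (compiled into the DLL/module) -------------------
Widget* widget_create(int value) {
    Widget* w = static_cast<Widget*>(std::malloc(sizeof(Widget)));
    if (w) w->value = value;
    return w;
}

// Defined out of line: this free() call uses the allocating module's
// own CRT heap, regardless of which CRT the caller links against.
void widget_destroy(Widget* w) {
    std::free(w);
}
```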

  40. BryanK says:

    Well, even C has inline functions.  (Or at least, GCC does.  I suppose I shouldn’t say that C does, because I don’t know for sure.)  So that could still be a problem in C.

    But if you’re providing a library, you should be explicitly exporting your functions anyway (using a .def file); see some of Raymond’s previous posts on the subject of name mangling, etc.

    I would hope that the compiler/linker would be smart enough to *not* inline functions that are marked for export.  Or, basically, to use gcc’s "extern inline" mode ("inline this if possible, but if not, call the extern, non-inlined version that I have defined somewhere else").

  41. Anonymous says:

>Leif sadly walks away as the virtual address space is exhausted in 64K increments, and the available I.Q. shown in the Task Manager slowly drops…

    Hahahah… seriously, “in 64K increments”! :)

Even though the page size on most machines is 4K?

Or do you think I am calling VirtualAlloc() in a loop for some reason?

No, wait… you think that the C/C++ runtime allocator (malloc, new, etc.) uses some psychic memory access which somehow evades the paging and virtualization services provided by the OS?

Anyway, the stack trace of a Windows console application named some.exe, compiled with Visual C++, which uses malloc() would look something like this:

    some.exe!__heap_init()
    kernel32.dll!HeapCreate()
     ntdll.dll!RtlCreateHeap()
      ntdll.dll!NtAllocateVirtualMemory()
    some.exe!__mtinit()
    some.exe!__ioinit()
    kernel32.dll!GetCommandLine()
    some.exe!___crtGetEnvironmentStrings()
    some.exe!__setargv()
    some.exe!__setenvp()
    some.exe!__cinit()
    some.exe!_main()
    some.exe!malloc()
     kernel32.dll!HeapAlloc()
      ntdll.dll!RtlAllocateHeap()

And a Windows GUI application stack trace would probably look the same, save for _WinMain() instead of _main().

    So, the mother of all allocators in user space is kernel32!VirtualAlloc() which is just a wrapper around ntdll.dll!NtAllocateVirtualMemory().

    If you need high performance in your application then I advise you to use VirtualAlloc() (optionally followed by VirtualLock() if you bothered to adjust your process working set) instead of malloc() when possible.

malloc() has higher overhead, doesn’t return aligned memory needed for SIMD operations, and you have to worry about which free() to call.

    [It’s one thing to use VirtualAlloc to grab large chunks of memory; it’s another to use it as a malloc replacement. Using VirtualAlloc to allocate the memory for a strdup(“hello”) is probably overkill, allocating 4KB of memory (and reserving 64KB of address space) when you only needed six bytes. (And I assume you’ll fix the memory leak in your free macro.) -Raymond]
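[For the alignment complaint above, there is a lighter-weight fix than VirtualAlloc: C11/C++17 aligned allocation. On MSVC the equivalents are _aligned_malloc and _aligned_free — yet another allocator/deallocator pair that must match across module boundaries. A sketch with a hypothetical helper name, alloc_simd:]

```cpp
#include <cstddef>
#include <cstdlib>

// Allocate `bytes`, rounded up to a multiple of `alignment`, since
// aligned_alloc requires the size to be a multiple of the alignment.
// The returned memory is freed with plain free().
void* alloc_simd(std::size_t alignment, std::size_t bytes) {
    std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
}
```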
  42. Anonymous says:

>It’s one thing to use VirtualAlloc to grab large chunks of memory; it’s another to use it as a malloc replacement.

    I agree. But I am always using large chunks.

>Using VirtualAlloc to allocate the memory for a strdup(“hello”) is probably overkill

    I was convinced that strdup() allocates its own memory.

>(And I assume you’ll fix the memory leak in your free macro.)

    I will Raymond but only if you tell me where it is :)

    Seriously, from MSDN:

    dwSize

    [in] The size of the region of memory to be freed, in bytes.

    If the dwFreeType parameter is MEM_RELEASE, this parameter must be 0 (zero). The function frees the entire region that is reserved in the initial allocation call to VirtualAlloc.

    So what is wrong there? Has the API changed?

    [Sorry, I missed the MEM_RELEASE part. If you’re using it only for large allocations, that’s fine, but lots of people use malloc for small allocations, in which case your version is extremely wasteful. My remark about strdup was merely one example where people use malloc() to allocate small amounts of memory. -Raymond]
  43. Anonymous says:

>Sorry, I missed the MEM_RELEASE part.

    No problem, except that you got me worried that something in the API has changed.

I completely agree that using VirtualAlloc() for small allocations is wasteful.

    I only use it for large datasets when I do not know the size of the dataset in advance and when I have to make multiple passes over the data.

    However, when I need to load the dataset into the memory for one-shot processing I prefer not to allocate memory at all but to use CreateFileMapping()/MapViewOfFile() instead.

The OS is a hell of a lot better at managing memory resources than I am, especially if the dataset doesn’t fit in available RAM, and in many cases it is much simpler to access the data this way than by using memory-allocating and file reading/writing APIs.

Finally, if something is small enough, like that string you mentioned, you would be better off using some dedicated C++ or STL string class, which will offer superior performance and save you from reinventing the wheel.

I am surprised how many Windows programmers (or should I say "developers"?) don’t know how to use basic Win32 APIs and still stick to the C runtime for everything they need. No wonder software is so slow and bloated when so many wrappers and levels of indirection exist.

  44. Anonymous says:

And we forgot that using malloc() and free() to allocate structures in C++ is a bad thing, because if you later add, say, a CString to a structure and allocate an array of structures, malloc() won’t call the CString constructor (resulting in garbage or a crash), nor will free() call the CString destructor (resulting in a memory leak).
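[A minimal illustration of this last point, using std::string in place of CString so the sketch is self-contained; the Record/make_records names are hypothetical:]

```cpp
#include <cstddef>
#include <string>

// A struct that later gained a non-trivial member. After this change,
// raw malloc()/free() no longer run the std::string constructor and
// destructor, so the only correct pairing is new[] with delete[].
struct Record {
    int id;
    std::string name;  // has a constructor and a destructor
};

// Correct: new[] runs every constructor, delete[] runs every destructor.
Record* make_records(std::size_t n) {
    return new Record[n];
}

void destroy_records(Record* r) {
    delete[] r;
}

// Incorrect (don't do this): malloc(sizeof(Record) * n) would leave each
// `name` member uninitialized, and free() would leak whatever it held.
```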

  45. schabi says:

    wr/t http://blogs.msdn.com/oldnewthing/archive/2006/09/15/755966.aspx#758832 :

In fact, you have the choice between different libc implementations on some Unices (e.g. libc5, libc6, dietlibc, klibc for Linux), but they’re different major versions or libc vendors, and currently glibc is the one that’s widely used; the other ones are for special purposes. So, in reality, your view is correct.

  47. schabi says:

    wr/t http://blogs.msdn.com/oldnewthing/archive/2006/09/15/755966.aspx#760640 :

    >> Everyone’s happy enough with glibc that they at least don’t try to distribute their own copy of it, to "preserve compatibility" or some such hogwash.<<

    Not everyone’s happy with glibc. But most of those unhappy with glibc write their own replacement (like dietlibc), instead of distributing their own copy.

And you’ve got one big advantage (as a distributor, as well as a developer): you’ve got the source of virtually everything, so you can compile it against the same versions of libs with the same toolchain (minus fixing some bugs).

Whenever some binary-only software comes in, Hell breaks loose.

  48. Anonymous says:

    It is good to see Microsoft release a fix when they make the same mistake.

    http://support.microsoft.com/kb/867855/en-us

> A previous update implements a new memory heap in NDIS and then calls a new allocator function. However, this previous update does not use the matching de-allocator. Therefore, a free is requested for the wrong memory pool.

    I think end users can download a package that includes this hotfix, but end users can’t install it.  Someone has to persuade vendors to provide flashable firmware.

    [As I already noted, Microsoft software is hardly immune to issues I raise. What’s your point? -Raymond]
  49. Anonymous says:

>> What’s your point?

    (1):

>> It is good to see Microsoft release a fix when they make the same mistake.

    That is in contrast to the enormous number of times when your company refuses to release a fix.  It was good to see this one.

    (2):

>> Someone has to persuade vendors to provide flashable firmware.

    As you already noted….  Network Attached Storage devices aren’t the only ones that need flashable firmware.  I do hope vendors can be persuaded some day to let customers make needed updates.

  50. Anonymous says:

When VS2005 SP1 was released, because of manifests it became more visible which version of the VC++ DLLs an application uses.

Comments are closed.