Memory allocation functions can give you more memory than you ask for, and you are welcome to use the freebies too, but watch out for the free lunch

Memory allocation functions like Heap­Alloc, Global­Alloc, Local­Alloc, and Co­Task­Mem­Alloc all have the property that they can return more memory than you requested. For example, if you ask for 13 bytes, you may very well get a pointer to 16 bytes. The corresponding Xxx­Size functions return the actual size of the memory block, and you are welcome to use all the memory in the block up to the actual size (even the bytes beyond the ones you requested). But watch out for the free lunch.

Consider the following code:

BYTE *GetSomeZeroBytes(SIZE_T size)
{
 BYTE *bytes = (BYTE*)HeapAlloc(GetProcessHeap(), 0, size);
 if (bytes) ZeroMemory(bytes, size);
 return bytes;
}

So far so good. We allocate some memory, and then fill it with zeroes. That gives us our zero-initialized memory.

Or does it?

BYTE *bytes = GetSomeZeroBytes(13);
SIZE_T actualSize = HeapSize(GetProcessHeap(), 0, bytes);
for (SIZE_T i = 0; i < actualSize; i++) {
 assert(bytes[i] == 0); // assertion fires!?
}

When you ask the heap manager for 13 bytes, it's probably going to round that up to 16, and when you call Heap­Size, it may very well say, "Hey, I gave you three extra bytes. Don't need to thank me."

The problem comes when you try to reallocate the memory:

BYTE *ReallocAndZero(BYTE *bytes, SIZE_T newSize)
{
 return (BYTE*)HeapReAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY,
                           bytes, newSize);
}

Here, you said, "Dear heap manager, please make this memory block bigger, and zero out the new bytes. Kthxbai." And, assuming the heap manager was successful, you will indeed have a larger memory block, and the new bytes will have been zeroed out.

But the memory manager won't zero out the three bonus bytes it gave you when you called Heap­Alloc, because those bytes aren't new. In fact, the heap manager assumes that you knew about those three extra bytes and were actively using them, and it would be rude to zero out those bytes behind your back.

Those are exactly the bytes you didn't know about, because you never checked.

You might think the problem is that you mixed zero-allocation modes. You allocated the memory as "Go ahead and give me garbage, I'll zero it out myself", and then you reallocated it as "Can you zero it out for me?" The problem is that you and the heap manager disagree on how big it is. While you assume that the size of it is "the exact number of bytes I asked for", the heap manager assumes that the size of it is "the exact number of bytes I gave you." Those bytes in the middle fall through the cracks.

Therefore, you might try to fix it by changing your function like this:

BYTE *ReallocAndZero(BYTE *bytes, SIZE_T newSize)
{
 SIZE_T oldSize = HeapSize(GetProcessHeap(), 0, bytes);
 BYTE *newBytes = (BYTE*)HeapReAlloc(GetProcessHeap(), 0,
                                     bytes, newSize);
 if (newBytes && newSize > oldSize) {
  ZeroMemory(newBytes + oldSize, newSize - oldSize);
 }
 return newBytes;
}

But this doesn't work, for the reason we gave above: Your call to Heap­Size returns the actual block size, not the requested size. You will therefore still forget to zero out those three bytes you didn't know about.

The real problem is in the Get­Some­Zero­Bytes function. It decided to manually zero out the bytes it received, but it zeroed out only the bytes that were requested, not the actual bytes received.

One solution is to make sure to zero out everything, so that if it is reallocated, the extra bytes gained in the reallocation will also be zero.

BYTE *GetSomeZeroBytes(SIZE_T size)
{
 BYTE *bytes = (BYTE*)HeapAlloc(GetProcessHeap(), 0, size);
 if (bytes) ZeroMemory(bytes,
                       HeapSize(GetProcessHeap(), 0, bytes));
 return bytes;
}

Another solution is to take advantage of the HEAP_ZERO_MEMORY flag, which tells the heap manager to zero out the entire block of memory when it is allocated:

BYTE *GetSomeZeroBytes(SIZE_T size)
{
 return (BYTE*)HeapAlloc(GetProcessHeap(),
                         HEAP_ZERO_MEMORY, size);
}

… and to use the same flag when reallocating:

BYTE *ReallocAndZero(BYTE *bytes, SIZE_T newSize)
{
 return (BYTE*)HeapReAlloc(GetProcessHeap(),
                           HEAP_ZERO_MEMORY, bytes, newSize);
}

Most of the heap functions let you specify that you want the heap manager to zero out the memory for you, and that includes the bonus bytes. For example, you can use GMEM_ZERO­INIT with the Global­Alloc family of functions, and LMEM_ZERO­INIT with the Local­Alloc family of functions. The annoying one is Co­Task­Mem­Alloc, since it does not provide a flag for zero-allocation. You have to zero out the memory yourself, and you have to do it right. (The inspiration for today's article was a bug caused by not zeroing out the memory correctly.)
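For Co­Task­Mem­Alloc, "doing it right" means zeroing the actual block size, not just the requested size. One hedged sketch (the helper name is our invention, not a Windows API): IMalloc::GetSize, obtained through CoGetMalloc, reports the actual size of a task allocation, so zeroing that many bytes covers any bonus bytes too.

```c
#define COBJMACROS
#include <windows.h>
#include <objbase.h>

// Hypothetical helper: CoTaskMemAlloc with the whole block zeroed,
// bonus bytes included, since there is no HEAP_ZERO_MEMORY-style flag.
void *CoTaskMemAllocZero(SIZE_T size)
{
    void *pv = CoTaskMemAlloc(size);
    if (pv) {
        IMalloc *pMalloc;
        if (SUCCEEDED(CoGetMalloc(1, &pMalloc))) {
            // IMalloc::GetSize returns the actual block size, so this
            // also zeroes any bonus bytes the allocator tacked on.
            ZeroMemory(pv, IMalloc_GetSize(pMalloc, pv));
            IMalloc_Release(pMalloc);
        } else {
            ZeroMemory(pv, size); // fall back to the requested size
        }
    }
    return pv;
}
```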

There are other implications of these bonus bytes. For example, if you use Create­Stream­On­HGlobal to create a stream on an existing HGLOBAL, the function uses Global­Size to determine the size of the stream it should create. And that value includes the bonus bytes, even though you may not have realized that they were there. Result: You create a stream of 13 bytes, but somebody who tries to read from it will get 16 bytes. You need to make sure that the code which reads from the stream won't get upset by those extra bytes. (For example, if you passed it to a function that concatenates streams, you just inserted three bytes of garbage between the streams.) You also need to be careful that those extra bytes don't leak any sensitive information if you, say, put the memory block on the clipboard for everyone to see.

Bonus chatter: It appears that at some point, the kernel folks decided that these "bonus bytes" were more hassle than they were worth, and now they spend extra effort remembering not only the actual size of the memory block but also the requested size. When you ask, "How big is this memory block?" they lie and return the requested size rather than the actual size. In other words, the free bonus bytes are no longer exposed to applications by the kernel heap functions. Note, however, that this behavior is not contractual; future versions of Windows may start handing out free bonus bytes again. Note also that not all heap managers have done the extra work to remember the requested size, and they will continue to hand out bonus bytes. Therefore, you must continue to code defensively and assume that bonus bytes may exist (even if they usually don't). (And note that heap debugging tools may intentionally generate "bonus bytes" to help flush out bugs.)

Double extra bonus chatter: Note that this gotcha is not specific to Windows.

// resize a block of memory originally allocated by calloc
// and zero out the new bytes
void *crealloc(void *bytes, size_t new_size)
{
 size_t old_size = malloc_size(bytes);
 void *new_bytes = realloc(bytes, new_size);
 if (new_bytes && new_size > old_size) {
  memset((char*)new_bytes + old_size, 0, new_size - old_size);
 }
 return new_bytes;
}

Virtually all heap libraries have bonus bytes.

Comments (44)
  1. Joshua Ganes says:

    I remember reading a Joel on Software article where Joel describes fixing a bug related to this issue:…/fog0000000306.html  Thanks to your description, I think I finally understand what he was talking about.

  2. Jon says:

    What appalling advice today.   If you ask for 13 bytes then you can use 13 bytes.  No more.

    And if Create­Stream­On­HGlobal uses more than the allocated size then the bug is there,  not later on.

  3. tobi says:

    Man, this is nasty API behavior. Very easy to misuse.

  4. Ben Karas says:

    Wow.  I think I have written that mistake before.  I am glad to have usually relied on helper functions as at least they provide a central place to fix things.  But ouch that is a nasty gotcha.

  5. @Adam Rosenfield.

    I was thinking it had a nice Welsh ring to it.

  6. jader3rd says:

    One more reason I prefer managed code over native.

  7. anonymouse says:

    Argument passed to HeapReAlloc is insufficient.

  8. Anonymous Coward says:

    @jader3rd: Why? Is it so terrible to get more bytes than you requested? Or that if you ask for the size of a block, you can use all bytes within that size? Or that you should be consistent when asking for zeroed memory? This is all well-documented and the runtime libraries for managed code contain similar contracts and rules that you have to stick to.

  9. Gabe says:

    Anonymous Coward: My guess is that the answer is "all of the above". That doesn't mean that there's anything terrible about the rules, or even that they're difficult to follow.

    However, every moment spent thinking about memory allocation is a moment that could otherwise be spent thinking about business logic (you know, the reason the program exists in the first place). Writing managed code drastically reduces the cognitive load of housekeeping. Instead of perhaps 5% of code dedicated to resource management in a language like C (where getting it wrong likely means long debug sessions and potential security disasters), a typical C# program might have 0.05% of its code doing resource management.

    Of course, the fact that Raymond has to write posts like this means that maybe there is something wrong with the rules, and that they are probably hard to follow.

  10. Random832 says:

    The clipboard data is an HGlobal. I'm not aware that there is any way to find out how large it is except to call GlobalSize. Have I missed something?

  11. Andrew says:

    @Jon: The system is telling you how much it gave you. Though it would be safe to -not- use those bytes, depending on how you use your memory exactly, it's not like you're in unallocated memory land. The memory manager is in on those extra bytes, and even wants you to know. Just keep some cases around to account for when you get those "bonus bytes" and move on with life.

  12. Mason Wheeler says:

    I've never understood why someone would create a heap allocator that does not contractually zero out all memory it allocates before returning.  That's just asking for all kinds of trouble.

    (And before anyone even thinks about replying and saying "performance," please test your assumption. Write up a simple routine that allocates a GB or two of RAM, then uses a simple FOR loop to zero it.  Time it, and realize that this is a very small amount of time for a very large amount of memory.  And keep in mind that a simple FOR loop is a very naive way to zero memory, and there are optimization tricks that can make it even faster.)

    Second attempt at posting this…

  13. Evan says:

    @Mason: "And before anyone even thinks about replying and saying "performance," please test your assumption. Write up a simple routine that allocates a GB or two of RAM, then uses a simple FOR loop to zero it."

    I'll do one better.

    I've written an LD_PRELOAD library that wraps malloc() and calls memset(). I'm measuring the performance of a program/library I've worked on with (1) no interposition, (2) interposing but not doing anything, and (3) interposing and memsetting the region to 0.

    I actually don't have much of an idea of what I'll find… I would guess that I'll be able to measure a noticeable performance hit though. We'll see.

  14. Adam Rosenfield says:

    @Mason: Suppose a program allocates a large scratch buffer but only ends up using a small portion of that.  If you don't zero out the scratch buffer, then the pages for most of that buffer never get committed and it's just like you allocated a much smaller buffer.  If you do zero it out, you might cause something else to page to disk that wouldn't otherwise, and paging is definitely not fast.

  15. John says:

    @Mason:  Today I would agree with you, but back in the time before time the performance implications were measurable.

  16. Mason Wheeler says:

    @Adam: In that case, your problem isn't the allocator; it's the programmer who went and allocated a buffer that was far larger than he actually needed.

  17. Not a complaint says:

    The blog software now requires scripts from to show comments…

  18. If you give a mouse a cookie…

  19. JM says:

    @Mason: would you care to test *your* assumption on an early C compiler and repeated allocations of small amounts of memory, instead of a big block once? Preferably on a processor where there are no "optimization tricks" that allow for anything faster than a simple loop?

    C sacrifices all manner of things any modern language takes for granted, because it was conceived in an era much different from ours. Although a lot of allocators are from later days, a lot aren't, and even those that are are just following the philosophy. I for one wouldn't bet money on the cost of zeroing out memory being insignificant on an 80386 running Windows 95.

    I am not advocating that returning uninitialized memory is a good idea in modern times, by the way — if nothing else, security concerns should override the performance concern. I'm also not saying anything about the HeapAlloc() way of returning more bytes than you asked for, because my mother always told me I should say nothing if I have nothing nice to say.

  20. cheong00 says:

    @Mason: Reading random bytes from non-zeroed memory allocation is a common way for cryptographic functions to gain "salt" in their algorithm for better randomization. Making the default behaviour to zero out memory "will" break these applications.

  21. Cesar says:

    @cheong00: It is a bad way, since that non-zeroed memory can be quite predictable (computers are deterministic), and what non-predictability it has can mask bugs where the RNG is not being seeded properly (as in the infamous Debian OpenSSL fiasco, where the only source of randomness left was the process ID). It made sense in the old days before we could ask the kernel for good-quality randomness (/dev/random, CryptGenRandom, and similar functions), but nowadays it is plain bad coding.

    @Evan: your memset is prefaulting the pages. Depending on the allocation and access patterns, this could make a difference. Try reading a single byte every 4k (the most common page size) instead of a memset, and see what effect it has. Also try MADV_WILLNEED instead and see what happens.

    It could be the processor caches, but I think prefaulting is a more likely explanation.

  22. Mason Wheeler says:

    @John: I'm not so sure.  Bear in mind that memory and CPU speed tend to increase together.  Back in the day, when it would have taken a long time to zero out a gig or two of memory, computers didn't *have* that much memory, and it wouldn't have taken long to zero out what they did have.

    And now I'm curious as to what the actual effect would have been.  Maybe when I get home I'll have to pull out an ancient Turbo Pascal install and test it out on a DOSBox set up to emulate an 8086…

  23. Evan says:

    OK, so here are my performance numbers, which are… surprising. There are two highly-related programs under test, doing two very different operations on related data structures. I ran each program with two inputs. Each was run in three different configurations: (1) normally, (2) with an LD_PRELOAD library wrapping malloc() but not doing anything, and (3) with the wrapper calling memset() after each allocation. This was done on Linux because I have it conveniently available right now and I'm not sure how to interpose on allocations in Windows.

    The programs themselves are C++. I'm not positive, but I think most allocations are done by the default STL allocators (which apparently call down to malloc eventually, fortunately), usually by things like sets and maps. The allocations tend to be small but not tiny, being in the area of 60 bytes.

    The main number reported below is the median of 5 runs. There was more variation between runs (within a configuration) than I'd expect, so I also give the min and max. They are the 'user' component of what's reported by the time utility, but the amount spent in other components is trivial.

    I don't claim that this program is representative of all, but I definitely did *not* pick it because I thought it would give any particular output.


    Program A, input A:

    1. No interposition: 6.40 sec. (Range: 6.17-7.26)

    2. Null interposition: 7.49 sec. (Range: 6.21-8.01)

    3. Zeroing interposition: 6.49 sec. (Range: 6.36-7.19)

    (This allocates a total of 559 MB in 9.5 million allocations.)

    Program A, input B:

    1. No interposition: 160 sec (Range: 133-166)

    2. Null interposition: 163 sec (Range: 145-173)

    3. Zeroing interposition: 134 sec (Range: 134-166)

    (Total: 6.473 GB across 105 million allocations.)

    Program B, input A:

    1. No interposition: 15.4 sec (Range: 14.7-16.0)

    2. Null interposition: 15.3 sec (Range: 14.8-16.0)

    3. Zeroing interposition: 16.2 sec (Range: 15.3-16.7)

    (Total: 4.120 GB across 55 million allocations.)

    Program B, input B:

    1. No interposition: 19.8 sec (Range: 19.7-20.2)

    2. Null interposition: 19.9 sec (Range: 19.6-21.2)

    3. Zeroing interposition: 20.2 sec (Range: 20.0-20.6)

    (Total: 982 MB across 15 million allocations.)

    So for program A, the version that called memset was actually *faster* than the null interposition configuration. For program B, it was only slightly slower (by 5% in the worse of the two inputs.)

    My guess as to what is going on in program A is that there's some cache effect. The memset could be causing cache lines to be prefetched (due to the nice, predictable linear pattern) which would cause cache misses under the no interposition version.

    If anyone wants the programs and inputs used, I can make them available. If you want to play with this yourself, the wrapper library is at (There are some vestigial headers and stuff from another project I copied it from.) You can uncomment the modifications to num_allocations and size_allocations if you want those statistics, and comment out the call to memset() if you want the null interposition configuration.

  24. No says:

    Raymond, I'm not going to write programs that assume "bonus bytes" will ever be possible again. They were a bad idea to start with, and they're an even worse idea today. We'll never re-enable the feature, and we both know it. I'm not going to waste my time writing code that reacts to cases that won't happen in reality, and you're doing a disservice to your readers suggesting that they waste their own time. You might as well ask them to prepare for a big-endian Windows, or to maintain the elaborate fiction that CloseHandle doesn't work on sockets.

    No. My code assumes that the OS slowly transitions from insane to sane, and I'm not going to bother preparing programs to run on an OS less sane than the present one. If the OS becomes more insane, that's a _bug_.

    [Then make sure never to use the BSD library, or more generally, any other heap library, because if you look closely, they all have bonus bytes. -Raymond]
  25. Evan says:

    @cheong00: "Making the default behaviour to zero out memory "will" break these applications."

    No it won't. What it *will* do is highlight that those programs are already broken.

  26. Evan says:

    By the way, let me justify that statement a bit.

    First, there are already in the field several allocators which zero memory. Windows's fault-tolerant heap does, I think OpenBSD's allocator does (or can), and I think the more researchy DieHard does (or can).

    Second, there are several situations where even if your allocator doesn't normally memset to zero, it can still give you a bunch of zeroes. One obvious one is if that's what was there last, but arguably that's just entropy. But one less obvious one is if you're on Windows and malloc is grabbing actual, fresh memory. When your program requests a new page, it's at least somewhat likely (I'm not sure *how* likely… would be an interesting measure) to get a page of all zeroes. Windows has the zero-page thread which does basically what it sounds like (memsets unmapped physical pages to 0 when the system isn't bothering to do anything else), and will prefer to give you a page which has been zeroed out. Voila; your allocation is all zeros with reasonably high probability.

    (To my knowledge, Linux doesn't do this. Personally, I think it's absurd that OSes will hand out non-zeroed memory to processes. Even Windows will do this if the ZPT can't keep up with demand. I wouldn't be surprised if you could come up with a reliable exploit based on the fact that they don't. In fact, I'd be surprised if you couldn't.)

    Third, even things which are arbitrary are often still deterministic. How much entropy does that uninitialized block of memory have, after all? There's *no way* to know that. It could be completely-deterministically set! And then it's buying you nothing.

    Fourth, the C standard makes no guarantee about what malloc gives you. An implementation is entirely within its rights to zero out the blocks. If you assume that it doesn't, you're not portable. (And as the first point says, this is more than a theoretical concern.)

  27. Neil says:

    [Blog software ate my comment]

    I always thought it was odd of C not to let you know how many bonus bytes you got (realloc seems to need to know).

    I notice that sqlite actually wants to know the number of bonus bytes in advance, although it only seems to need it to avoid computing memory statistics for a realloc which is likely to return the original memory block.

  28. Evan says:

    OK, one more post for me for the night. Hopefully this isn't a double post.

    It looks like I was sort of off base with this claim: "To my knowledge, Linux doesn't do this. Personally, I think it's absurd that OSes will hand out non-zeroed memory to processes. Even Windows will do this if the ZPT can't keep up with demand. I wouldn't be surprised if you could come up with a reliable exploit based on the fact that they don't. In fact, I'd be surprised if you couldn't."

    Linux *does* only give out zeroed pages, which is good. This is definitely true for mmap() (at least the manpage says so anyway), and by experimentation is true for malloc() as well. In my defense, the man page for mmap() explicitly says that memory is initialized to zero because of the security implications. :-) (There is a configuration to turn off the zeroing, but it's not on by default.) I just think it probably does it on allocation, rather than in the background like Windows's ZPT.

    I'm not sure what the story is on Windows. You can definitely call malloc() and get back non-zero memory, but I'm not sure it's coming through from another process unmodified. I was looking at some memory dumps and there are patterns in it that I wouldn't expect if that were the case. I was also unable to read some stale data from another process (which I'd have expected to be able to do), even if I allocated and searched all physical memory.

    @Cesar: "what non-predictability it has can mask bugs where the RNG is not being seeded properly (as in the infamous Debian OpenSSL fiasco, where the only source of randomness left was the process ID)"

    That's sort of accurate, but not quite. What happened was that OpenSSL was using some uninitialized buffers to possibly increase entropy. (However, at no point did they *rely* on that being the case, and their implementation would have been fine with a zeroing malloc.)

    Valgrind whined about this, at a couple calls to a function MD_Update(). One of those calls could be safely removed, and it was actually surrounded by '#ifndef PURITY' for that reason. However, the Debian maintainer removed *both* calls… and the other call could *not* be safely removed. But that's exactly what happened.

    (The reason the second call could not be removed has nothing to do with the fact that it was reading uninitialized memory some of the time.) See for a good description of the problem.

    So it wasn't really that the uninitialized reads were masking a problem, it was that the Debian maintainer went too far in removing the uninitialized reads and removed critical code as well. This removed the uninitialized reads, but also removed the actual randomization. :-)

    @Cesar: "Your memset is prefaulting the pages. Depending on the allocation and access patterns, this could make a difference"

    I'll check out your options too.

    If anyone has ideas of how to interpose on Windows programs too, I can try it there as well. (Detours maybe?)

  29. "I'm not going to bother preparing programs to run on an OS less sane than the present one."

    I'm not looking forward to Windows 8 either.

  30. Deduplicator says:

    "I'm not going to bother preparing programs to run on an OS less sane than the present one."

    So you program by happenstance? Or are you the recognized ultimate authority on sanity? There might be tradeoffs you are unaware of, or which don't touch you because you dedicated your whole computer to running your freecell-clone. Even if there was no tradeoff involved, you expect everyone to be aware of all the ripples a slight optimization/security fix might have somewhere 'obviously' unrelated, which then still works according to contract, maybe even better, but not quite the same? There might be sound reason behind those restrictions.

    By the way, what about covariant return types? Want to forbid that too? Those objects don't react quite the same… Isn't that insane? And that works in VM-languages too.

  31. Cesar says:

    @Evan: Great link. The point I was trying to make is also expressed in it: "Throwing the pid into the entropy pool on each call to RAND_bytes isn't actually helping create entropy, but it does keep the buggy Debian version from being completely deterministic. If it had been completely deterministic, the bug would likely have been noticed much sooner". The same could happen if you add the junk you got in memory from a malloc() call; it is quite possible that this junk has only a few bits of entropy, but these few bits would prevent it from returning the same result every time, thus masking the bug (it would be quite obvious if the same result happened every time).

    And yeah, the Linux kernel will always return a zeroed page when you actually access pages newly allocated from user space. In kernel space, of course, it is a different story; there you have to request a zeroed page explicitly (not unlike what you have to do with malloc versus calloc, or going back to the original subject with HeapAlloc).

  32. Cesar says:

    @Raymond: "Note that this gotcha is not specific to Windows."

    A lot of the things you talk about are not specific to Windows. That is probably why you seem to have so many non-Windows developers following your blog. Even when you talk about a Windows function, the underlying issues often also exist elsewhere.

    But this particular problem is probably less common on Unix-style systems. At least as far as I can see, the POSIX standard does not have a malloc_size() or malloc_usable_size() function. People wanting to be portable will thus avoid depending on these functions.

    A quick search for malloc_usable_size tells me, on the first page, that it is buggy on glibc when MALLOC_CHECK_ (a debugging aid) is enabled, that it does not exist on uclibc, and that it used to exist but does not anymore on Android's libc (all three are different implementations of the standard C library for Linux). So depending on it will only bring you headaches.

[But it does exist in BSD, and BSD is a pretty common baseline. Many other heap implementations expose bonus bytes. If your policy is "do not call functions that expose bonus bytes", then you can add this article to your collection of reasons why. It also means you can tell people who ask "How can I figure out the size of an allocated heap block?" that they're screwed. -Raymond]
  33. voo says:

    @Evan Interesting results for your runs. And yes I'd be interested in your programs, although personally getting the runtimes and not only min-max-avg ranges would be good enough for me too (5 runs per config is better than most people bother, but the usual recommendation is more 20-30; but obviously that takes lots of time). Seems to me your std will probably be pretty high and the differences aren't that big.

  34. No says:

    Raymond, while other heap libraries provide "bonus bytes", none of them is insane enough to make these bonus bytes affect user-visible behavior. That's the insane part, and that's what I refuse to accommodate in my programs.

    [Um, malloc_size exposes the bonus bytes, and once the bonus bytes are exposed, you have this problem. Or is malloc_size just a figment of my imagination? -Raymond]
  35. nobugz says:

    Oh, nice trap!  One optimization too many, but sure, hindsight is 20/20.  Been running into that a lot lately; the C++ compiler committed that sin and it seems hard to un-sin it.  /vm has been quite troublesome again.

  36. Simon Buchan says:

    @Evan: Windows guarantees that freshly accessed private pages are demand-zeroed – the zero page thread is filling the zero page *cache* so demand-zeroed pages can be satisfied faster. I would be horrified if Linux didn't do the same: it's obviously a huge security hole.

    I like the free bytes in theory, but I hardly ever bother with *Size() except as an optimization for implementing (my)vector::reserved() (in toy projects, not production!) It's nice to point out that *ReAlloc() doesn't zero the spare bytes for you, but I *would* lay the blame at not matching how the memory was zeroed, myself.

  37. Jolyon Smith says:

    @Gabe – managed code encourages you to think less about things that sometimes you cannot avoid having to think about.  The overall effect of managed code is therefore to increase the opportunity for accidental error, NOT to reduce it.

  38. Someone says:

    If Raymond is right, then the description of HEAP_ZERO_MEMORY in the HeapRealloc documentation (…/aa366704%28v=vs.85%29.aspx) is misleading or incomplete. The MSDN descriptions of the heap allocation functions all refer only to the size the application requests, so my expectation is that HeapRealloc will zero out any new memory outside the size given before to HeapAlloc() or HeapRealloc(). There is no hint that you are supposed to not "mix" the usage of HEAP_ZERO_MEMORY.

  39. GWO says:

    [Um, malloc_size exposes the bonus bytes, and once the bonus bytes are exposed, you have this problem. Or is malloc_size just a figment of my imagination? -Raymond]

    But the HeapRealloc() behaviour does not use a user-provided size. HeapRealloc() does the equivalent of a HeapSize() when deciding what to zero, and that's wrong. The programmer should only be bitten by the existence of bonus bytes if they've explicitly asked about them, and used the result.  

    If I call malloc_size() and use the result, I'm acknowledging that I want the bonus bytes; if I don't call malloc_size(), the existence of those bonus bytes should be invisible to me. In not honoring that, HeapRealloc() breaks that principle. If I ask for 13 bytes, the allocator can assign me as big a block as it likes, but must behave in a way consistent with having given me 13. HEAP_ZERO_MEMORY does not do that, and the documentation does not mention that "original size" means "size originally allocated" and not "size originally requested". It's a bug. And if OSX offers a reallocation function with the same semantics and documentation, that's a bug too.

    [So you're saying that HeapRealloc should change its behavior depending on whether malloc_size has ever been called on that block? That seems awfully strange. "I have a bug that goes away when I turn on debugging." -Raymond]
  40. Tsss says:

    To use HeapSize() or malloc_size() is a WTF. If some application needs to use the allocation overhead: How will it work when an allocator provides very small overhead, or no overhead at all? What is a good usage of this unpredictable, but small amount of extra memory? Sounds complicated to me, and makes automatic validation of dynamic memory usage hard (after all, the application is expected to access only the requested memory, but not to touch memory beyond that size).

  41. Random832 says:

    @Evan IIRC, what Linux does is it has a "master" zero page, and all allocated pages are born as copy-on-write from that.

  42. Anonymous Coward says:

    In the code

    BYTE *ReallocAndZero(BYTE *bytes, SIZE_T newSize)


    SIZE_T oldSize = HeapSize(GetProcessHeap(), bytes);

    BYTE *newBytes = (BYTE*)HeapReAlloc(bytes, GetProcessHeap(),

                                        0, size);

    if (newBytes && newSize > oldSize) {

     ZeroMemory(newBytes + oldSize, newSize - oldSize);


    return newBytes;


    should size be newSize?

  43. Worf says:

    I believe Windows also gives out zeroed pages. If the ZPT is empty, then allocations manually zero it.

    This is for security and other reasons. However, it's still possible to get a non-zeroed buffer since the page may have been given to you earlier and recycled. So Windows will give a process a new zeroed page. But that page can be scribbled upon by the process long before the code using it checks.

  44. Stefan Kanthak says:


    | I'm not sure what the story is on Windows. You can definitely call malloc()

    | and get back non-zero memory, but I'm not sure it's coming through from

    | another process unmodified.

    malloc() is a routine provided by the C runtime library, not the Win32API.

    For the latter, see…/aa366781.aspx

Comments are closed.