Why does a corrupted binary sometimes result in "Program too big to fit in memory"?


If you take a program and corrupt the header, or just take a large-ish file that isn’t a program at all and give it a “.exe” extension, then try to run it (Warning: Save your work first!), you will typically get the error “Program too big to fit in memory”. Why such a confusing error message? Why doesn’t it say “Corrupted program”?

Because the program isn’t actually corrupted. Sort of.

A Win32 executable file begins with a so-called “MZ” header, followed by a so-called “PE” header. If the “PE” header cannot be found, then the loader attempts to load the program as a Win16 executable file, which consists of an “MZ” header followed by an “NE” header.

If neither a “PE” nor an “NE” header can be found after the “MZ” header, then the loader attempts to load the program as an MS-DOS relocatable executable. If not even an “MZ” header can be found, then the loader attempts to load the program as an MS-DOS non-relocatable executable (aka “COM format”, since this is the format of CP/M .COM files).

In pictures:

    MZ      PE      Win32
            NE      Win16
            else    MS-DOS relocatable
    else            MS-DOS non-relocatable

Observe that no matter what path you take through the chart, you will always end up at something. There is no exit path that says “Corrupted program”.
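
The fall-through described above can be sketched in a few lines of Python. This is only an illustration, not the actual loader: the function name `classify_executable` is made up for this example, and the sketch assumes the usual layout in which the offset of the “PE” or “NE” header is stored as a 32-bit little-endian value at offset 0x3C of the “MZ” header. It ignores special cases such as the alternate “ZM” signature.

```python
import struct

def classify_executable(data: bytes) -> str:
    """Mirror the chart: every input lands in some bucket."""
    # No "MZ" signature: treated as an MS-DOS non-relocatable (COM) image.
    if data[:2] != b"MZ":
        return "MS-DOS non-relocatable"
    # Assumption: the offset of the new-style header lives at 0x3C.
    if len(data) >= 0x40:
        (new_header_offset,) = struct.unpack_from("<I", data, 0x3C)
        signature = data[new_header_offset:new_header_offset + 4]
        if signature == b"PE\0\0":
            return "Win32"
        if signature[:2] == b"NE":
            return "Win16"
    # "MZ" present but no recognizable new-style header.
    return "MS-DOS relocatable"
```

Note that the function has no branch that reports a corrupted file: a text file renamed to “.exe” simply comes back as an MS-DOS non-relocatable program.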

But where does “Program too big to fit in memory” come from?

If the program header is corrupted, then various fields in the header such as those which specify the amount of memory required by the program will typically be nonsensical values. The loader sees an MS-DOS relocatable program that requires 800KB of conventional memory, and that’s where “Out of memory” comes from.

An MS-DOS non-relocatable program contains no such information about memory requirements. The rule for loading non-relocatable programs is simply to load the program into a single 64KB chunk of memory and set it on its way. Therefore, a program with no “MZ” header but which is larger than 64KB in size won’t fit in the single 64KB chunk and consequently results in an “Out of memory” error.
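
As a rough illustration of both cases, here is a sketch of the size check. The names `mz_memory_needed` and `fits_in_memory` are hypothetical, the 640KB ceiling is an assumption (the memory actually free is always lower), and the arithmetic follows the commonly documented “MZ” header fields (bytes in last page, page count, header paragraphs, minimum extra paragraphs) rather than the loader's real code. The 256 bytes added to each total stand for the Program Segment Prefix that precedes the program in memory.

```python
import struct

PARAGRAPH = 16                    # bytes per paragraph
PAGE = 512                        # bytes per "MZ" page
PSP_SIZE = 256                    # Program Segment Prefix before the program
CONVENTIONAL_LIMIT = 640 * 1024   # assumed ceiling for conventional memory
COM_SEGMENT = 64 * 1024           # a COM program gets a single segment

def mz_memory_needed(header: bytes) -> int:
    """Minimum bytes an MZ program claims to need, per its header fields."""
    last_page_bytes, pages, _relocs, header_paras, min_alloc = \
        struct.unpack_from("<5H", header, 2)
    image = pages * PAGE - header_paras * PARAGRAPH
    if last_page_bytes:
        image -= PAGE - last_page_bytes   # last page is only partly used
    return PSP_SIZE + image + min_alloc * PARAGRAPH

def fits_in_memory(data: bytes) -> bool:
    """True if the program, as described by its own bytes, would fit."""
    if data[:2] == b"MZ" and len(data) >= 12:
        return mz_memory_needed(data) <= CONVENTIONAL_LIMIT
    # No header at all: the whole file must fit in one 64KB segment.
    return PSP_SIZE + len(data) <= COM_SEGMENT
```

With garbage in the header fields, `mz_memory_needed` tends to return an absurdly large number, and a headerless file of 70,000 bytes fails the 64KB test. Either way the only verdict available is "too big", never "corrupted".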

And since people are certain to ask:

  • “MZ” = the legendary Mark Zbikowski.
  • “NE” = “New Executable”, back when Windows was “new”.
  • “PE” = “Portable Executable”, because one of Windows NT’s claims to fame was its portability to architectures other than the x86.
  • “LE” = “Linear Executable”, used by OS/2 and by Windows 95 device drivers.
Comments (44)
  1. Phylyp says:

    Will Vista or one of its successors have an RC honoring you? :)

  2. Nate says:

    Well, shouldn’t the loader identify any such program that requires greater than 64K of memory as a corrupt program? Unless I’m missing something, it isn’t like there isn’t a way to discover this situation before hand.

  3. Mike Dunn says:

    Trivia tidbit: "ZM" is also a valid header for the beginning of the file.

  4. 8 says:

    How about only going that last route if the file ends with ‘.com’? Makes a lot of sense to me, whereas naming an old style 64k program anything other than *.com sounds silly. Who does that?

  5. Jeff Robertson says:

    I assume the "ZM" has something to do with portability to systems with different endian-ness ?

  6. 8 says:

    Yeah, for people running dniWswod or something :D

  7. DavidE says:

    Nate and 8, does this really need a solution? I mean, this seems to be a case where the user has to do something really dumb on purpose to make it happen, or it’s an indication that the user’s system is beginning to destroy itself.

    Sadly, people still have a hard time learning the lesson of putting headers in file formats. How many of us have had to deal with the lack of version information in the first version of a file format?

  8. James MAstros says:

    Is there a back-compat reason for allowing executables named .exe to load as DOS non-relocatables, or allowing .com files, .pif files, etc, to load PE executables?

    This has, in the past, been a factor in the spread of email worms — people (and sometimes antivirus writers) don’t recognize all the combinations of extensions and executable code.

  9. orcmid says:

    Raymond, your RSS feed is on fire, and this is the only way I have to contact you.

    With your postings this morning, your feed is now downloading as a refresh of all posts back to "From Doom to Gloom: … ." They all show the same timestamp in NewsGator Outlook 2.5 and they all have the same content: the URL of the top of the blog with no content whatsoever.

    Fortunately I caught it before allowing these to replace the full-content ones I’ve filed for any of those.

    Larry Osterman and Michael Kaplan’s blogs and a couple of other blogs.msdn.com blogs display the same phenomenon. It only happens if there have been new posts today. Michael says he’s reporting the problem.

  10. Jeff: as I recall, the "ZM" is used when there are multiple sections in the file. I can’t remember if such a file consists of multiple-"ZM" sections followed by a final "MZ" section or vice-versa; I’d have to see if I still have my old reference books in the attic…

  11. oldnewthing says:

    orcmid: The blog server administrators installed a new version of the server software. It reset a lot of stuff.

  12. Nate says:

    DavidE: While this situation is hardly a pressing flaw in Windows, it is something that I have seen happen on occasion, in my case some (weird) malware. It just seems so trivial to do very basic sanity checking to me.

  13. KJK::Hyperion says:

    Hey, wonderful, the new version of the blog software is broken. I get the wrong sender and no link to the full article

  14. Maks Verver says:

    I would say that one of the characteristics of a ‘.COM file’ is that it fits in a 64KB segment. If the binary is larger than that, it cannot possibly be a valid file of this type. Compare this with the relocatable binary: it has the constraint that it must start with an MZ header, otherwise it’s not a valid file. Why can non-relocatable binaries not have a similar constraint on file size?

    The error message suggests there exist systems (presumably with more free memory) that can load the binary in question, while as a matter of fact none can, because the file will never fit in an 8086 64KB segment.

    The diagram should probably have been:

    MZ present: …

    MZ absent:

    Filesize <= 64KB: assume MS-DOS non-relocatable

    Otherwise: Invalid file!

  15. James says:

    Isn’t this begging the question? "Why can’t Windows tell me that the program is corrupted?" "Because Windows doesn’t have code to do that." It doesn’t explain why it’s technically infeasible to add that exit path. (Not that I’m saying it isn’t technically infeasible–I don’t know anything about the structure (or possibly lack thereof) of COM executables–but this article doesn’t really explain.)

  16. Stu says:

    So what happens on Win64, where the old .com format isn’t supported?

  17. CN says:

    James: A COM file has no structure. None at all. The only way to tell that something is wrong is if it’s larger than 64 k.

    BTW, for those who know, is it really 65536 bytes? Isn’t it more like 655280 or something? I have some vague memory of the COM file actually starting at CS:0100, with some kind of process control block (command line/DOS housekeeping/whatever) at CS:0000. That would also be why debug loads any file at 0100, unless it’s a recognizable EXE. What good is a debugger if it doesn’t give you the correct offsets? :-)

  18. CN says:

    Ooops, that should of course be 65 280, not 655 280. (0xFEFF)

  19. CN: you’re right, a COM file is indeed loaded starting at CS:0100; the first 256 bytes are the Program Segment Prefix (PSP). I guess that puts an upper limit on size at 63.75KB.

    The PSP contains, amongst other things, bits and pieces for CP/M compatibility (at CS:0000 you’ll find CD 20 == INT 20h, which is the old "terminate program" call and was superseded by INT21h function 4Ch, with your errorlevel in AL; at CS:0005 is a JMP to the INT 21h handler (?) since CP/M programs used CALL 5 to access the OS). There should be the skeletons of two FCBs in there as well (overlapped; if you only wanted to use one you didn’t need to copy the second one out of the way), plus the complete command line… Been a long time since I needed to think about any of this, though!

  20. Unrequired Name says:

    Well, why doesn’t Windows require a digital signature on the file to verify whether it is valid or not? Stupid Microsoft. Then you could have forced your DRM on us too.

  21. Doug says:

    James MAstros:

    "Is there a back-compat reason for allowing executables named .exe to load as DOS non-relocatables, or allowing .com files, .pif files, etc, to load PE executables?"

    I believe that the Windows 98 COMMAND.COM is actually an "MS-DOS relocatable" (i.e. DOS "exe"). I could be wrong, though… I don’t have a Win98 computer at the moment to check this.

  22. Stu says:

    Doug: Just confirmed, in Windows 95 no less, that COMMAND.COM is an MZ executable.

    However, there still doesn’t appear to be a reason that Windows doesn’t check the file size before assuming that it is a .com type executable.

  23. J. Edward Sanchez says:

    Doug: Yes, Windows 98’s COMMAND.COM is actually a MS-DOS relocatable ("EXE") file.

  24. 8 says:

    Aha, so the BC is with old programs assuming it’s command.COM not command.EXE. That explains the com filesize > 64kB feature, but not the exe-file-without-mz-or-zm-header-loaded-as-com one. What’s the point in that? Are there old programs who do that?

  25. John Elliott says:

    I suppose the MZ header (and 8086 stub) are still present even in versions of Windows for other CPU architectures.

    CN: "What good is a debugger if it doesn’t give you the correct offsets?" Well, according to the Interrupt List, MS-DOS DEBUG sets up the CALL 5 entry point incorrectly, so any DOS program that uses CALL 5 won’t run under DEBUG.

    One other point: If a program starts MZ but is too short to have an EXE header, it’s treated as a COM file (at least by XP). Thus: 4D 5A B2 43 B4 02 CD 21 CD 20

  26. jtas says:

    Just FYI. If you rename a Win32 EXE program to .COM it still runs just fine.

    I’ve seen programs that do this (devenv.com and devenv.exe) so that the .COM file is just a Console Win32 program that, if it parses any non-command line parameters, runs the EXE, a GUI Win32 application. If there are both an ABC.COM and an ABC.EXE in the same directory, the command prompt will execute the .COM, I believe, for compatibility reasons. But it may also have something to do with the PATHEXT environment variable.

  27. Martin says:

    Couldn’t the error message read "Program too big to fit in memory or corrupt"?

    That would then prompt us to check both options instead of scratching our heads.

  28. emmenjay says:

    I have memories of Mr Z. Back in the dark ages I worked on a program that walked the MSDOS heap. From memory (pun not intended) each memory block header began with an "M" except the last block which began with a "Z". He was obviously a busy gentleman. :-)

  29. vince says:

    (Warning: Save your work first!)

    So it’s possible to crash Windows by just renaming a data file and executing it?

  30. So it’s possible to crash Windows by just renaming a data file and executing it?

    Entirely possible depending on what user you are logged in as. Just as a thousand monkeys with typewriters and infinite time will eventually produce the complete works of shakespeare what is in the random file may be instructions to overwrite crucial files, terminate programs etc.

  31. vince says:

    > > So it’s possible to crash Windows by just renaming a data file and executing it?

    > Entirely possible depending on what user you are logged in as. Just as a thousand monkeys with typewriters and infinite time will eventually produce the complete works of shakespeare what is in the random file may be instructions to overwrite crucial files, terminate programs etc.

    Well yes, if the data file magically somehow calls a command that erases all your files, that would be bad. But in that case "Saving all your files first" won’t help.

    As you said, a better suggestion would be to run it as a temporary user, and still that shouldn’t crash the OS or harm anything

  32. Erbo says:

    OS/2 2.0+ (after the MS-IBM split) used an .EXE file format similar to the "LE" format but not identical; it used the signature "LX" for its headers.

  33. Michiel says:

    If you wondered why a .COM file bigger than 64K is still loaded: It’s because the extra bits (that aren’t loaded by the OS) can be loaded on-demand by the program itself. Simply open argv[0], seek to 64Kb and read from there.

    The advantage is that an application could fit in a single file. Install=copy 1 file. Uninstall=delete 1 file.

  34. Dmitri Chatokhine says:

    So the same mistake was made at least twice: calling Windows "NT" for "New Technology", and also this "New Executable". The Old New Thing indeed.

  35. Cheong says:

    [quote]

    How about only going that last route if the file ends with ‘.com’? Makes a lot of sense to me, whereas naming an old style 64k program anything other than *.com sounds silly. Who does that?

    [/quote]

    The ".COM" file extension means nothing to the loader. You can rename "regedit.exe" to "regedit.com" and see that it’ll still run. This is sometimes useful to fool a virus that runs each time you run an ".EXE" file.

    btw, has anyone seen a virus that names itself something like "www.microsoft.com" in the System32 directory to fool people who are lazy enough to just type the domain name without prefixing "http://" to run it from the "Run…" menu?

  36. Neil says:

    I was always under the impression that a position independent executable (i.e. "nonrelocatable") could be > 64K but then it would have to do fancy segment arithmetic to access the rest of memory (it would load into a single memory block consuming available conventional memory, whereas a relocatable EXE only loads into necessary memory). I don’t have a suitable executable handy but I can persuade DEBUG to load a file of over 64K to result in the same memory "layout".

  37. Hiei says:

    aha

    I cannot understand what you said,

    but I really want to leave a message on your blog.

    heihei~

  38. Long, long ago, back in the days when the dominant system was Windows 2000, I taught a Hardening Windows 2000 course. As part of it I showed a few "tricks", including the use of Alternate Data Streams in NTFS. One of the "tricks",
