A history of GlobalLock, part 1: The early years


Once upon a time, there was Windows 1.0. This was truly The Before Time. 640K. Segments. Near and far pointers. No virtual memory. Co-operative multitasking.

Since there was no virtual memory, swapping had to be done with the co-operation of the application. When there was an attempt to allocate memory (either for code or data) and insufficient contiguous memory was available, the memory manager had to perform a process called "compaction" to make the desired amount of contiguous memory available.

  • Code segments could be discarded completely, since they could be reloaded from the original EXE. (No virtual memory - there is no such thing as "paged out".) Discarding code required extra work to make sure that the next time the code got called, it was reloaded from the EXE. How this was done is not relevant here, although it was quite a complicated process in and of itself.
  • Memory containing code could be moved around, and references to the old address were patched up to refer to the new address. This was also a complicated process not relevant here.
  • Memory containing data could be moved around, but references to the old addresses were not patched up. It was the application's job to protect against its memory moving out from under it if it had a cached pointer to that memory.
  • Memory that was locked or fixed (or a third category, "wired" -- let's not get into that) would never be moved.

When you allocated memory via GlobalAlloc(), you first had to decide whether you wanted "moveable" memory (memory which could be shuffled around by the memory manager) or "fixed" memory (memory which was immune from motion). Conceptually, a "fixed" memory block was like a moveable block that was permanently locked.

Applications were strongly discouraged from allocating fixed memory because it gummed up the memory manager. (Think of it as the memory equivalent of an immovable disk block faced by a defragmenter.)
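To see why a fixed block gums things up, here is a toy model of compaction. It is only a sketch: ToyBlock, toy_compact, and the 16-unit arena are made-up names and numbers for illustration, not anything from the real Windows memory manager, and the model only tracks offsets rather than copying data. It also assumes blocks are sorted by address and never reordered, which a real compactor need not do.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Windows internals. */
enum { ARENA_SIZE = 16 };

typedef struct {
    size_t offset;   /* current start position inside the arena */
    size_t size;     /* length of the block */
    int    pinned;   /* nonzero if fixed or locked: must not move */
} ToyBlock;

/* Slide every unpinned block toward offset 0, keeping address order.
   Blocks must be sorted by offset and must not overlap on entry.
   Returns the size of the largest free gap after compaction. */
size_t toy_compact(ToyBlock *blocks, size_t count)
{
    size_t next_free = 0;     /* lowest offset not yet claimed */
    size_t largest_gap = 0;

    for (size_t i = 0; i < count; i++) {
        assert(blocks[i].offset >= next_free);  /* sorted, no overlap */
        if (blocks[i].pinned) {
            /* A pinned block stays put; the gap below it is stranded. */
            size_t gap = blocks[i].offset - next_free;
            if (gap > largest_gap) largest_gap = gap;
        } else {
            blocks[i].offset = next_free;       /* slide down */
        }
        next_free = blocks[i].offset + blocks[i].size;
    }

    assert(next_free <= ARENA_SIZE);
    size_t tail = ARENA_SIZE - next_free;       /* free space at the top */
    if (tail > largest_gap) largest_gap = tail;
    return largest_gap;
}
```

With three 2-unit blocks at offsets 2, 6, and 12, compaction yields a largest free gap of 10 units when nothing is pinned, but only 6 units when the middle block is pinned, because the space below the pinned block cannot be merged with the space above it.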

The return value of GlobalAlloc() was a handle to a global memory block, or an HGLOBAL. This value was useless by itself. You had to call GlobalLock() to convert this HGLOBAL into a pointer that you could use.

GlobalLock() did a few things:

  • It forced the memory to be present (if it had been discarded). Other memory blocks might need to be discarded or moved around to make room for the memory block being locked.
  • If the memory block was "moveable", then it also incremented the "lock count" on the memory block, thus preventing the memory manager from moving the memory block during compaction. (Lock counts on "fixed" memory aren't necessary because fixed blocks can't be moved anyway.)
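The handle-versus-pointer split and the lock count can be sketched with a toy handle table. Everything here (ToyHandle, toy_lock, toy_unlock, toy_try_move, the 64-byte arena) is a hypothetical stand-in invented for illustration. It mimics the spirit of GlobalLock and GlobalUnlock but is not how Windows implemented them, and it ignores discarding entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not the real Windows structures. */
enum { TOY_ARENA = 64 };

typedef struct {
    size_t offset;      /* where the block currently lives in the arena */
    size_t size;
    int    lock_count;  /* nonzero means "do not move me" */
} ToyHandle;

static unsigned char toy_arena[TOY_ARENA];

/* Like GlobalLock in spirit: bump the lock count and hand out a pointer. */
void *toy_lock(ToyHandle *h)
{
    h->lock_count++;
    return toy_arena + h->offset;
}

/* Like GlobalUnlock in spirit: drop one lock. */
void toy_unlock(ToyHandle *h)
{
    assert(h->lock_count > 0);
    h->lock_count--;
}

/* What a memory manager might do during compaction: relocate the block
   unless it is locked. Returns 1 if the block moved, 0 if it was locked. */
int toy_try_move(ToyHandle *h, size_t new_offset)
{
    assert(new_offset + h->size <= TOY_ARENA);
    if (h->lock_count > 0)
        return 0;                       /* outstanding pointers exist */
    memmove(toy_arena + new_offset, toy_arena + h->offset, h->size);
    h->offset = new_offset;             /* only the bookkeeping changes */
    return 1;
}
```

In this model, a move attempted while the block is locked is refused. After the matching toy_unlock the move succeeds, and the next toy_lock returns a different address holding the same bytes. A pointer saved from the first lock would now be stale, which is the hazard the handle indirection exists to manage.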

Applications were encouraged to keep global memory blocks locked only as long as necessary in order to avoid fragmenting the heap. Pointers to unlocked moveable memory were forbidden since even the slightest breath -- like calling a function that happened to have been discarded -- would cause a compaction and invalidate the pointer.

Okay, so how did this all interact with GlobalReAlloc()?

It depends on how the memory was allocated and what its lock state was.

If the memory was allocated as "moveable" and it wasn't locked, then the memory manager was allowed to find a new home for the memory elsewhere in the system and update its bookkeeping so the next time somebody called GlobalLock(), they got a pointer to the new location.

If the memory was allocated as "moveable" but it was locked, or if the memory was allocated as "fixed", then the memory manager could only resize it in place. It couldn't move the memory either because (if moveable and locked) there were still outstanding pointers to it, as evidenced by the nonzero lock count, or (if fixed) fixed memory was allocated on the assumption that it would never move.

If the memory was allocated as "moveable" and was locked, or if it was allocated as "fixed", then you could pass the GMEM_MOVEABLE flag to override the "may only resize in place" behavior, in which case the memory manager would attempt to move the memory if necessary. Passing the GMEM_MOVEABLE flag meant, "No, really, I know that according to the rules, you can't move the memory, but I want you to move it anyway. I promise to take the responsibility of updating all pointers to the old location to point to the new location."
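The three cases above boil down to one small decision rule. The sketch below restates it as code, using hypothetical names (ToyKind, ToyBlockState, toy_realloc_may_move) invented for illustration. It captures only the "may the block move?" question as described here, not the actual GlobalReAlloc implementation.

```c
#include <assert.h>

/* Toy model only: hypothetical names mirroring the rules described above. */
typedef enum { TOY_FIXED, TOY_MOVEABLE } ToyKind;

typedef struct {
    ToyKind kind;         /* how the block was originally allocated */
    int     lock_count;   /* only meaningful for moveable blocks */
} ToyBlockState;

/* Returns 1 if a resize is allowed to relocate the block, 0 if it must
   happen in place. caller_passed_moveable models the caller passing
   GMEM_MOVEABLE to the reallocation call. */
int toy_realloc_may_move(ToyBlockState s, int caller_passed_moveable)
{
    assert(s.lock_count >= 0);
    if (s.kind == TOY_MOVEABLE && s.lock_count == 0)
        return 1;   /* unlocked moveable: free to find a new home */
    /* Locked or fixed: in place only, unless the caller overrides and
       takes responsibility for fixing up its own pointers. */
    return caller_passed_moveable != 0;
}
```

Under this rule an unlocked moveable block may always move, while a locked moveable block or a fixed block may move only when the caller passes the override.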

(Raymond actually remembers using Windows 1.0. Fortunately, the therapy sessions have helped tremendously.)

Next time, the advent of selectors.

Comments (33)
  1. ac says:

    This was before my time, so I am curious as to how the standard C library functions like malloc worked at this point?

    In fact, could you use the standard C library in Windows applications at this time?

  2. Don’t forget that the GMEM_MOVEABLE semantics for GlobalReAlloc still apply in Windows today – I got burned on this a while ago, I had figured that Windows would ignore the GMEM_Xxx flags, when in fact it doesn’t.

    ac: There was a version of the C library for Windows applications, but I don’t know what it did about malloc/free. I suspect that it allocated fixed blocks of memory.

  3. Cooney says:

    A quick google would suggest that malloc wasn’t present until win32. Depending on the sort of code, std C stuff was likely to work, so long as it operated on fixed blocks.

  4. Did malloc and free work on near pointers? If so, the memory manager could’ve changed the selector behind the scenes; your memory would’ve moved around but you wouldn’t have known it.

    Is it wrong to say I kinda miss this stuff?

    What I really want to know is: are there gonna be similar pitfalls with 32bit/64bit?

  5. As Moishe says, in the old days, malloc returned a near pointer (i.e. to something in your data segment).

    Windows was free to move your data segment around in memory, but you wouldn’t notice.

    There was also an _fmalloc function (or something — I forget the exact name), which allocated GMEM_FIXED memory, but that was generally a bad idea.

    – Roger (who never programmed on Windows 1.0, but does have Windows 3.0 Real Mode flashbacks).

  6. John Elliott says:

    Searching for GlobalWire() in the MSDN library archive gives a few clues about what wired memory is – but why’s it called "wired"?

  7. James Risto says:

    I am thinking about very small memory devices. These techniques could be used again today for programming those? Cool, in a way.

  8. You know, when Win32 was rolling around, I was wondering if we were going to get 32-bit memory handles from GlobalAlloc that could be used either for "smart discardability" like GMEM_DISCARDABLE, or truly huge chunks of memory could be held onto as long as you didn’t try to lock so many that you filled up 32-bit address space.

    GMEM_DISCARDABLE was cool. A slightly clumsy way to implement virtual memory by asking the application to take care of "paging" something back in.

  9. Carlos says:

    Visual C++ 1.52 (the last 16-bit version) had malloc. The memory manager in Windows 3.1 had paging; it was a lot more sophisticated than Windows 1.0.

    When you compiled an app you had to choose a "memory model", which determined whether the default pointer size was near, far or huge (a far pointer accessing > 64K memory). This also controlled what the default malloc returned.

  10. Anonymous Coward says:

    I remember the design target for Windows 1.0 being 512KB (not 640KB), a CGA or Hercules mono card and two floppy drives.

    I only used Windows 1.0 once. In Paint, I went to save a file and the save file dialog was just a text field. I wondered what happened if I just typed lots of junk. After holding down various keys for several minutes, I hit Ok. The hard drive light blinked every few seconds but nothing else happened. After a few minutes I Ctrl-Alt-Del’ed the machine. Most directory entries were missing from the drive. Fortunately it was easy to recover DOS deletions in those days!

    I was an avid user of Windows 286 (a variant of Windows 2.0) and the rest is all history.

    What is also history is that until Windows 3.1, Microsoft pathologically refused to have a standard file selection box. Every single programmer wrote their own. Consequently you never quite knew what abomination of UI design a program would throw up! (That is still the case on Unix today :-)

  11. Nicholas Allen says:

    I think that the early Windows memory manager was cleverer than it needed to be in terms of discarding and moving code. It was unpredictable and fairly mysterious in the way it seemed to behave. An explicit overlay system probably would have been simpler to build and easier to understand. I think that building overlays, or buying a compiler to do it for me, would have been easier than trying to fight Windows.

  12. Raymond Chen says:

    Explicit overlays means that you know which segment to toss out when you need a new one. But in a multitasking environment, how do you know that nobody will ever need the dialog manager and the atom manager at the same time?

  13. Nicholas Allen says:

    Well, multitasking shouldn’t matter in this case since we haven’t gotten to multiprocessor support yet. Two processes may need different working sets, but they don’t need them at the same time.

    I’m not sure how overlays would be different from compaction in that particular case. If we need to load the dialog manager and the atom manager is marked freeable, what stops compaction from tossing the atom manager to make space?

  14. Raymond Chen says:

    Windows 1.0 supported co-operative multitasking, remember.

    Suppose there are three modules, the dialog manager, the atom manager, and the menu manager, and you have only enough memory for two. Which two do you pick? If you are using overlays, you have to decide ahead of time. Under the compaction model, everything gets tossed out, and then the things that are actually used get loaded back in.

  15. Ben Cooke says:

    One thing that is fun is getting hold of a copy of Windows 1.0 from somewhere and running the tools from it on Windows 2000. (or XP, I guess. I’ve not upgraded yet.)

    Most of the applications create windows at the minimum size allowed and require you to drag them out to a sensible size. I assume this is because Windows 1.0 windows were always full-screen, or some other such environment change.

    Windows 2.0 applications, aside from a whole bunch of UI quirks, work just like modern applications, opening in a sensible-sized window and everything.

    I missed Windows 1.0 and Windows 2.0 when they first came out because I was still in Commodore Land. Windows 3.1 was current by the time I got my first PC. It was a 286 so it ran in Standard Mode. That was quite fun while it lasted.

    Computers are pretty boring these days.

  16. Mike Dunn says:

    Ben> It’s funny, from time to time I get the urge to dust off my old C=128D and try some GEOS programming. Back then I was a BASIC guy, no clue how to do GUIs, but now I get that nostalgia feeling and I wonder what I could do with a GEOS app.

    A few things stop me though. My 128’s keyboard got damaged in an earthquake, and I have no GEOS dev tools. :~(

  17. Nicholas Allen says:

    Ok, so let’s take the situation where someone wants to use the dialog, atom, and menu managers. We are extraordinarily low on memory and nothing can be discarded to make all three fit. Only two at a time can be brought into memory.

    The application calls a method in the dialog manager overlay. It’s not in memory so we’ll need to load it. The performance team has done an analysis of the static call graph and decided that the dialog and atom managers are frequently used together. So we bring in the dialog and atom manager overlays.

    Now, the application calls a method in the menu manager overlay. It’s not in memory so we’ll need to load it. Since nothing else is freeable in the system, we’ll have to discard either the dialog manager or atom manager.

    We decide to get rid of the atom manager since it hasn’t been used recently. Then, the application calls a method in the atom manager overlay. It’s not in memory so we’ll need to load it…

    In other words, we end up thrashing. But that’s a problem that hasn’t been solved yet today. We stipulated that there simply wasn’t any way to make the space for all three to be loaded at once. If your working set spills out of your high speed storage space, be it registers to cache, cache to main memory, main memory to disk, disk to archive, you’ll take a performance hit with some access patterns.

  18. Raymond Chen says:

    Yes, if all three are used simultaneously then you’re thrashing. But if say only the dialog manager and menu manager are being used, then you can leave the atom manager discarded and things keep running smoothly.

    Whereas if you used overlays you had to have already decided that the dialog manager and (say) menu manager are overlays of each other. Then a program that uses both will thrash, even if there is room for the menu manager if only it had been overlaid with the atom manager instead.

    Or maybe you’re talking about a different type of overlay where the thing that gets kicked out is determined dynamically? In which case that’s the same as the Windows 1.0 method (except that Windows 1.0 has an extra compaction step as a poor man’s MRU).

  19. Nicholas Allen says:

    Yes, the particular strategy picked for the overlay implementation would have a big impact. I don’t know if overlay schemes that supported dynamic discarding were well known when Windows was originally developed. If not, that’s probably a big mark against using overlays. I know a short while later they were in common use: Turbo Pascal’s overlay system could dynamically discard, backed by a heap buffer (and later an EMS buffer) to avoid hitting the disk. And a few years after that was probably the pinnacle of overlay usage in the DOS version of Nethack.

  20. Raymond Chen says:

    As I noted, the Windows 1.0 method is basically the same as dynamic overlays. (Just with a compaction step thrown in so that data could be tossed out of working set too.)

  21. Chris Becke says:

    My number 1 gripe with any open source or unix application is the atrocious, totally different per-app open/save dialog I’m presented with.

  22. keithmo [exmsft] says:

    If you want a trip down memory lane, check out this page: http://www.kernelthread.com/mac/vpc/win.html. This guy has screenshots of Windows 1.x and 2.x (and others) running in VPC on a Mac.

  23. Cooney says:

    Chris – take a look at Gnome and KDE for a markedly improved X experience.

  24. Btw, malloc() wasn’t always near. It depended on which version of the C runtime library you used:

    Small, medium, large, and huge

    I’m not sure, but I believe that they only had small and medium model versions of the CRT for windows however (which avoids the movable segment problem, since it only would allocate relative to DS)

  25. Ben Cooke says:

    Chris,

    As Cooney kinda half-said, if you standardize on either a KDE or a Gnome environment you’ll find things a lot more consistent. Fortunately, these days almost all apps have a version on each side and some lucky apps have a sensible application where the guts are separated from the UI and so both environments can be catered for.

    Unfortunately, much like many Windows developers, many linuxy developers are lazy and mess all of their UI code in with their "business logic" so making it work on the other side becomes almost as much work as rewriting the entire application.

  26. doynax says:

    Mike Dunn,

    Don’t worry. The CC65 guys are currently working on GEOS support =)

  27. That’s really interesting! I’m looking forward to more posts about Windows 1.0 and the ancient history.

  28. Michael says:

    Tiny model was for DOS COM programs only. They did not have relocation table and all addressing was relative to its load base. You had to be able to squeeze your app into 64K (both code and data). It worked really fast and the programming model was the easiest. Most resident drivers were COM programs.

  29. Michael J Smith says:

    > Btw, malloc() wasn’t always near. It depended

    > on which version of the C runtime library you

    > used: Small, medium, large, and huge

    Don’t forget "Tiny" and "Compact". Tiny required the whole app (including the data) to fit into 64K. It was *very* fast. (Though it might have just been for DOS – I can’t remember if it was available for Windows.)

  30. Michael, you’re right, I forgot about tiny model :)

    Alexey, ancient? We’re only talking 10 years here.

  31. "Alexey, ancient? We’re only talking 10 years here."

    The problem however is that you have to count years with respect to Moore’s law in order to make the transition from human years to computer years. :)

    So, for every 1.5 human years, computer power doubles. Therefore, in computer years, 15 years have actually passed. :)

    James

  32. foxyshadis says:

    No, you can’t mix apples and oranges that way. You have to figure that a computing generation occurs every 3-4 years, compared to the ~30 years for human generations. Operating systems evolve more slowly, major generations about every 5-6 years or so. So if you divide 30 years by the smaller tech time, you come out with the multiplier you use for comparing human time to perceived computer time. Almost a century ago in hardware terms and half a century ago in software terms. Man, no wonder I remember it so dimly. ^_~

  33. Not bad, foxy, not bad at all. :)

Comments are closed.
