If control-specific messages belong to the WM_USER range, why are messages like BM_SETCHECK in the system message range?

When I discussed which message numbers belong to whom, you may have noticed that the messages for edit boxes, buttons, list boxes, combo boxes, scroll bars, and static controls go into the system range even though they are control-specific. How did those messages end up there?

They didn’t start out there.

In 16-bit Windows, these control-specific messages were in the control-specific message range, as you would expect.

#define LB_ADDSTRING      (WM_USER + 1)
#define LB_SETSEL         (WM_USER + 6)
#define LB_SETCURSEL      (WM_USER + 7)
#define LB_GETSEL         (WM_USER + 8)
#define LB_GETCURSEL      (WM_USER + 9)
#define LB_GETTEXT        (WM_USER + 10)

Imagine what would have happened had these message numbers been preserved during the transition to Win32.

(Giving you time to exercise your imagination.)

Here’s a hint. Since 16-bit Windows ran all programs in the same address space, programs could do things like this:

char buffer[100];
HWND hwndLB = <a list box that belongs to another process>
SendMessage(hwndLB, LB_GETTEXT, 0, (LPARAM)(LPSTR)buffer);

This reads the text of an item in a list box that belongs to another process. Since processes ran in the same address space, the address of the buffer in the sending process is valid in the receiving process, so that when the receiving list box copies the result to the buffer, it all works.

Now go back and imagine what would have happened had these message numbers been preserved during the transition to Win32.

(Giving you time to exercise your imagination.)

Consider a 32-bit program that does exactly the same thing that the code fragment above does. The code probably was simply left unchanged when the program was ported from 16-bit to 32-bit code, since it doesn’t generate any compiler warnings and therefore does nothing to draw attention to itself as needing special treatment.

But since processes run in separate address spaces in Win32, the program now crashes. Well, more accurately, it crashes that other program, since it is the other program that tries to copy the text into the pointer that it was led to believe was a valid buffer but in fact was a pointer into the wrong address space.
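To make the failure concrete, here is a toy model in portable C. It is only a sketch under loud assumptions: each "process" is a private byte array, a "pointer" is an offset into that array, and the names ToyProcess, toy_listbox_gettext and toy_send_gettext are invented for illustration. Real address spaces are nothing this simple, and a real stray write is more likely to crash or corrupt the receiver than to land harmlessly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model (hypothetical): each "process" owns a private block of
   memory, and an "address" is just an offset into that block. */
#define SPACE_SIZE 64

typedef struct {
    char memory[SPACE_SIZE];
} ToyProcess;

/* The receiving list box handles a toy LB_GETTEXT by copying the item
   text to the address it was given, interpreted in ITS OWN memory. */
void toy_listbox_gettext(ToyProcess *receiver, size_t address, const char *item)
{
    size_t len;
    assert(receiver != NULL && item != NULL);
    len = strlen(item) + 1;              /* include the terminating NUL */
    if (address < SPACE_SIZE && len <= SPACE_SIZE - address) {
        memcpy(&receiver->memory[address], item, len);
    }
}

/* "Sends" the message with a buffer address that is meaningful to the
   sender. Returns 1 if the sender finds the text in its own buffer
   afterwards, 0 otherwise. */
int toy_send_gettext(ToyProcess *sender, ToyProcess *receiver,
                     size_t buffer_address, const char *item)
{
    toy_listbox_gettext(receiver, buffer_address, item);
    return strcmp(&sender->memory[buffer_address], item) == 0;
}
```

Passing the same ToyProcess as both sender and receiver models the 16-bit shared address space, and the sender gets its text. Passing two different ToyProcess objects models Win32: the text lands in the receiver's memory at an address the receiver never set aside for it, and the sender's buffer stays empty.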

Just what you want. A perfectly legitimate program crashes because of somebody else’s bug. If you’re lucky, the programmers will catch this bug during testing, but how will they know what the problem is, since their program doesn’t crash; it’s some other program that crashes! If you’re not lucky, the bug will slip through testing (for example, it might be in a rarely-executed code path), and the experience of the end user is “Microsoft Word crashes randomly. What a piece of junk.” (When in reality, the crash is being caused by some other program entirely.)

To avoid this problem, all the “legacy” messages from the controls built into the window manager were moved into the system message category. That way, when you sent message 0x0189, the window manager knew that it was LB_GETTEXT and could do the parameter marshalling for you. If it had been left in the WM_USER range, the window manager wouldn’t have known what to do when it got message 0x040A, since that might be LB_GETTEXT, or it might be TTM_HITTESTA or TBM_SETSEL or any of a number of other control-specific messages.
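The decision the window manager faces can be sketched in a few lines of portable C. This is an illustration, not the window manager's actual code: the numeric values come from the text above, but the macro names with the _16, _32 and _VALUE suffixes, the MarshalKind enum, and the functions are all hypothetical, and treating every other system message as needing no marshalling is a simplification.

```c
#include <assert.h>
#include <stdio.h>

#define WM_USER 0x0400u

/* Values taken from the text; the macro names are made up here so the
   sketch does not depend on the Windows headers. */
#define LB_GETTEXT_16       (WM_USER + 10)  /* 16-bit definition: 0x040A */
#define LB_GETTEXT_32       0x0189u         /* Win32 value, system range */
#define TTM_HITTESTA_VALUE  (WM_USER + 10)  /* same number, other control */
#define TBM_SETSEL_VALUE    (WM_USER + 10)  /* same number, other control */

typedef enum {
    MARSHAL_NONE,        /* simplification: nothing to copy */
    MARSHAL_OUT_STRING,  /* lParam is a buffer the receiver fills in */
    MARSHAL_UNKNOWN      /* meaning depends on the window class */
} MarshalKind;

/* System-defined messages lie below WM_USER. */
int is_system_message(unsigned msg)
{
    return msg < WM_USER;
}

/* Hypothetical lookup: only a message with one global meaning can be
   marshalled automatically. */
MarshalKind toy_marshal_kind(unsigned msg)
{
    if (!is_system_message(msg)) {
        return MARSHAL_UNKNOWN;
    }
    switch (msg) {
    case LB_GETTEXT_32:
        return MARSHAL_OUT_STRING;
    default:
        return MARSHAL_NONE;
    }
}

/* Small demonstration of the collision described in the text. */
void toy_demo(void)
{
    assert(LB_GETTEXT_16 == TTM_HITTESTA_VALUE);
    assert(LB_GETTEXT_16 == TBM_SETSEL_VALUE);
    printf("0x%04X -> kind %d\n", LB_GETTEXT_32, (int)toy_marshal_kind(LB_GETTEXT_32));
    printf("0x%04X -> kind %d\n", LB_GETTEXT_16, (int)toy_marshal_kind(LB_GETTEXT_16));
}
```

The point of the sketch is that the lookup is keyed on the message number alone. A number in the system range has exactly one meaning, so a marshalling rule can be attached to it; a number in the WM_USER range means whatever the target window class says it means.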

Theoretically, this move was necessary only for legacy messages; i.e., window messages that existed in 16-bit Windows. (Note that Windows 95 added some new 16-bit messages, so this remapping had to continue at least through Windows NT 4 with the shell update release.) Nevertheless, the window manager team added the *_GET*INFO messages in the system message range even though there was no need to put them there from a compatibility standpoint. My suspicion is that it was done to make things easier for accessibility tools.

Note however that placing new messages in the system message range is more the exception than the rule for the edit box and other “core” controls. For example, the new message EM_SETCUEBANNER has the numeric value 0x1501, which is well into the WM_USER range. If you try to send this message across processes without taking the necessary precautions, you will crash the target process.

(Note: Standard disclaimers apply. I won’t bother repeating this disclaimer on future articles.)

Comments (9)
  1. joel says:

    While I’d have to do some reading up on this (so take it with a grain of salt), my understanding is that the toolkits do provide ways to send messages beyond the basic events and requests that you get with the core X protocol.  Some are very high level, like KDE’s DCOP, through which applications publish an interface which can be invoked either through an API or through a command line program (or even a GUI one, which is kinda fun ;).  On the lower level, which is where I get more shady, I’m sure there is a way, because when you, e.g., change themes, somehow the toolkit is able to notify all of its applications about the theme change.  And then there’s the ICCCM (Inter-Client Communication Conventions Manual), which specifies how clients are to behave in an X environment and also how they communicate with each other, often using ATOMs set on Windows.  That’s how the Window Manager knows what the title of a window is and whether it should stay on top, etc.

  2. Nawak says:

    This makes me wonder… are there Window Managers on other platforms that make marshalling of user messages possible? For instance, the EM_SETCUEBANNER *in the receiving process* would be marked as "LPARAM is a buffer that should be copied" so that these bugs are more easily avoided.

    I don’t know if the need would justify the cost since it is not a security problem (hatchways and all)

    Are all Window Managers from the same mold with the same gotchas?

  3. BryanK says:

    Not sure about all other platforms, but X in particular doesn’t have a separate notion of a "tooltip".  All X cares about is a "window"; a tooltip is a construction of the X toolkit you’re using (e.g. gtk or qt).  It’s implemented with a separate X window, of course, but all the details are taken care of by the toolkit.

    As for marshalling user messages: X does that for *everything* already, not just messages that copy data.  (X is network-transparent, so it has to marshal everything.)  The low-level X library obviously shares data directly with the user program, but when talking to the X server (and via the X server to other programs, e.g. for a paste), it uses a network message.  So it’s already copying all the required data over the socket to the server.  (Or it’s using the MIT-SHM shared-memory extension.  Either way, it has to copy the data into some buffer where it gets either copied to or shared with the X server.)

    So no, not all window managers are from the same mold.  Some had explicit marshalling requirements built in from the beginning.  (I’d guess that’s because X was designed on an OS that always required a separation between processes.  There was never a way for processes to accidentally share memory (barring a bug in the OS), so this type of message-send operation where the target could copy directly into the source’s buffer was never used.)

  4. Anon says:


    "Unfortunately Windows doesn’t always know the size of the message so it can’t do automatic marshalling the way X does.  Design flaw?  Yes, but without the handy time machine not much can be done at this point (except coming up with something new)"

    You know I wonder if this sort of thing is inevitable in popular standards.

    E.g. Windows 3.x was popular because it could run well on underpowered systems. E.g. 8086 and 80286 systems with not much memory, dire graphics capabilities and so on.

    And it does that because it exploits lots of features that underpowered systems lack. It would have been possible to do a better windowing system (and X isn’t it) that would have been more scalable, but that windowing system would have crawled on the underpowered systems where it would have competed with Windows.

    And the hackish roots of Windows don’t seem to have limited it in the long run, as this example of adding marshalling for control-specific messages for Win32 shows. So you’re better off getting a hack out of the door and gaining market share than worrying too much about long-term scalability. Because market share means money and the ability to hire smart people to clean things up later.

    Actually, the IBM PC standard was originally a short-term ugly hack, even if x64, PCI Express, SMP and so on have made it a killer workstation platform these days. To the point where companies that used to make much more elegant machines back in the PC-AT days have adopted it instead.

  5. Miral says:

    @BryanK: Unfortunately Windows doesn’t always know the size of the message so it can’t do automatic marshalling the way X does.  Design flaw?  Yes, but without the handy time machine not much can be done at this point (except coming up with something new).

    @Raymond: Maybe I just haven’t poked into enough corners of the Win32 API yet, but what *is* the recommended way of passing cross-process messages like this?  I’d imagine you’d have to get your process to allocate some memory that’s visible to both processes (perhaps with different pointers, although ideally with the same ones), and I’m not sure how you could do that without the cooperation of the other process.

  6. kopi says:

    Miral, I believe WM_COPYDATA should be used to send messages (including additional user defined data) between processes.

  7. Marc K says:

    @Miral: If you really need to send a message specifying a buffer to a window owned by another process, you can allocate a buffer inside that process using VirtualAllocEx (and specify the default value with WriteProcessMemory), then read the result using ReadProcessMemory and clean up the buffer in the remote process with VirtualFreeEx.

    For win9x, just create and map a memory mapped file.  It’ll be placed in the section of address space that is shared by all processes.
