The compatibility constraints of even your internal bookkeeping


The Listview control when placed in report mode has a child header control which it uses to display column header titles. This header control is the property of the listview, but the listview is kind enough to let you retrieve the handle to that header control.

And some programs abuse that kindness.

It so happens that the original listview control did not use the lParam of the header control item for anything. So some programs said, "Well, if you're not using it, then I will!" and stashed their own private data into it.

Then a later version of the listview decided, "Gosh, there's some data I need to keep track of for each header item. Fortunately, since this is my header control, I can stash my data in the lParam of the header item."

And then the application compatibility team takes those two ingredients (the program that stuffs data into the header control and the listview that does the same) to their laboratory, mixes them, and an explosion occurs.

After some forensic analysis, the listview development team figures out what happened and curses that they have to work around yet another program that grovels into internal data structures. The auxiliary data is now stored in some other less convenient place so those programs can continue to run without crashing.
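
For readers who prefer to see the collision in code, here is a minimal sketch in portable C++. It is a toy model, not the real listview or header control: `ToyHeaderItem`, `ToyListView` and the `Storage` switch are all hypothetical names invented for illustration. The only thing borrowed from the story is the idea of a single pointer-sized slot per header item and two parties who both want it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-in for a header item: one pointer-sized slot, in the spirit of
// the lParam of a header item. Hypothetical type, not the Win32 structure.
struct ToyHeaderItem {
    std::intptr_t lParam = 0;
};

// Toy listview that owns its header items and needs per-column state.
class ToyListView {
public:
    // InHeaderSlot: keep the state in the header item's slot (convenient).
    // SideTable:    keep the state somewhere else (less convenient, but
    //               it cannot collide with an app that uses the slot).
    enum class Storage { InHeaderSlot, SideTable };

    ToyListView(std::size_t columns, Storage storage)
        : items_(columns), storage_(storage) {}

    // The "kindness": callers can reach the header items directly.
    ToyHeaderItem& headerItem(std::size_t col) { return items_.at(col); }

    void setColumnState(std::size_t col, std::intptr_t state) {
        if (storage_ == Storage::InHeaderSlot) {
            items_.at(col).lParam = state;  // overwrites whatever was there
        } else {
            sideTable_[col] = state;
        }
    }

    std::intptr_t columnState(std::size_t col) const {
        if (storage_ == Storage::InHeaderSlot) return items_.at(col).lParam;
        auto it = sideTable_.find(col);
        return it == sideTable_.end() ? 0 : it->second;
    }

private:
    std::vector<ToyHeaderItem> items_;
    Storage storage_;
    std::unordered_map<std::size_t, std::intptr_t> sideTable_;
};
```

With `Storage::InHeaderSlot`, an app that writes to `headerItem(0).lParam` and the listview's own `setColumnState(0, ...)` silently overwrite each other. With `Storage::SideTable`, both values survive, at the cost of the listview maintaining an extra table.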

The moral of the story: Even if you change something that nobody should be relying on, there's a decent chance that somebody is relying on it.

(I'm sure there will be the usual chorus of people who will say, "You should've just broken them." What if I told you that one of the programs that does this is a widely used system administration tool? Eh, that probably wouldn't change your mind.)

Comments (51)
  1. What is the system administration tool?

  2. DrPizza says:

    "(I’m sure there will be the usual chorus of people who will say, "You should’ve just broken them." What if I told you that one of the programs that does this is a very popular system administration tool?)"

    Why should that change the answer?

  3. Bdoserror says:

    I have to agree with DrPizza: My naive view (and I do work in the retail software market) is "if it hurts when you do that, don’t do that" — you should "out" these products.

    Unfortunately, I understand that realistically, users don’t think it’s the program’s fault, they blame Windows ("It worked in 98, but when I upgraded to XP it broke. Stupid Windows"). Moral high ground doesn’t do you any good here. This job would be easy if it wasn’t for the customers. :)

  4. Shog9 says:

    IMHO, it’d be one thing if developers were grabbing the header by searching for a child window and then doing Bad Things to it. But when you provide a nice API for getting the thing, it sorta just screams "use me! abuse me!".

  5. Vince says:

    Just curious. If backwards compatibility is truly the all-encompassing goal of Microsoft, why was Netscape-plugin support dropped from IE 5.5? I know it caused a lot of annoyance at the time, and that customers were forced to go out and try to update most of their plugins from various software companies.

  6. Cooney says:

    Was that before or after Judge Jackson’s penalty was overturned?

    My opinion: Microsoft lives by Conan’s 3 rules. Backward compatibility is about not giving your users another reason to look elsewhere.

  7. DeepIce says:

    Unless the system administration tool is from microsoft

  8. Anon says:

    The documentation is inadequate (it doesn’t seem to say what can be done with the header control) and you’ve ignored one of the most basic rules of programming – encapsulation, so are you really surprised at the consequences? Do you expect a prize for this great feat of backwards compatibility?

    There is the argument that people could FindWindow to get the header control, so you might as well give it to them. I still think that a note on the message which said "Although the header control is accessible, applications should not attempt to X, Y or Z it" is necessary to make your complaint valid or excuse your faux shock.

  9. DrPizza says:

    "Unless the system administration tool is from microsoft "

    All the more reason to break it, really, as it’s that much easier for them to fix.

  10. Mike Dunn says:

    C’mon people, you’re blaming MS for not hiding this LPARAM in a header control? Puh-leeze. If they had hid it, you would be whining about how MS (sorry, I mean "M$") is conspiring against other companies to keep them from using those 4 magical bytes (which, of course, make their apps so much better than everyone else’s). Pfft.

    The real blame lies on the programmer who was too lazy to keep his own vector<CMyData>. It’s common sense not to muck around in windows and data structures that you don’t control. Alas, not all programmers exercise common sense…
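
A minimal sketch of the "keep your own vector" approach the comment above describes, in portable C++. The fields of `CMyData` and the `ColumnData` wrapper are hypothetical; the only assumption is that the application adds and removes the columns itself, so it can keep a parallel table in step with them.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-column payload; the fields are made up for illustration.
struct CMyData {
    std::string sortKey;
    bool ascending;
};

// The application owns this table. Entry i holds the data for column i.
class ColumnData {
public:
    // Call wherever the app adds a column; returns the new column's index.
    std::size_t addColumn(CMyData data) {
        columns_.push_back(std::move(data));
        return columns_.size() - 1;
    }

    // Call wherever the app deletes a column, so indices stay in step.
    void removeColumn(std::size_t index) {
        if (index >= columns_.size()) {
            throw std::out_of_range("no such column");
        }
        columns_.erase(columns_.begin() + static_cast<std::ptrdiff_t>(index));
    }

    CMyData& at(std::size_t index) { return columns_.at(index); }
    std::size_t size() const { return columns_.size(); }

private:
    std::vector<CMyData> columns_;
};
```

Nothing here touches the control at all, which is the point: the data lives in a structure the application controls, and the column index is the only link between the two.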

  11. Tim Smith says:

    Give them hell Mike. :)

  12. Raymond Chen says:

    (What Mike Dunn said.)

    Suppose you’re a system administrator. You are evaluating whether to upgrade your 10,000-employee company to Windows XP. Uh-oh, your critical system administration tool doesn’t work any more. What do you think your recommendation is going to be to the company’s management?

  13. Nate says:

    Frankly it seems odd to me that the lParam would be considered internal. I’ve always seen things like LVITEM lParams, GWL_USERDATA and all of those things provided explicitly so user programs could put their stuff in them. The fact that it happens to be in a list view child control doesn’t seem to change anything to me.

    Am I alone in my thinking?

  14. Raymond Chen says:

    The header control is not yours. It belongs to the listview. You can use the handle value to, as Anon points out, detect that a drag notification came from the header control. But you can’t do anything to the header itself – the listview controls it. Don’t add columns directly to the header control, edit item text, etc. Go through the listview for that.

    (Personally, I’m surprised that LVM_GETHEADER exists at all.)

    The LPARAM member in the HDITEM structure is for the header’s owner to use. The header’s owner is listview.

  15. Anonymouse says:

    My opinion: Microsoft lives by Conan’s 3 rules. Backward compatibility is about not giving your users another reason to look elsewhere.

    But what does backwards compatibility have to do with Crushing your enemies, Seeing them driven before you, and Hearing the lamentations of their women? You are talking about Conan, right? :)

  16. EP says:

    Moral of the story:

    If you’re writing a control that might be used by millions of programmers, add a "OwnerData" property to it. Whether it’s the listview, the header, the header columns – whatever.

  17. Raymond Chen says:

    The header does have an OwnerData property (lParam). But the header’s owner is Listview! (Not the administration app.)

  18. EP says:

    "But the header’s owner is Listview! (Not the administration app.)"

    That’s ok. Simply allow me to save/retrieve header specific data through the ListView_* functions. If the control is there – someone will want to associate data with it.

  19. IUnknown says:

    With which function can I read the lparam value that you talk about?

  20. I believe I’ve done this myself. I don’t recall (at the time) that there was anything in the documentation that warned against this specific scenario.

    Essentially, if the ListView team didn’t want people grovelling around with their header control, it shouldn’t have allowed the user a documented way to get at the handle, at least not without a *large* warning. I mean, the lParam is _always_ where I want to stash useful things.

    Instead, it should have had ListView_GetColumnData-like macros which I could have used instead.

  21. Satan's Cursemonger says:

    Hm… I was on the classic Mac OS toolbox team at Apple, and we had similar issues, often with major applications such as those from vendors such as the five-letter one starting with "A". cf "Why We Added The Window and Control Property APIs in Carbon, by the Apple High Level Toolbox team…" We learned that when you expose anything to an app, they will take advantage of it, whether or not they should.

  22. Eric TF Bat says:

    Now of course the real problem is that programmers are using C and its variants to write software. Of course it will blow up in your faces! If you’d only use a real language like LISP, you’d never have this problem…

    Now of course the real problem is that programmers are using C and its variants to write software. Of course it will blow up in your faces! If you’d only use a real language like APL, you’d never have this problem…

    Now of course the real problem is that programmers are using C and its variants to write software. Of course it will blow up in your faces! If you’d only use a real language like Intercal, you’d never have this problem…

    (rinse, repeat)

  23. .dan.g. says:

    sorry if this is the wrong forum raymond, but i’ve been wondering for a while how the listctrl offsets the origin of its client area to take account of the tab control when the tabcontrol also exists in its client area.

    is this just some magic within the listctrl or is there a generic technique that can be applied elsewhere?

    rgds

  24. Raymond Chen says:

    1. There is a suggestion box.

    2. I don’t know what you’re talking about. There is no tab control in the client area of the listview. (I assume that’s what you mean by listctrl?)

  25. Anon says:

    Mike Dunn: I reject the idea that it’s common sense not to have user-provided per-column data in a list control. In fact, I think the purpose of the LPARAM data in shell-provided controls is *exactly* for clients to use, even when exposed through another control. Since there’s no documentation to the contrary and it’s easier and more robust for Microsoft to maintain a separate pile of bits (which they’re probably already doing for the other LVCOLUMN stuff), they should have expected to do that all along.

    We don’t complain that there’s no LPARAM in the LVCOLUMN structure (which supersedes itself, according to the documentation – probably they need an underscore in there, another place where the documentation is lacking quality). We simply assume that we’re intended to use the one we have so readily accessible.

    Your assertion that people would be complaining if the documentation said "The LPARAM data of the header may be used by the list control and should not be modified or interpreted by user code" is as flawed as the rest of your argument. We don’t spend all day whining about the dwReserved’s, or all the other places where we’re asked not to poke, do we?

  26. Anon says:

    It should be mentioned that I don’t write GUI code, but I think I can read the documentation.

    Right now, I’m wondering what the point in the LVM_GETHEADER message is, if all the functionality we’re allowed to use is provided by the LVM_*COLUMN* messages. I’m assuming the intent is to let reasonable applications do reasonable things: e.g., don’t destroy it, do use it for functionality missing in the list control functions, do use it to handle HDN_* stuff.

    Your argument, then, is really about whether using LPARAM data in columns is a reasonable thing… sounds like Raymond’s just given it the blessing.

  27. Anonymous Coward says:

    Various CPUs got bitten by this in the past. Various address bits would be ignored, so various programs used them as tags. As new versions of the CPU came out and memory sizes increased, things crashed.

    So in the CPU world, they now verify the various bits are zero/one as appropriate even if they are not currently used.

    I guess the lesson for API designers is that if you have areas for future expansion, zero them out (or put in 0xdeadbeef or whatever your favourite pattern is) and verify the pattern is still present in every API call, causing an error if not.
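
The "fill it with a pattern and verify on every call" idea from the comment above might look roughly like this. It is a sketch under stated assumptions: `ToyItem`, `kReservedPattern`, `initItem` and `setWidth` are hypothetical names, and the example only covers a field that the API itself has declared reserved.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical structure with space set aside for future versions.
struct ToyItem {
    int width;
    std::uint32_t reserved;  // must stay equal to kReservedPattern
};

// The known value; zero here, but any fixed pattern would do.
constexpr std::uint32_t kReservedPattern = 0;

enum class Status { Ok, ReservedFieldModified };

void initItem(ToyItem& item, int width) {
    item.width = width;
    item.reserved = kReservedPattern;
}

// Every entry point checks the reserved field before doing any work,
// so a caller who stashes data there finds out immediately.
Status setWidth(ToyItem& item, int width) {
    if (item.reserved != kReservedPattern) {
        return Status::ReservedFieldModified;
    }
    item.width = width;
    return Status::Ok;
}
```

The check fails the call and leaves the item untouched. It catches a caller only at the next API call after the field was modified, not at the moment of modification.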

    Unfortunately it is commercially impossible to go back and fix this kind of thing. You end up having to have special modes (eg on Macs to deal with the address bit issues) or just deal with it, as Raymond describes.

    I asked about the ACT in another thread, and I guess it applies here too. Does the ACT detect this? If a program does this, can it still get Windows certified?

  28. Anon says:

    Given how many headaches Microsoft has with compatibility, does it do anything to prevent developers using the APIs in "broken but works" ways? Example, in this case the first version of the Listview control could have checked the lparam hadn’t been changed, and throw an exception if it had. Then it would have been impossible for any developers to abuse this particular part of the API.

    Okay, you can’t do that retroactively, but is this being done for new APIs? This might save you a lot of headaches in five years’ time. If the doc says that certain members are reserved and to be left zero for forwards compatibility, *check* they are zero, so developers have to store their own data in a future-proof way. If the doc says that the order of a list is unspecified, randomise the order so that developers cannot rely on whatever order your internals put them in. That sort of thing. A pain in the short term, but it would produce massive savings in the long term.
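
The "randomise the order" suggestion above can be sketched as follows. `ToyRegistry` is a hypothetical class invented for illustration; the assumption is an API whose documentation promises every element exactly once but says nothing about order.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <string>
#include <utility>
#include <vector>

class ToyRegistry {
public:
    void add(std::string name) { names_.push_back(std::move(name)); }

    // Documented contract: every name exactly once, in unspecified order.
    // Shuffling a copy on each call keeps callers from depending on the
    // insertion order that the internal vector happens to preserve.
    std::vector<std::string> enumerate() {
        std::vector<std::string> result = names_;
        std::shuffle(result.begin(), result.end(), rng_);
        return result;
    }

private:
    std::vector<std::string> names_;
    std::mt19937 rng_{std::random_device{}()};
};
```

A caller that needs a particular order has to sort the result itself, which is exactly the behaviour the documentation asked for in the first place. The cost is a copy and a shuffle on every enumeration.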

  29. Raymond Chen says:

    "… checked that the lparam hadn’t been changed…"

    Are you saying that Listview should keep two copies of all its data and raise an exception if it finds that the two copies ever fall out of sync? (And that still won’t stop people from going in and hacking *both* undocumented members simultaneously, once they find where the second copy is kept.)

    And where do you keep the second copy of things that have only one copy? There is only one GWLP_USERDATA. Where do you keep the backup copy to check whether the first copy has gone bad?

  30. Raymond Chen says:

    "Simply allow me to save/retrieve header specific data through the ListView_* functions"

    Why does there have to be a ListView_* function for something you can do yourself? Create a little array, save your header-specific data there. You added the columns yourself after all.

  31. Raymond Chen says:

    The app compat toolkit can’t catch everything. If your program parties on another window, the app compat toolkit doesn’t know whether it’s doing it with permission or not.

    In the case discussed in this article, the app found a window and started messing with it. Even though the window semantically doesn’t belong to that program directly; it belongs to the listview.

    How would you write an ACT rule to catch apps messing with windows they "shouldn’t be messing with"?

  32. Fox Cutter says:

    The lesson I take from this is to set all my unused but exposed data (assuming I know about them, one would hope that I did) to a well known value, then return an error if that changes.

    It’s not a perfect thing, but it should at least provide two hands of protection.

  33. asdf says:

    .dan.g: They didn’t do anything fancy here, the header is a child of the listview and gets clipped during painting automatically because of WS_CLIPCHILDREN. All you have to do is offset the y-coordinate during the mouse messages and adjust the origin during the paint messages with SetWindowOrgEx/OffsetWindowOrgEx by header.height(). There may be other places they had to touch but those are the main ones most people have to worry about (it’s better if you plan for that ahead of time instead of tacking it on later though).
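
The offsetting described in the comment above boils down to adding or subtracting the header height. Here is a minimal sketch of just that arithmetic in portable C++. The `Point` struct and the three helper functions are hypothetical, and this models the commenter's description rather than the actual listview implementation.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
};

// Client coordinates: (0, 0) is the top-left of the whole control,
// header included. Content coordinates: (0, 0) is the top-left of the
// area just below the header.

// Mouse messages arrive in client coordinates; shift up by the header.
Point clientToContent(Point client, int headerHeight) {
    return Point{client.x, client.y - headerHeight};
}

// Painting goes the other way: content is drawn shifted down by the header.
Point contentToClient(Point content, int headerHeight) {
    return Point{content.x, content.y + headerHeight};
}

// True when a client-coordinate point falls in the strip the header covers.
bool isOverHeader(Point client, int headerHeight) {
    return client.y >= 0 && client.y < headerHeight;
}
```

With a header 20 pixels tall, a click at client `(10, 30)` lands at content `(10, 10)`, and converting back gives `(10, 30)` again.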

  34. Tony says:

    So, now that they worked around the problem, are we actually now allowed to use the LPARAM, i.e., the ListView is no longer the real owner of the header, but rather the app that owns the ListView?

  35. Whatever says:

    The problem is/was the documentation. If something is not marked RFU, must have a value of X, or something like that then it is fair game.

  36. Ben Cooke says:

    Tony,

    Since MS has committed to not overwriting it now (to keep the important admin app running) I see no reason at all why you can’t clobber it to your heart’s content.

  37. Petr Kadlec says:

    Although it really is amusing to read all those stories about "bad guys" who did something wrong and nowadays we have a few megs of code only to catch them, what is the real moral in that?

    The only lesson I have learned is that I can use any dirty trick I want, if my app would be important enough that MS would test it (and then make sure it won’t break). Why should I use ACT if MS will do that for me (and "repair" Windows accordingly)? (And if my app is not important enough, I’ll just post some info about the trick to a newsgroup, someone else’s important app will use it.)

  38. Tim Dawson says:

    As Raymond said, as a programmer, you are adding the columns so you can create yourself a little array, or hashtable, or whatever takes your fancy to associate that little piece of data with each column as you add them.

    I’m sick of people using (or trying hard to use) user interface controls as data storage mechanisms. They’re just not meant for this. My theory is that 99% of the time this comes from laziness. Store all your data in your own structures/classes, and only enough in the presentation layer to refer back to those structures.

  39. Raymond Chen says:

    "Since MS has committed to not overwriting it now (to keep the important admin app running) I see no reason at all why you can’t clobber it to your heart’s content."

    Well except that the compatibility shim probably won’t be applied to your app by default.

  40. IUnknown says:

    Which LPARAM have you been talking about this whole time? With which function can you modify it?

  41. Anonymous Coward says:

    I agree that it would be difficult to write an ACT rule now. This whole example is something that is real easy with hindsight, but would have been really hard to predict, and it happens to everyone’s products and APIs from complex cases like this, to trivial cases like depending on undocumented command line flags.

    However updating ACT to allow rules covering this case would solve the problem :-)

  42. Raymond Chen says:

    Even today, how would you write the rule? Suppose you had it to do over again. What ACT rule would you write that would catch this and yet not fire false positives?

  43. Cooney says:

    re: conan’s 3 rules:

    Crush your enemies, see them driven before you. This means that you do your best to drive your competitors out of business. Pissing off your customers by breaking their toys gives advantage to your enemies that you can avoid. Therefore, app compat is useful in maintaining your position only when a significant number of users use a particular toy.

    As for hearing the lamentation of the women, once you’re sitting on a fat stack of cash, it’s fairly easy to find women. Or something like that.

  44. James Summerlin says:

    Raymond,

    Do you ever get the idea that some people are out to disagree with you ONLY because you work for Microsoft?

    People, get real, some of you are going way the hell out of your way to JUSTIFY CRAPTACULAR PROGRAMMING MENTALITIES!!!

    James

  45. lowercase josh says:

    "Are you saying that Listview should keep two copies of all its data and raise an exception if it finds that the two copies ever fall out of sync?"

    If you’re not using part of an internal data structure, but reserving it for later use, periodically reset it to some arbitrary value. People won’t store data somewhere if it doesn’t stay there. (of course, they’ll curse at Microsoft for not letting them store data there.)

    But then that only applies to this case, and is of no help now. The concept is good though: If you reserve the right to break something later, break it now. Assuming it’s fast enough that nobody else notices.

  46. Raymond Chen says:

    But it’s not an internal data structure. It’s an *external* data structure of the Header control. (The lParam member of the HDITEM structure.) Are you saying that the listview should intentionally go around and reset every single external property of the Header control periodically, just to make sure nobody else uses them?

  47. Anonymous Coward says:

    The only way I can think of to get ACT to work would involve the Listview cooperating (ie marking when it changes the value of the field). Detecting when it has changed is also very hard. You could probably check the value of the field on every call to Listview methods, so you’ll catch changes after they are made, but it requires keeping track of which ones were ever returned etc. Yuck!

    With hindsight, the answer is to zero out/set to known value space reserved for future use and check it is that way in all other relevant calls.

    The other alternative strategy is to name and shame the relevant companies, but that is ultimately unlikely to work either technically or commercially.

  48. Tony says:

    What do I need to do to get my app’s list control into that state? Does it automatically use this (LPARAM) location when using XP Visual Styles, or is it only in certain views at the moment?

    Also, what is it using it for (out of interest)?

    Finally, if I do want access, presumably, I can create my list control without a header, create my own header, and party all over the LPARAM as it’s mine, mine, mine!

  49. foxyshadis says:

    "The problem is/was the documentation. If something is not marked RFU, must have a value of X, or something like that then it is fair game."

    This only applies if developers read the docs, and care – if they just poke around data structures while debugging and find usable space, it’s all over. I once inherited code that used C++ pointer tricks and a pliable compiler to stuff data into crevices of Windows’ data structures that I’d never heard of, most of which had held system or interface data that the coder decided weren’t absolutely required. Crashes and colorful corruptions were quite regular. Naturally, it survived the transition to NT about as well as Poland survived the Nazis.

  50. Don’t run around dorking with somebody else’s stuff.
