What was that story about the WinHelp pen-writing-in-book animation?


The first time you open a WinHelp file, you get this pen-writing-in-book animation while WinHelp does, um, something which it passes off as "preparing Help file for first use" or something similarly vague.

I remember a conversation about that animation. The Windows shell team suggested to the author of WinHelp that the program use the shell common animation control to display that animation. After all, it's a short animation and it met the requirements for the animation common control. But the author of WinHelp rejected the notion.

(Pre-emptive "I can't believe I had to write this": This conversation has been exaggerated for effect.)

"Your animation control is so huge and bloated. I can do it much smaller and faster myself. The RLE animation format generates frames by re-rendering the pixels that have changed, which means that at each frame of the animation, a new pen image would be recorded in the AVI file. The pen cycles through three different orientations at each location, there are ten locations on each row, and there are four rows. If I used an RLE animation, that'd be 3 × 10 × 4 = 120 copies of the pen bitmap. Instead, I have just three pen bitmaps, and I manually draw them at the appropriate location for each frame. Something like this:

// NOTE: For simplicity, I'm ignoring the "turn the page" animation
void DrawFrame(int frame)
{
  // Calculate our position in the animation
  int penframe = frame % 3; // 3 pen images per location
  int column = (frame / 3) % 10; // 10 columns per row
  int row = (frame / 30) % 4; // 4 rows
  int i;
  POINT pt;

  DrawBlankPage(0, 0); // start with a blank sheet of paper

  // Draw the "text" that the pen "wrote" in earlier rows
  for (i = 0; i < row; i++) {
    DrawTextScribble(i, 0, 9);
  }

  // Draw the partially-completed row that the pen is on now
  DrawTextScribble(row, 0, column);

  // Position the pen image so the pen tip hits the "draw" point
  GetTextScribblePoint(column, row, &pt);
  DrawPenBitmap(penBitmaps[penframe], pt.x - 1, pt.y - 5);
}

"See? In just a few lines of code, I have a complete animation. All I needed was the three pen images and a background bitmap showing a book opened to a blank page. This is way more efficient both in terms of memory and execution time than your stupid animation common control. You shell guys could learn a thing or two about programming."

"Okay, fine, don't need to get all defensive about it. We were just making a suggestion, that's all."

Time passes, and Windows 95 is sent off for translation into the however many languages it is localized for. A message comes in from some of the localization teams. It seems that some locales need to change the animation. For example, the Arabic version of Windows needs the pen to write on the left-hand pages, the pen motion should be right to left, and the pages need to flip from left to right. Perhaps the Japanese translators are okay with the pen motion, but they want the pages to flip from left to right.

The localization team contacted the WinHelp author. "We're trying to change the animation, but we can't find the AVI file in the resources. Can you advise us on how we should localize the animation?"

Unfortunately, the WinHelp author had to tell the localization team that the direction of pen motion and the locations of the ink marks were hard-coded into the program. Since the product had already passed code lockdown, there was nothing that could be done. WinHelp shipped with a pen that moved in the wrong direction in some locales.

Moral of the story: There's more to software development than programming for performance. Today we learned about localizability.
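
To make the moral a bit more concrete, here is a minimal sketch of the same frame arithmetic with the layout pulled out into data. This is not WinHelp's actual code. The names ScribbleLayout, ScribblePos, FramesPerPage, and LocateFrame are hypothetical, and the sketch assumes the layout values would come from a localizable resource instead of being compiled in. It covers only the pen position, not the page-flip direction.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout record. The assumption is that a real build would
   load these values from a localizable resource, not compile them in. */
typedef struct {
    int pen_frames;     /* pen images per location (3 in the story) */
    int columns;        /* pen locations per row (10 in the story) */
    int rows;           /* rows per page (4 in the story) */
    bool right_to_left; /* true for locales that write right to left */
} ScribbleLayout;

typedef struct {
    int pen_frame; /* which pen bitmap to draw */
    int column;    /* visual column, counted from the left edge */
    int row;       /* row, counted from the top */
} ScribblePos;

/* Number of frames before the pen position repeats. */
int FramesPerPage(const ScribbleLayout *layout)
{
    return layout->pen_frames * layout->columns * layout->rows;
}

/* Split a frame number into pen image, column, and row. */
ScribblePos LocateFrame(const ScribbleLayout *layout, int frame)
{
    ScribblePos pos;
    int step;

    assert(frame >= 0);
    assert(layout->pen_frames > 0 && layout->columns > 0 && layout->rows > 0);

    step = frame / layout->pen_frames; /* pen locations visited so far */
    pos.pen_frame = frame % layout->pen_frames;
    pos.column = step % layout->columns;
    pos.row = (step / layout->columns) % layout->rows;

    /* Mirror the column so the pen travels from the right edge. */
    if (layout->right_to_left) {
        pos.column = layout->columns - 1 - pos.column;
    }
    return pos;
}
```

With the story's values (3 pen images, 10 columns, 4 rows), frame 37 lands on pen image 1, column 2, row 1. Setting right_to_left moves the same frame to column 7 without touching the drawing code, which is the kind of change a localization team could make in a resource.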

Comments (47)
  1. Mr. Shiny & New says:

    I get the point about the article, but I find it odd that localization code changes aren't made during the localization process. Is this just a case of "The Arabic team wanted it, but not that badly"?

  2. Adam V says:

    @Mr. Shiny & New: I think it's a case of "this is beyond the scope of the changes the localization team is allowed to submit at this point in the process". If someone had brought up the localizability issue earlier in the product lifecycle, the WinHelp author might have had the opportunity to make it more localizable (or scoff at the suggestion).

  3. SimonRev says:

    The intro to the story got me all interested in the "um something" that WinHelp does.  Probably not worth an entire article, but any ideas what that something is?

    [I never did find out what the heck it was doing. -Raymond]
  4. f0dder says:

    Localization is important, but come on – localizing something like this? Don't people have anything better to do? :)

  5. Stephen Eilert says:

    It seems that the issue here wasn't that the code itself couldn't be localized. It was simply an oversight, made far, far worse by company policies.

    "Code lockdown" is like a "pencils down", which makes sense for tests. However, in this case, it seems that exceptions could be made. Such a simple change can be isolated (this code only talks to the GDI) and easily tested (if it is a single function).

    The hard part here seems to be explaining that to management.

  6. JS Bangs says:

    I, for one, am stunned at the fact that Windows attempts to localize animations.

  7. I always assumed it was indexing, or something…

  8. I believe it is creating a .chi file to speed up keyword searches.  Sure, Windows could ship with a pre-generated .chi file for each .chm, but that would take disk space.

    I had a hunch the animation story would be about localization.

  9. Michael says:

    yeah, yet another premature optimization :)

  10. pingpong says:

    @Maurits: the subject of this blog entry is WinHelp, not HtmlHelp.

  11. Robert Morris says:

    @Maurits: This is WinHelp, not HTML help. No CHMs here.

    Back to the original post, I find it funny that they were so concerned about saving a few resources by writing their own animation code. I'm sure the savings were more significant at the time, but by 2010 standards the idea of doing that for something this small is almost laughable–in fact, it is. :)

  12. DWalker says:

    Um, even IF the existing animation control is "bloated", it's ALREADY THERE, as part of Windows, correct?  So Windows 95 as a whole would be a (marginally) smaller distribution if the common control had been called from WinHelp.  Am I right?

  13. anss123 says:

    @DWalker59

    The code would be smaller, but the then required AVI file is likely larger than any savings.

  14. Mike Dunn says:

    I don't know what the "something" is either, but the result is a hidden .gid file in the same dir as the .hlp file.

  15. Marc says:

    It's an age old argument between raw performance and code maintenance.

  16. Marquess says:

    And again we see that correctness beats efficiency when it comes to algorithms.

  17. Henning Makholm says:

    @Eric Lippert: Of course, taking that principle ("more data, less code") to the extreme would mean that the only actual code in the product becomes a tiny interpreter for a home-grown poorly defined macro language in which the entire behavior of the application can be defined as "data". Until someone notices and gets the macros reclassified as "code", at which point the cycle can begin anew.

    [That's why you hire people smart enough to understand the rationale behind the principle so that they don't turn good advice into bad advice. -Raymond]
  18. Dan Bugglin says:

    @SimonRev I am pretty sure it's building a search index (and/or an index file).  I believe there used to be a dialog which allowed you to specify how thorough you wanted the search index to be, and then the animation would come immediately after.

    Screenshot of said dialog: http://www.joelonsoftware.com/…/fog0000000059.html

    The article itself is a good read too, though not related to this discussion.

  19. Joshua says:

    The um something that it is doing is generating a .cnt file which contains a copy of the index for the .hlp file, probably ordered by search word rather than article for fast lookup.

  20. Sunil Joshi says:

    @Robert Synnott @SimonRev

    As I recall, the animation only shows when you try to search for the first time – It says it is preparing the *index* for first use. In fact, I remember being asked what level of indexing I want. That would suggest it was building the index.

  21. Nawak says:

    @Grumpy: I agree. Configuration shouldn't be considered as simple data. Configuration is more like code, because it is data remotely activating parts of code.

    There are shades of complexity and "interference" with the code, therefore there are shades of data hidden cost.

    I put configuration files high on the list of dangerous data, just below the indirect (and often unforeseen) creation of a new programming language.

    I would put the avi file quite lower, considering that they often have a much more restricted "scope" in an application, have a stable format and stable display library. AVI files aren't tentacular like config files are (as you illustrate, Grumpy).

    BTW, Raymond often made the case here that configurability can become a burden in a program's code, test and documentation.

  22. Eric Lippert says:

    Good story. I would draw a more general moral than your conclusion though: the moral of the story is that *code is orders of magnitude more expensive than data*. Developers would do well to remember that what we should actually be optimizing for is *profitable production of things of value*. It is way, way more expensive to develop, test, maintain and update code than it is to do the same to data. More data and less code is almost always the right tradeoff.

  23. MWF says:

    @Stephen Eilert:

    While exceptions can be made for making changes after lockdown, they should be *exceedingly rare*.

    ANY code change at that stage absolutely requires full regression testing. Why? Because even the tiniest change in code can have wide-ranging effects/complications. Just testing the changed function isn't anywhere near sufficient; sure, your function may appear to work just fine, but if your change had a bug that ended up corrupting memory belonging to some other part of the program, you would completely miss this potentially MAJOR problem without full regression testing. This is a significant setback when a team is nearing a completion/ship date.

    Saying that "well, that's just a simple fix" is a slippery slope; you'll end up opening the floodgates to all manner of minor fixes during time-critical stages of development. The concept of "bug bars" (and their particular application during late stages of product development in preparation for deployment) exists for a very, very good reason.

  24. Hmmm… I stand corrected.

  25. Stephen Eilert says:

    @MWF

    Yes, I've had shipping dates changed because bosses wanted to slip in a bunch of "tiny changes", so I can understand why you call it "floodgates".

    This sounds acceptable for an operating system written in a low-level language. This might not always be the case: for software written in a higher-level language that minimizes side effects, one should be able to prove (mathematically, if required) that a function change will not impact the rest of the system. It seems to go with what Raymond says about 'good and bad advice'.

    @Henning Makholm and @Eric Lippert

    Or one could realize that code is data *wink*

    [Mathematical proofs work great if we ran on idealized computers. There are other side effects like memory usage which can be quite subtle. A change to the function causes another function to span two pages, which changes the I/O pattern and introduces a performance regression -or worse- opens a race window that leads to a deadlock. -Raymond]
  26. lefty says:

    @Dan Bugglin:  The JoelOnSoftware article is at least somewhat related to the discussion:

    "Now, some annoying help-index-engine-programmer at Microsoft with an inflated idea of his own importance to the whole scheme of things has the audacity, the chutzpah, to interrupt the user once again and start teaching the user things about making lists (or databases)."

    That help-index-engine-programmer may or may not have been all that, but he *did* optimize the animation. I think we can guess what Joel would have preferred he work on (for whatever that's worth)…

  27. Timothy Byrd says:

    @JS Bangs: "I, for one, am stunned at the fact that Windows attempts to localize animations."

    The splash screen for our main product is an animation (with words), and it gets localized, so I don't find it odd. We also have a number of bitmaps with text that get localized. [Warning – horribly mixed metaphor ahead!] Our localization isn't perfect, but in a way I look at it as taking a pair of jeans cut in an English pattern and tailoring them so they aren't a horrible fit for our Chinese, Japanese and Spanish customers.

    @Stephen Eilert: "Or one could realize that code is data *wink*"

    To take Eric's statement to another level, I think the most expensive thing in the world might be *embedded code*; compared to that, normal code seems a lot more like data.

    Also about "Pencils Down" – it's amazing how one little change can break things somewhere else in the system. For example, in doing the Chinese and Japanese translations, we learned that the installer program we use can't really deal with folders that don't contain at least one English character. It just dumps all the files into a single folder.

    And on changing ship dates, the (anti)pattern here seems to be DeveloperX is working on one last fix that must go in, but DeveloperY is done, and we don't want him sitting idle, so let's slip in one tiny feature. It's wafer-thin.

  28. Grumpy says:

    @Henning Makholm: exactly. I was until recently part of a team working on a relatively large and old app handling large volumes of quite varied non-relational data across many customer installations. The app started with a quite flexible configuration setup which all 3 of us coders understood well, and most customers could be trained to understand the relevant parts too. The installers and supporters could also learn where to check and look for trouble.

    Today it has more than 5000 points of configuration and no customer, installer or supporter can be trained to handle all of it, in many cases not even the parts that they are supposed to fiddle with (customized menus etc.). Many of the configuration points are never used even though the app has grown to be quite a lot larger than it was in the days of yore. We have, in effect, created a very complex and badly structured programming language that only two persons can use, myself not included which is why I'm now doing other stuff – I can't stand the gross inelegance of it any more and I'm not allowed to refactor anything. As a matter of fact, there is no way anyone will ever be able to change anything without rewriting the application from the bottom up so maybe it's a good thing I'm not allowed to refactor it. :-)

    I thus completely concur with Raymond's point. There is such a beast as too much of a good thing. Data can be *very* expensive.

  29. Medinoc says:

    Basically, the performance/size issue was like the difference between raster and vectorial graphics: When it's not too complex, the vectorial works better/takes less space.

  30. Gabe says:

    The code to generate the animation is brilliant — that makes it trivial to create an Arabic or Japanese version. I can't imagine how many hundreds of hours the localization department would have spent trying to recreate the animations with just slight changes. The problem is that the programmer foolishly believed that it was a good idea to ship the "code generator" instead of the "generated code".

  31. Mike Dimmick says:

    The obvious answer would be 'use a simpler animation'. If you really couldn't tolerate ~120 frames, that is. Removing the wiggling pen would have got you down to 40 frames, not far off the other Win95 animations. Reducing the smoothness of the lines could help too – cut the number of columns in half and you're down to 20, plus 4 for the turning pages. These are the sorts of decisions that are very easy to take if the animation is a resource rather than a chunk of code.

    The code's still there even in Windows 7 – the resources it uses are in hhctrl.ocx, which plays the same animation when it indexes an HTML Help file. Thankfully HTML Help no longer asks the stupid question.

    Even though Win95 did ship on floppy disks, it shipped on 13 of them, total capacity about 21.3MB. I don't think this trade-off was worth it – it probably saved under 50KB!

    [50KB is over $50,000! One byte used to cost a dollar. -Raymond]
  32. 640k says:

    If I used an RLE animation…

    fail.

  33. DWalker says:

    The JoelOnSoftware article is a good one, talking about the dialog box that used to come up and ask you, before the indexing started, whether you (the user) wanted to optimize the index one way or another… basically, whether you wanted to index more or less data.  The user generally has no earthly idea what the question is talking about, and doesn't have enough information to make an informed decision anyway.  How much data is indexed for the "Maximize search capabilities" choice?  And how long will the indexing process take if I choose that option?

    Joel's excellent point was echoed by Raymond, I believe, in the series on "Now we will demonstrate our superior intelligence by asking you a question that you don't have enough information to answer".

  34. Marquess says:

    “[50KB is over $50,000! One byte used to cost a dollar. -Raymond]”

    But not in 1995.

    [That was the number we were given in 1994. If you have better data, feel free to share. -Raymond]
  35. Marquess says:

    While the link Raymond provided doesn't help all that much (all I can read is something about Windows/386), I'm guessing it's about “each disk costs 1.4 million dollars, hence each byte costs one dollar.” But this is only true if you fill each and every disk completely. Conversely, just one byte too much would result in another $1.4 million.

    And this calculation becomes completely moot once you get to the CD edition.

    [The calculations were cut off, but it took into account compression and other factors. And all copies of Windows 95 probably consumed more floppies than all copies of Windows 3.1. -Raymond]
  36. DWalker says:

    In case any others are confused, as I used to be, about the terms i18n and l10n: "localization" is the letter l, 10 more letters, and an n.  Internationalization is then obvious from the parallel.  I suppose the terms are too long to type over and over.

    I used to see "l10n" and think "lion?  What?".

  37. Mihai says:

    This is rather internationalization (i18n), not localization (l10n).

    For people to understand the process: usually localization happens toward the end of the process. You can't afford to start translating 30 languages at the very beginning, when things are still unstable and strings/UI can still change a lot.

    So when you have the localization done and the localization QE happens, it is already late.

    The way this is handled is by doing early pseudo-translated builds and testing them. The testers should also be aware of the cultural issues, so that they can detect situations like this one (where things are wrong, but nothing crashes and everything works).

    There are clear "best practices" on how to deal with this. And Microsoft got better at it with time. Look at Windows 7, the first public beta (or was it a CTP?) was in German (to cover most Latin scripts), Japanese (to cover languages using Kanji, needs other fonts, and bigger fonts), Arabic (complex script, right-to-left, requires UI mirroring), Hindi (complex script, Unicode only (does not have an ANSI code page)).

    Localization is not done by software engineers. Or at least not the ones working on the code. The code should be locale-independent and ready to be localized into *any* language. Localization should only touch "data files" (not code).

    This should hopefully clarify "that localization code changes aren't made during the localization process" or "this is beyond the scope of the changes the localization team is allowed to submit at this point in the process"

    Localization does not do code changes. Localization happens at the end, so it is only "at this point in the process" that things can be reported.

    And i18n vs. l10n: it is impossible to decide if a bug is an i18n problem or a l10n one just by looking at it.

    If the string is in the code (hard-coded), then it is an i18n problem. If the string is in a resource file that localization missed, it is a l10n one. Same for corrupted characters and many other kinds of bugs.

    The rule of thumb: if you need code changes, it is i18n, if you need changes in "resources" it is a l10n one.

    (resources: generic term to cover also text files, help, samples, bitmaps, etc.)

  38. Roland says:

    A problem with the animation control starting with Common Controls 6 was that it no longer used a separate thread but always just a timer for the animation. A famous program affected is Outlook.

    Outlook 2003 was the first Outlook version that used Common Controls 6 (for Visual Styles support), and beginning with that version, the animations in Outlook during lengthy operations were no longer real animations. Try emptying the Deleted Items folder in Outlook 2003 or later: You will see just 2-3 frames of the "animation"! This hasn't improved in Outlook 2010 either.

    I never understood why the automatic separate thread was removed from the animation control of Common Controls 6. I guess Raymond knows why…

    [I'm so awesome, I answered your question four years before you asked it. -Raymond]
  39. rs says:

    Presumably, in situations such as these you can't really separate programming from localization: Whatever you use to generate the animation in the first place requires some programming skills (unless you are willing to draw 120 frames for 20+ localized versions by hand), but also, as pointed out, localization skills.

  40. Gabe says:

    Mihai: The German edition isn't tested just to cover most Latin scripts. German is used because it tends to have the longest words, making it obvious when somebody has created a dialog box without enough space for labels, images that can't fit their text, and that sort of thing.

  41. RobertWrayUK says:

    [I'm so awesome, I answered your question four years before you asked it. -Raymond]

    And someone, quite possibly the *same* someone, called Roland, commented on your linked posting saying "For example, when emptying the Deleted Items folder in Outlook 2003, the animation performs poorly and sometimes appears to hang. Even in Explorer, the animations sometimes get stuck before they continue."

  42. Roland says:

    @RobertWrayUK: Funny, you're right! I was in fact the person who commented on the original post by Raymond in 2006. I had forgotten this, and of course also that Raymond had already answered the question in the first place.

    And now you know that the first thing I do when using a new Outlook is emptying the Deleted Items folder to see if the animation issue is still there. And it is! The irony is that with Outlook 97 under Windows 95, we had nice smooth animations on our Pentium 60s. Now with Outlook 2010 under Windows 7 on our quad-cores, the app seems to hang, and the 4th core always begs me: "Don't you have a thread for me to work on? It's so boring in here!" And I always reply: "Be glad and wait until the next Outlook or Common Controls 7 come along, hopefully…"

  43. mikeb says:

    "[I'm so awesome, I answered your question four years before you asked it. -Raymond]"

    I see you finally got your time machine. Soon all will be right with the (Windows) world.

    Unless… "Stupid bug! You go squish now!"

  44. PhilW says:

    It sounds like another case of:

    "More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason – including blind stupidity." – W.A. Wulf

    or:

    "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified" – Donald Knuth

  45. John "Z-Bo" Zabroski says:

    Re-reading my comments, they were indeed clumsy and confusing.  They were not connected to the OP.  Apologies.

  46. Jwrayth says:

    "[I'm so awesome, I answered your question four years before you asked it. -Raymond]"

    And you say you don't have a time machine, eh? ;)

  47. John "Z-Bo" Zabroski says:

    @Eric Lippert

    @Developers would do well to remember that what we should actually be optimizing for is *profitable production of things of value*.

    Parametric polymorphism is poorly understood.

    @Raymond Chen

    @[That's why you hire people smart enough to understand the rationale behind the principle so that they don't turn good advice into bad advice. -Raymond]

    I disagree.  If I were interviewing somebody, I would want to see how effectively they used parametric polymorphism and avoid inheritance.  You hire people who can isolate invariants and encode those in general algorithms.  In this way, those invariants become execution requirements independent of implementation and also independent of use case.  If there is a change to the invariants, then that tells the developer the problem domain information has changed and the model needs to be revisited.  CIL allows both modeling techniques, but the .NET Framework Design Guidelines specifically encourage developers towards using sealed base classes (in other words, maintaining proper object-oriented generalization/specialization relationships between classes).

    Testing for people who "understand the rationale behind the principle" is a second-order effect you can't detect in an interview.  So you can't hire for that.  You should test for people who can demonstrate the principle as a deeply ingrained practice in how they craft a system.  This is like an architect observing an electrician's clean wiring work of an electrical panel and using a snapshot of the work as a "this is how your electrical contractors SHOULD be providing service to you" demonstration for his students.

    [I have no idea what today's topic has to do with parametric polymorphism. Please don't try to explain. -Raymond]
