Psychic debugging: Understanding DDE initiation


You too can use your psychic powers to debug the following problem:

We have a problem with opening documents with our application by double-clicking them in Explorer. What's really strange is that if we connect a debugger to Explorer and set a breakpoint on kernel32!CreateProcessW, then wait a moment after CreateProcess returns, then hit 'g', then the document opens fine. But if we don't wait, then the application launches but the document does not open. Instead, you get the error message "Windows cannot find 'abc.lit'. Make sure you typed the name correctly, and then try again." Here is the command we are executing when we run into this problem:

"F:\Program Files\LitSoft\LitWare\LitWare.exe" /dde

What is wrong?

If you've been reading carefully and paid attention to the explanation of how document launching via DDE works, then you already know what the problem is.

Recall that launching documents via DDE is done by first looking for a DDE server and, if none is found, launching a server manually and trying again. The command line above was clearly registered as the command associated with a ddeexec. There are two giveaway clues. The first is that the document name itself is not present anywhere on the command line. (This couldn't be a direct execution because the program wouldn't know what document it's supposed to be opening!) But the bigger giveaway is the phrase /dde on the command line.
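The launch sequence just described can be sketched as a toy model. This is not the shell's actual code: launch_document, FakeServer, and the servers dictionary are hypothetical names invented for illustration, and Python method calls stand in for what is really an exchange of window messages.

```python
# Toy model of document launching via DDE. All names are hypothetical;
# the real shell exchanges window messages rather than calling methods.

class FakeServer:
    """Stands in for an application that accepts documents over DDE."""
    def __init__(self):
        self.opened = []

    def execute(self, document):
        self.opened.append(document)


def launch_document(document, app_name, servers, start_server):
    """Open `document` the way the article describes.

    servers      -- dict mapping application names to listening servers
    start_server -- callable that runs the registered command line
                    (the one ending in /dde) and returns once the new
                    process is considered ready
    """
    # First attempt: is a DDE server already listening?
    server = servers.get(app_name)
    if server is None:
        # Nobody answered, so launch the server manually...
        start_server()
        # ...and try the conversation a second time.
        server = servers.get(app_name)
    if server is None:
        # Still nobody: this is where the user sees an error.
        return False
    server.execute(document)
    return True


# Usage: no server is running at first, so the command line gets run.
servers = {}
litware = FakeServer()

def run_litware_with_dde_flag():
    servers["LitWare"] = litware   # the server starts listening

ok = launch_document("abc.lit", "LitWare", servers, run_litware_with_dde_flag)
```

Note that start_server takes no document argument. That mirrors the first clue: the document name travels over the DDE conversation, not on the command line.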

Clearly, something is going wrong when Explorer attempts the second DDE conversation to open the document. The fact that making Explorer wait a few seconds fixes the problem makes the cause obvious: The DDE server is being slow to get itself initialized and listening. Explorer launches the server and tries to talk to it, but the server is not yet ready and therefore doesn't respond to the DDE initiate.

But how do you fix this?

The shell assumes that a DDE server is ready to accept connections when it goes input idle. Once WaitForInputIdle on the DDE server returns, Explorer will make its second attempt at initiating a DDE conversation. The fix is for the application to get its DDE server up and running before it starts pumping messages. My guess is that the application moved its DDE server to a background thread to improve startup performance, since the DDE server is not involved in normal program operation. Too bad the program forgot to get the DDE server up and running prior to going input idle when the /dde flag is passed. The one time it's important to have the DDE server running and it misses the boat.
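To see why the ordering matters, here is a small deterministic model of the race. It is a sketch under stated assumptions: App, shell_open, and the deferred list are invented for illustration, "going input idle" is modeled as a plain flag rather than a real WaitForInputIdle call, and the error text is abbreviated.

```python
# Toy model of the startup race. Names are hypothetical; "input idle"
# is a flag here, not the real WaitForInputIdle.

class App:
    def __init__(self, dde_before_pump):
        self.dde_before_pump = dde_before_pump
        self.dde_listening = False
        self.input_idle = False
        self.deferred = []          # work pushed off to "later"
        self.opened = []

    def start_dde_server(self):
        self.dde_listening = True

    def startup(self):
        if self.dde_before_pump:
            # The fix: listen first, then pump messages.
            self.start_dde_server()
        else:
            # The bug: DDE setup is deferred to speed up startup.
            self.deferred.append(self.start_dde_server)
        # Starting to pump messages is what makes the app input idle.
        self.input_idle = True

    def run_deferred(self):
        # The deferred work finishes eventually, but possibly too late.
        for task in self.deferred:
            task()
        self.deferred.clear()


def shell_open(app, document):
    """Launch the app, wait for input idle, then try DDE once more."""
    app.startup()
    assert app.input_idle           # the shell's cue that the app is ready
    if not app.dde_listening:
        return f"Windows cannot find '{document}'."
    app.opened.append(document)
    return "opened"


buggy = App(dde_before_pump=False)
fixed = App(dde_before_pump=True)
buggy_result = shell_open(buggy, "abc.lit")
fixed_result = shell_open(fixed, "abc.lit")
buggy.run_deferred()                # server comes up after the shell gave up
```

In this model, pausing Explorer in the debugger (as in the report) amounts to giving the deferred work time to run before the second attempt, which is why the document then opens.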

Moral of the story: If you're going to act as a DDE server, make sure you do so before your primary thread starts pumping messages. Otherwise you have a race condition between your application startup and the shell trying to talk to it.

Comments (11)
  1. Norman Diamond says:

    > Clearly, something is going wrong when
    > Explorer attempts the second DDE conversation
    > to open the document.

    OK, I’ll accept that this much is clear to a programmer with a
    moderate amount of psychic powers and a moderate understanding of DDE.
     But let’s rewind a bit.

    > you get the error message “Windows cannot
    > find ‘abc.lit’. Make sure you typed the name
    > correctly, and then try again.”

    With a maybe moderate or maybe lesser amount of psychic powers and
    maybe with or without moderate understandings of what Windows might be
    thinking at various times, I’ve had enough experiences with Windows
    saying that Windows can’t find files that explorer is showing me, and
    have had Windows tell me to make sure I typed a double-click correctly.
     I am well aware of my frequent typos, but I know how to
    distinguish whether I typed a name correctly or not from whether
    Windows blames my typing for Windows’ inability to find what I
    double-clicked on.  And you complain about my cynicism.

    No, I do not think my psychic powers are enough to debug this kind
    of bug from the kind of s*it messages that Windows displays.  And
    you complain about my cynicism.  Don’t you think there’s just a
    wee possibility that this error message from Windows leaves something
    to be desired?

    [This article was written from a programmer’s point of view, not an end user’s point of view. As a programmer, you should be accustomed to looking at the bigger picture to see how something could have happened. As for the error message, what would you have preferred? Something ultra-geeky like “The program responsible for opening this document was run, but it did not accept the document”? That just makes people say, “Computers are hard to use.” -Raymond]
  2. Norman Diamond says:

    > It so happens that the numerical value -1
    > for a window handle is suspiciously close to
    > the value of HWND_BROADCAST

    Yup.  Here’s what MSDN says:

    > To initiate a Dynamic Data Exchange (DDE)
    > conversation, the client sends a
    > WM_DDE_INITIATE message. Usually, the client
    > broadcasts this message by calling
    > SendMessage, with -1 as the first parameter.

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winui/winui/windowsuserinterface/dataexchange/dynamicdataexchange/usingdynamicdataexchange.asp

    Thank you for your other posting

    http://blogs.msdn.com/oldnewthing/archive/2006/06/13/629451.aspx

    which assists in this observation.

  3. Norman Diamond says:

    > As for the error message, what would you have
    > preferred? Something ultra-geeky like “The
    > program responsible for opening this document
    > was run, but it did not accept the document”?
    > That just makes people say, “Computers are
    > hard to use.” -Raymond]

    So you really really think that the ultra-ungeeky “Windows cannot
    find ‘abc.lit’. Make sure you typed the name correctly, and then try
    again” makes people say “Computers are easy to use!  I just
    mistyped my double-click, and I’d better learn to type my double-clicks
    without misspelling!”?  And when people do correct the
    misspellings of their double-clicks, Windows starts working and people
    find Windows easy to use?

    I do think “The program responsible for opening this document was
    run, but it did not accept the document” would be better than your
    favourite wording.  I don’t think it would make computers easier
    for people to use, it would just delete a lie and an insult.  Oh
    yeah, that’s why it can’t be done.

    [Thanks for the insult. A message from you wouldn’t be complete without one. -Raymond]
  4. Norman Diamond says:

    > [Thanks for the insult. A message from you
    > wouldn’t be complete without one. -Raymond]

    You’re welcome.  An error message from Windows wouldn’t be
    complete without an insult, which is why I figured out why your
    objection to your hypothetical proposed message was really such a valid
    objection.

    By the way, remember how many times you’ve complained about people
    blaming Windows when the fault is really an application or driver or
    whatever?  Well here’s Windows TELLING people to blame Windows
    (“Windows cannot find ‘abc.lit’”) even when you’ve diagnosed that the
    application is at fault, and you want Windows to continue persuading
    people to blame Windows.  Plus Windows is going to continue with
    its insulting “Make sure you typed the name correctly, and then try
    again.”  When people meet you at parties and say they hate you,
    it’s not because I asked them to, it’s because your company asked them
    to, and here you are supporting your company’s production of these
    insults.

    [I didn’t say it was a good error message. But the alternative wasn’t very good either. The real problem is that the ERROR_FILE_NOT_FOUND lost its context. When generated, its context was the DDE target window that couldn’t be found, but when the error percolated out to the caller of ShellExecute, it got interpreted as referring to the file being launched. Yes, it sucks. But suckage is not the same as insulting. -Raymond]
  5. Norman Diamond says:

    > But the alternative wasn’t very good either.

    OK, I agree and maybe others will too.  Nonetheless the alternative is tons better than the actual message.

    > The real problem is that the
    > ERROR_FILE_NOT_FOUND lost its context.

    Actually for some reason my psychic powers weren’t enough to guess that the failure to find a window resulted in the error leaking out this way.  I do see now that this might not be easy to fix.

    > But suckage is not the same as insulting.

    True.  Windows’s instruction to a double-clicker to check their typing is both sucky and insulting, and the reason isn’t because suckage is the same as insulting (they’re not the same), the reason is simply because that instruction is both.  Some of us get used to these insults when we’ve seen them for 10 years, but that doesn’t mean that new users will be less insulted by them.  I do see that a fix might not be easy, and for the foreseeable future you can’t avoid teaching new users to hate Microsoft.  I can’t help it either, sorry.

  6. Jules says:

    > This article was written from a programmer’s point of view, not an end user’s point of view. As a programmer, you should be accustomed to looking at the bigger picture to see how something could have happened. As for the error message, what would you have preferred? Something ultra-geeky like "The program responsible for opening this document was run, but it did not accept the document"? That just makes people say, "Computers are hard to use." -Raymond

    Raymond, the problem people have with this message is that what Windows is saying at the moment is just confusing.  The file ‘abc.lit’ clearly exists, the user will say, it’s right here.  Why can’t Windows find it?

    Windows should definitely distinguish the case that the document can’t be found and its associated application can’t.  Perhaps "Windows was unable to open ‘abc.lit’ because of an application error." or something similar.  It doesn’t have to be complicated, it just has to represent the truth of what happened, rather than the current message which is blatantly misleading and is likely to have the user barking up the wrong tree for ages trying to figure out what’s wrong.

    This is one of the problems a lot of people have with Windows: it spends too much time trying to protect the user and not enough making sure they have enough information to fix whatever’s wrong.  

    Error messages that say one thing’s wrong when it’s actually something entirely different that’s wrong don’t help anyone.  It would be as useful to pop up a box that says, "I’m sorry, I can’t do that Dave" or something.

  7. Jules says:

    > True.  Windows’s instruction to a double-clicker to check their typing is both sucky and insulting, and the reason isn’t because suckage is the same as insulting (they’re not the same), the reason is simply because that instruction is both.  Some of us get used to these insults when we’ve seen them for 10 years, but that doesn’t mean that new users will be less insulted by them.

    I’m reminded of my work colleague who, once every couple of weeks or so, yells at the top of his voice, "No! I didn’t forget my ****ing password!".  The XP "Welcome" screen isn’t just insulting, it’s condescending.

  8. alexandre.r. says:

    > Perhaps "Windows was unable to open ‘abc.lit’
    > because of an application error." or something
    > similar

    That message doesn’t actually help the user in fixing the error at all.

    Who caused the application error? Is it something I have done? A bug in the application? Something else? What step should I follow, as a user, to fix this error?

    At least the former message tried to suggest something, although in the current situation, I agree that the suggestion is rather poor.

    > This is one of the problems a lot of people
    > have with Windows: it spends too much time
    > trying to protect the user and not enough
    > making sure they have enough information to
    > fix whatever’s wrong.

    Your suggestion is not actually an improvement in that regard.

  9. Norman Diamond says:

    Thursday, June 22, 2006 1:11 PM by alexandre.r.

    > What step should I follow, as a user, to fix
    > this error?

    Complain to the maker of the application?

    Historically some makers occasionally fixed bugs when informed about them, though that’s rare today.

    Some makers still don’t require upfront fees before letting victims complain about bugs.

    Maybe the user can also discover that in the future they wish to buy a competing application instead of the buggy application.  If the buggy application doesn’t come from a monopoly, they might even have the power to follow through on this wish.

    Maybe eventually people will stop blaming Windows when the bug belongs to an application.  There’s a long row to hoe before this message can sink in, but every journey starts with a single step.  Hypothetically this could be such a step.

  10. DriverDude says:

    I totally agree with Norman. As long as bugs aren’t revealed correctly, users don’t learn who’s really at fault and vendors don’t feel they need to fix things.

    Raymond, I’ve heard programmers argue against fixing something because they think the user will blame Windows! I can’t tell you how many times I’ve submitted website bug reports, only to be told by a 1st-level tech support weenie that I should upgrade IE, or update Windows, or go read some MS KB. Even after I tell them I’ve done all those things!

    Remember that long Samba bug discussion… make the users pissed at Linksys or Netgear or whoever sold them the box – that’s the only way to make the vendor fix things.

  11. Yuhong Bao says:

    DriverDude wrote

    I totally agree with Norman. As long as bugs aren’t revealed correctly, users don’t learn who’s really at fault and vendors don’t feel they need to fix things.

    Raymond, I’ve heard programmers argue against fixing something because they think the user will blame Windows! I can’t tell you how many times I’ve submitted website bug reports, only to be told by a 1st-level tech support weenie that I should upgrade IE, or update Windows, or go read some MS KB. Even after I tell them I’ve done all those things!

    Remember that long Samba bug discussion… make the users pissed at Linksys or Netgear or whoever sold them the box – that’s the only way to make the vendor fix things.

    I reply:

    More often than not, however, the user has not already read the KB article.

    And yes, the NAS manufacturer is the only one that can be blamed for not updating Samba.

Comments are closed.
