How is the CommandLineToArgvW function intended to be used?


The CommandLineToArgvW function does some basic command line parsing. A customer reported that it was producing strange results when you passed an empty string as the first parameter:

LPWSTR *argv = CommandLineToArgvW(L"", &argc);

Well, okay, yeah, but huh?

The first parameter to CommandLineToArgvW is supposed to be the value returned by GetCommandLineW. That's the command line, and that's what CommandLineToArgvW was designed to parse. If you pass something else, then CommandLineToArgvW will try to cope, but it's not really doing what it was designed for.

It turns out that the customer was mistakenly passing the lpCmdLine parameter that was passed to the wWinMain function:

int WINAPI wWinMain(
    HINSTANCE hInstance,
    HINSTANCE hPrevInstance,
    LPWSTR lpCmdLine,
    int nCmdShow)
{
    int argc;
    LPWSTR *argv = CommandLineToArgvW(lpCmdLine, &argc);
    ...
}

That command line is not in the format that CommandLineToArgvW expects. The lpCmdLine parameter has the program name stripped off, whereas the CommandLineToArgvW function wants the full, unexpurgated command line as returned by the GetCommandLineW function, and it breaks it up on the assumption that the first word on the command line is the program name. When the program is run with no arguments, lpCmdLine is an empty string, and if you hand the CommandLineToArgvW function an empty string, it says, "Whoa, whoever generated this command line totally screwed up. I'll try to muddle through as best I can."
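
A corrected version of the customer's code might look something like the sketch below. It hands the result of GetCommandLineW to CommandLineToArgvW, checks for failure, and releases the array with LocalFree when it is done. The count_user_args helper is hypothetical (it is not part of the article or of Windows) and only illustrates treating argv[0] as the program name. The Windows-specific part is guarded with _WIN32 so the helper can also be compiled on its own.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not part of the article or of Windows: returns how
   many arguments follow the program name. An empty or missing argv is
   treated as "no arguments" instead of as an error. */
static int count_user_args(int argc, wchar_t **argv)
{
    if (argv == NULL || argc <= 1) {
        return 0;
    }
    return argc - 1; /* argv[0] is assumed to be the program name */
}

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h> /* declares CommandLineToArgvW */

int WINAPI wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    LPWSTR lpCmdLine, int nCmdShow)
{
    int argc = 0;
    /* Parse the full command line, not lpCmdLine. */
    LPWSTR *argv = CommandLineToArgvW(GetCommandLineW(), &argc);
    (void)lpCmdLine; /* deliberately unused: it lacks the program name */

    if (argv == NULL) {
        return 1; /* parsing failed; GetLastError() has the reason */
    }

    int userArgs = count_user_args(argc, argv);
    for (int i = 0; i < userArgs; i++) {
        /* argv[i + 1] is the i-th argument after the program name. */
        OutputDebugStringW(argv[i + 1]);
    }

    LocalFree(argv); /* one call frees the whole array */
    return 0;
}
#endif
```

Depending on the toolchain, building this may require linking against the shell32 import library and selecting a Unicode entry point; those details are left out here.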

Next time, we'll look at the strange status of quotation marks and backslashes in CommandLineToArgvW.

Comments (18)
  1. Footsie says:

    It does beg the question, though: why not just call GetCommandLineW inside CommandLineToArgvW, then?

  2. Michael Hotaling says:

    @Footsie:

    GetCommandLineW gets the command line, a string, unparsed and unprocessed.

    CommandLineToArgvW takes the above, and turns it into the argv array, and argc count (as in the old school main variety).

    You wouldn't call one instead of the other, though you may call the first so you can call the second.

  3. RobertWrayUK says:

    Footsie, because you then couldn't do any pre-processing on the value returned by GetCommandLineW, for example replacing tokens, which may be simpler before the string is split up into an array.

  4. Michael Hotaling says:

    @Footsie:

    Eh, my bad, misread your comment.

  5. I think you want LPWSTR *argv = ..., not LPWSTR argv = ...

    I don't see the problem.  I would expect it to:

    return NULL

    SetLastError(ERROR_INVALID_PARAMETER)

    Not touch the output parameter.

  6. Footsie says:

    @RobertWrayUK: Well, if you do pre-processing, what guarantee do you have that you're still meeting the prerequisites that CommandLineToArgvW has? Its documentation does not quite state what they are...

  7. Joshua says:

    Answer to why it doesn't call GetCommandLine directly: wildargs

    The stock implementation of wildargs is an alternate WinMain that walks the command line, replacing unquoted wildcards with lists of quoted matches (retaining the wildcard argument if no match) and then calling GetCommandLine to parse that command line.

    I believe wildargs is long defunct but if anybody had the .c source it would still work just fine. (Yes, you linked to wildargs.obj to build.)

  8. Ivo says:

    I did a quick test and started a process like this:

    CreateProcess(L"test.exe",L"command",NULL,....)

    In the child process, lpCmdLine is blank and GetCommandLineW() returns just "command". The exe name is nowhere to be found. The documentation of CreateProcess is not clear if the second parameter should contain the exe name. It says that if the first parameter is NULL then the second should have the exe, but doesn't say if the exe should be in the second parameter if the first parameter is valid. So what is the true recommended way to use CreateProcess?

    [Read the linked article. The convention is mentioned in the documentation for CreateProcess. You would be recommended to follow it. -Raymond]
  9. Yuhong Bao says:

    The right way in this case would be to use wmain instead.

  10. PhilW says:

    I wonder if the "strange results" the customer saw were what the documentation says: "If this parameter is an empty string the function returns the path to the current executable file."

  11. Anon says:

    (OT)

    Please consider changing your "body" font-size CSS rule from "1.2em" to "medium".

    "1.2em" is currently yielding an unhinted 12.528 pt font, which looks just awful with ClearType disabled.

    "medium" should produce a font that is exactly 12 pt in size on all modern browsers.

    [Done. Thanks for the tip. -Raymond]
  12. gpb says:

    Wouldn't the right thing to do be to fail when receiving an argument that is clearly not well-formed, instead of muddling through? Or is this function so old that it has to be compatible with functions written in the dark ages?

    [You really think a program should be allowed to display the error: "Error: Giving up before even trying to parse the command line"? (Oh, and CommandLineToArgvW dates back to 1994. It's old enough to drive.) -Raymond]
  13. Jules says:

    @Ivo: "The documentation of CreateProcess is not clear if the second parameter should contain the exe name."

    I'd say it could be clearer, but it's fairly obvious anyway:

    "The lpCommandLine parameter can be NULL. In that case, the function uses the string pointed to by lpApplicationName as the command line."

    Strongly suggests that including the application name in the command line is at the very least something you should consider doing, as it wouldn't be the default behaviour otherwise.

    "If both lpApplicationName and lpCommandLine are non-NULL [...] the null-terminated string pointed to by lpCommandLine specifies the command line. [...] Because argv[0] is the module name, C programmers generally repeat the module name as the first token in the command line."

    A clear recommendation that the module name (i.e. the .exe file name) should be included in the command line.  The suggestion that this applies only to C programmers is a little strange, as every language I've ever used under Windows follows the same command line conventions. But in the absence of any other recommendation, and without understanding the reasoning behind it, any programmer would be well advised to follow the advice given.

    @Raymond: 'You really think a program should be allowed to display the error: "Error: Giving up before even trying to parse the command line"?'

    Well, I'd suggest the error should be phrased more like "Error: invalid command line specified".  But, yes, I'd suggest any application should fail in the presence of clearly incorrect data rather than attempting to guess what the data should have been, as the latter is inviting disaster in the case where the guess is incorrect.

    [In practice the error message would be "Error: Out of memory" because that's what most people would consider a failure of CommandLineToArgvW to mean. Either that or the app will crash because it assumes that argc will always be greater than zero. It seems natural that the recovery mode for "no command line" should be "act like the user entered no command line arguments". -Raymond]
  14. Neil says:

    I found it ironic that while GetCommandLineW was one of the few Unicode methods provided by Windows 95 sadly because CommandLineToArgvW wasn't provided you couldn't really do anything useful with it.

  15. gpb says:

    [You really think a program should be allowed to display the error: "Error: Giving up before even trying to parse the command line"? -Raymond]

    If the only way to get invalid "Command Lines" is to meddle with them and pass the wrong format to a function that doesn't support it, then why not? It's not that I want to show this message when given a real-world command line, read by the appropriate function (as the documentation says to do).

    This "trying to somehow figure out what could be meant even if given line noise as input" pseudo-solution has been the source of too many bugs. Better to fail early than to show some strange bug later on (or even open a remotely exploitable security hole because some function deeper in expects well-formed input and does not check it well enough, which is of course a bug in itself).

    Of course, if there are (important enough) programs somehow needing this wrongdoing....

  16. Yuhong Bao says:

    "I found it ironic that while GetCommandLineW was one of the few Unicode methods provided by Windows 95 sadly because CommandLineToArgvW wasn't provided you couldn't really do anything useful with it."

    It was provided; look at the exports of SHELL32 in Win95 to confirm it.

  17. googly says:

    'This "trying to somehow figure out what could be meant even if given line noise as input" pseudo-solution has been the source of too many bugs.'

    Amen. You just described half of the IE Trident engine.

  18. It seems natural that the recovery mode for "no command line" should be...

    Begging the question.  The point at issue is whether the API should fail-fast or attempt recovery.  (This is a purely academic debate, since the behavior has already long since been decided.)
