Everyone quotes command line arguments the wrong way


At one time or another, we all need to pass arbitrary command line arguments to some program. These arguments could be filenames, debugging flags, hostnames, or any other kind of information: the point is that we are to take a string and make sure some child program receives exactly that string in its argv array no matter what this string contains. The task is harder than it appears.

For better or for worse1, Windows knows about only one command line string for each process. Because one string is not terribly useful, libraries conspire to provide the illusion of multiple command line arguments: before creating a subprocess, a program combines all argument strings into one command line string, and the newly-born subprocess, before calling main, splits this string into arguments and passes the arguments as argv. In principle, each program can parse the command line string differently, but most use the convention that CommandLineToArgvW and the Microsoft C library understand. This convention is a good one because it provides a way to encode any command line argument as part of a command line string without losing information.

The problem is that there is no ArgvToCommandLineW. How do we construct an argument string understood by CommandLineToArgvW?

Test program

For exposition’s sake, we’ll be using this small program to generate the example output below:

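#include <stdio.h>

int __cdecl
wmain(int, wchar_t** argv)
{
    // Print each argument on its own line, bracketed so that
    // leading and trailing whitespace is visible.
    for (int arg = 0; argv[arg]; ++arg) {
        wprintf (L"%d: [%s]\n", arg, argv[arg]);
    }
    return 0;
}
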
Popular solutions: clear, simple, and wrong

The C runtime library is useless

Our first instinct should be to look for a library function that’s already solved the problem. Were we to conduct this search, we would quickly find functions provided by the C runtime specifically for running subprocesses: the _exec and _spawn families. These functions appear to be precisely what we need: they take an arbitrary number of distinct command line arguments and promise to launch a subprocess. Unfortunately and counter-intuitively, these functions do not quote or process these arguments: instead, they’re all concatenated into a single string, with arguments separated by spaces, and this string is then passed to the child, where it’s eventually interpreted by CommandLineToArgvW. This approach works only for arguments that do not themselves contain spaces.

Thus, if we run child.exe like this:

_wspawnlp (_P_WAIT, L"child.exe", L"child.exe", L"argument 1", L"argument 2", NULL);

child.exe receives five command line parameters:

0: [child.exe]
1: [argument]
2: [1]
3: [argument]
4: [2]

That’s not what we want! So, while the C runtime process-launching functions appear to be what we need, they’re actually useless for solving the command line argument problem.
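
To see why the child cannot recover the original arguments, here is a minimal sketch of the joining step as described above. NaiveJoin is a hypothetical helper, not a C runtime function; it only imitates the behavior this post attributes to the _exec and _spawn families.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins arguments with single spaces and does no
// quoting at all, imitating the behavior described in the text.
std::wstring NaiveJoin(const std::vector<std::wstring>& Arguments)
{
    std::wstring CommandLine;
    for (std::size_t i = 0; i < Arguments.size(); ++i) {
        if (i != 0) {
            CommandLine.push_back(L' ');
        }
        CommandLine.append(Arguments[i]);
    }
    return CommandLine;
}
```

With this helper, the three-element list {child.exe, argument 1, argument 2} and the five-element list {child.exe, argument, 1, argument, 2} produce the same string, so no parser on the receiving side could tell them apart.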

Adding quotation marks is insufficient

Having seen the problems with the previous approach, one might suggest that we heed the advice provided in the C runtime documentation and surround arguments containing spaces with double-quote characters. This solution is also wrong.

Recall2 that the CommandLineToArgvW convention3 stipulates that arguments containing spaces be surrounded by double quotation marks. Following the above approach and surrounding arguments with quotation marks produces good results for simple cases, and many people stop here.

child.exe argument1 "argument 2"  "\some\path with\spaces"

is correctly interpreted as:

0: [child.exe]
1: [argument1]
2: [argument 2]
3: [\some\path with\spaces]

So far, so good: but what if our arguments are more complex? Bear in mind that our convention also stipulates that we precede a double quotation mark that is part of an argument with a backslash, and that we double every backslash that immediately precedes a quotation mark, including the quotation mark that terminates the argument and is not itself part of that argument4. So, if we follow our simplistic approach above and use this as our command line:

child.exe argument1 "she said, "you had me at hello""  "\some\path with\spaces"

then the child sees:

0: [child.exe]
1: [argument1]
2: [she said, you]
3: [had]
4: [me]
5: [at]
6: [hello]
7: [\some\path with\spaces]

We don’t want that either! The problem becomes more insidious when quotes are unbalanced:

child.exe argument1 "argument"2" argument3 argument4
0: [child.exe]
1: [argument1]
2: [argument2 argument3 argument4]

Arguments ending with backslashes also lead to undesired interpretations:

child.exe "\some\directory with\spaces\" argument2

0: [child.exe]
1: [\some\directory with\spaces" argument2]

Many popular programs (including command shells, the authors of which really should know better) use this simple approach. Developers test only with simple argument strings, leaving users confused and puzzled when their command lines are occasionally mangled.
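
The outputs above follow from a handful of rules, which can be sketched as a small parser. This is a simplified model for illustration only, not the actual implementation of CommandLineToArgvW: SplitCommandLine is a hypothetical name, the sketch treats the program name like any other argument, and it ignores the special handling some parser versions give to two consecutive quotation marks inside a quoted region.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified model of the convention: whitespace separates arguments
// outside quotes, a quotation mark toggles quoting, and backslashes are
// special only when they immediately precede a quotation mark.
std::vector<std::wstring> SplitCommandLine(const std::wstring& CommandLine)
{
    std::vector<std::wstring> Arguments;
    std::wstring Current;
    bool InArgument = false;
    bool InQuotes = false;
    std::size_t i = 0;

    while (i < CommandLine.size()) {
        const wchar_t Ch = CommandLine[i];
        if (Ch == L'\\') {
            // Count the run of backslashes, then look at what follows.
            std::size_t Count = 0;
            while (i < CommandLine.size() && CommandLine[i] == L'\\') {
                ++i;
                ++Count;
            }
            InArgument = true;
            if (i < CommandLine.size() && CommandLine[i] == L'"') {
                // Each pair of backslashes yields one literal backslash.
                Current.append(Count / 2, L'\\');
                if (Count % 2 == 1) {
                    Current.push_back(L'"');   // escaped, literal quote
                } else {
                    InQuotes = !InQuotes;      // quote acts as a delimiter
                }
                ++i;
            } else {
                Current.append(Count, L'\\');  // backslashes are literal
            }
        } else if (Ch == L'"') {
            InQuotes = !InQuotes;
            InArgument = true;
            ++i;
        } else if ((Ch == L' ' || Ch == L'\t') && !InQuotes) {
            if (InArgument) {
                Arguments.push_back(Current);
                Current.clear();
                InArgument = false;
            }
            ++i;
        } else {
            Current.push_back(Ch);
            InArgument = true;
            ++i;
        }
    }
    if (InArgument) {
        Arguments.push_back(Current);
    }
    return Arguments;
}
```

Feeding the example command lines from this section to the sketch reproduces the argument lists shown, including the trailing-backslash case in which the closing quotation mark is swallowed into the argument.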

The correct solution

We’ve seen that properly quoting an arbitrary command line argument is non-trivial, and that doing it incorrectly causes subtle and maddening problems. The function below properly quotes an argument; translate it into your language and coding style of choice.

#include <string>

void
ArgvQuote (
    const std::wstring& Argument,
    std::wstring& CommandLine,
    bool Force
    )
/*
Routine Description:
    This routine appends the given argument to a command line such
    that CommandLineToArgvW will return the argument string unchanged.
    Arguments in a command line should be separated by spaces; this
    function does not add these spaces.

Arguments:
    Argument - Supplies the argument to encode.

    CommandLine - Supplies the command line to which we append the encoded argument string.

    Force - Supplies an indication of whether we should quote
            the argument even if it does not contain any characters that would
            ordinarily require quoting.

Return Value:
    None.
*/
{
    // Unless we're told otherwise, don't quote unless we actually
    // need to do so --- hopefully avoid problems if programs won't
    // parse quotes properly
    if (Force == false &&
        Argument.empty () == false &&
        Argument.find_first_of (L" \t\n\v\"") == Argument.npos)
    {
        CommandLine.append (Argument);
    }
    else {
        CommandLine.push_back (L'"');

        for (auto It = Argument.begin () ; ; ++It) {
            unsigned NumberBackslashes = 0;

            // Count the run of backslashes starting at It.
            while (It != Argument.end () && *It == L'\\') {
                ++It;
                ++NumberBackslashes;
            }

            if (It == Argument.end ()) {
                // Escape all backslashes, but let the terminating
                // double quotation mark we add below be interpreted
                // as a metacharacter.
                CommandLine.append (NumberBackslashes * 2, L'\\');
                break;
            }
            else if (*It == L'"') {
                // Escape all backslashes and the following
                // double quotation mark.
                CommandLine.append (NumberBackslashes * 2 + 1, L'\\');
                CommandLine.push_back (*It);
            }
            else {
                // Backslashes aren't special here.
                CommandLine.append (NumberBackslashes, L'\\');
                CommandLine.push_back (*It);
            }
        }

        CommandLine.push_back (L'"');
    }
}

To construct a command line string for a program from arbitrary arguments, we encode each argument (including the program name) with the above function and follow all but the last with a single space. We can then pass the resulting string as the lpCommandLine parameter to CreateProcess and be confident that the child process will decode each argument to exactly the arguments we were initially given.
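
The paragraph above can be sketched as a short helper. BuildCommandLine is a hypothetical name, not part of any Windows API; the block repeats ArgvQuote in compact form only so that it compiles on its own, and it stops at producing the string rather than calling CreateProcess.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Compact restatement of the ArgvQuote routine from this post.
void ArgvQuote(const std::wstring& Argument, std::wstring& CommandLine, bool Force)
{
    if (!Force && !Argument.empty() &&
        Argument.find_first_of(L" \t\n\v\"") == std::wstring::npos) {
        CommandLine.append(Argument);
        return;
    }
    CommandLine.push_back(L'"');
    for (auto It = Argument.begin(); ; ++It) {
        unsigned NumberBackslashes = 0;
        while (It != Argument.end() && *It == L'\\') {
            ++It;
            ++NumberBackslashes;
        }
        if (It == Argument.end()) {
            // Double the backslashes so the closing quote stays a delimiter.
            CommandLine.append(NumberBackslashes * 2, L'\\');
            break;
        } else if (*It == L'"') {
            // Double the backslashes and escape the embedded quote.
            CommandLine.append(NumberBackslashes * 2 + 1, L'\\');
            CommandLine.push_back(*It);
        } else {
            CommandLine.append(NumberBackslashes, L'\\');
            CommandLine.push_back(*It);
        }
    }
    CommandLine.push_back(L'"');
}

// Hypothetical helper: encodes every argument, program name included,
// and separates the encoded arguments with single spaces.
std::wstring BuildCommandLine(const std::vector<std::wstring>& Arguments)
{
    std::wstring CommandLine;
    for (std::size_t i = 0; i < Arguments.size(); ++i) {
        if (i != 0) {
            CommandLine.push_back(L' ');
        }
        ArgvQuote(Arguments[i], CommandLine, false);
    }
    return CommandLine;
}
```

For the arguments child.exe, argument 1, and \some\directory with\spaces\, this yields child.exe "argument 1" "\some\directory with\spaces\\": the trailing backslash is doubled so that the closing quotation mark still terminates the argument.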


One might conclude that we’re done: after all, we now understand how to send arbitrary strings through CreateProcess so that they emerge unchanged on the other side. But life is not that simple. Often, we can’t directly supply our command line to CreateProcess, but instead must pass it through a level of indirection before it reaches our intended child process. The most common indirection is a trip through the venerable cmd.exe, which we encounter when we use the system function, construct a script for later execution, write a makefile for nmake, and do many other things. Because the quoting rules for CommandLineToArgvW differ from those of cmd, we cannot give a command line intended for the former to the latter and expect our arguments to survive the trip. Because this difference is subtle, it’s easy to forget about it and leave latent bugs in programs that interact with cmd.

cmd is essentially a text preprocessor: given a command line, it makes a series of textual transformations, then hands the transformed command line to CreateProcess. Some transformations replace environment variable names with their values. Some transformations, such as IO redirection, have useful side effects. Some transformations, such as those triggered by the &, ||, and && operators, split command lines into several parts. It’s important to note that cmd doesn’t know or care about command line arguments per se: to cmd, the world is made up of whole command lines. Like the post office delivering postcards, cmd doesn’t try to understand what it handles: it merely does its limited job (via these transformations), then leaves it to another program to figure out what the final command line means.

All of cmd’s transformations are triggered by the presence of one of the metacharacters (, ), %, !, ^, ", <, >, &, and |. " is particularly interesting: when cmd is transforming a command line and sees a ", it copies a " to the new command line, then begins copying characters from the old command line to the new one without seeing whether any of these characters is a metacharacter. This copying continues until cmd either reaches the end of the command line, runs into a variable substitution, or sees another ". In the last case, cmd copies a " to the new command line and resumes normal processing. This behavior is almost, but not quite, like what CommandLineToArgvW does with the same character4; the difference is that cmd does not know about the \" sequence and begins interpreting metacharacters earlier than we would expect. It should be apparent why the commands below produce the indicated output:

C:\> child "hello world" >\\.\nul

C:\> child "hello"world" >\\.\nul
0: [child]
1: [helloworld >\\.\nul]

C:\> child "hello\"world" >\\.\nul
0: [child]
1: [hello"world]
2: [>\\.\nul]


If we rely on cmd’s "-behavior to protect arguments, embedded quotation marks will produce unexpected behavior. If we pass untrusted data as command line parameters, then the bugs caused by this convention mismatch become security issues:

C:\> child "malicious argument\"&whoami"
0: [child]
1: [malicious argument"]

Here, cmd is interpreting the & metacharacter as a command separator because, from its point of view, the & character lies outside the quoted region. whoami, of course, can be replaced by any number of harmful commands. Note that this command is properly formatted for use with CreateProcess: it’s the passing through cmd that causes trouble.
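
One way to reason about which characters cmd will act on is to model its quote tracking directly. The sketch below is a deliberately rough approximation built only from the behavior described in this post: CmdSeesOperator is a hypothetical name, the model toggles its quoted state on every ", treats ^ outside quotes as escaping the next character, and looks only for the &, |, < and > operators. It ignores variable substitution, parentheses, and everything else cmd does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Rough model: returns true if cmd, as described in the text, would find
// one of the operators & | < > outside a quoted region.
bool CmdSeesOperator(const std::wstring& CommandLine)
{
    bool InQuotes = false;
    for (std::size_t i = 0; i < CommandLine.size(); ++i) {
        const wchar_t Ch = CommandLine[i];
        if (!InQuotes && Ch == L'^') {
            ++i;                 // the next character is taken literally
            continue;
        }
        if (Ch == L'"') {
            InQuotes = !InQuotes; // cmd knows nothing about \"
            continue;
        }
        if (!InQuotes &&
            (Ch == L'&' || Ch == L'|' || Ch == L'<' || Ch == L'>')) {
            return true;
        }
    }
    return false;
}
```

Under this model, child "malicious argument\"&whoami" reports an operator, because the backslash does not stop the second " from closing the quoted region, while child "hello"world" >\\.\nul does not, because the redirection ends up inside a quoted region.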

A better method of quoting

While the " metacharacter cannot fully protect metacharacters in our command lines against unintended shell interpretation, the ^ metacharacter can. When cmd transforms a command line and sees a ^, it ignores the ^ character itself and copies the next character to the new command line literally, metacharacter or not. That’s why ^ works as the line continuation character: it tells cmd to copy a subsequent newline as itself instead of regarding that newline as a command terminator. If we prefix with ^ every metacharacter in an argument string, cmd will transform that string into the one we mean to use.

C:\> child ^"malicious argument\^"^&whoami^"
0: [child]
1: [malicious argument"&whoami]

It’s important to also ^-escape " characters: otherwise, cmd would literally copy ^ characters appearing between " pairs, mangling the argument string in the process:

C:\> child "malicious-argument\^"^&whoami"
0: [child]
1: [malicious-argument\^&whoami]

In effect, because cmd’s " handling is useless for our purposes, we use ^ to tell cmd to not attempt to be smart about quote detection.
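
The ^-escaping step can be sketched as follows. CmdEscape is a hypothetical name, and the metacharacter set is simply the list given earlier in this post. The sketch does not handle newlines, and it makes no claim about whether ^ is enough to neutralize % in every context (inside batch files, for example).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: prefixes each cmd metacharacter with ^ so that cmd
// copies it literally. The input is assumed to be a command line already
// encoded for CommandLineToArgvW.
std::wstring CmdEscape(const std::wstring& CommandLine)
{
    static const std::wstring Metacharacters = L"()%!^\"<>&|";
    std::wstring Escaped;
    for (wchar_t Ch : CommandLine) {
        if (Metacharacters.find(Ch) != std::wstring::npos) {
            Escaped.push_back(L'^');
        }
        Escaped.push_back(Ch);
    }
    return Escaped;
}
```

Applied to "malicious argument\"&whoami", the sketch produces ^"malicious argument\^"^&whoami^", which is the argument used in the working example above. Note the order: the argument is first encoded for CommandLineToArgvW and only then escaped for cmd.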


In general, we can safely pass arbitrary command line arguments to programs, provided we take a few basic precautions.


  1. Always escape all arguments so that they will be decoded properly by CommandLineToArgvW, perhaps using my ArgvQuote function above.
  2. Then, if and only if the resulting command line will be interpreted by cmd, prefix each shell metacharacter (or each character) with a ^ character.

Do not:

  1. Simply add quotes around command line arguments without any further processing.
  2. Allow cmd to ever see an unescaped " character.


1 Worse.
2 You did follow my links above, yes?
3 I know you didn’t.
4 Just to be clear: CommandLineToArgvW neither knows nor cares about cmd’s metacharacters and looks only for " and \.

Comments (13)

  1. wrl says:

    your problem is that _spawnlp() is nothing more than a thin wrapper around CreateProcess(), and CreateProcess() takes the entire command line as one parameter. _spawnlp() just takes all of the arguments it's been passed, concatenates them all together, and shuttles that off to CreateProcess(). if you pass an argument to _spawnlp() that contains a space, you will need to escape it yourself. try passing an argv[0] value that has a space in it and don't escape it, you'll see what I mean.

    Your comment isn't even wrong: you didn't describe the problem. You described the behavior of _spawnlp(). It's that behavior that forms part of the problem.
  2. Dan Fabulich says:

    I've read the post (and the code for ArgvQuote) a few times now, and I can't figure out why anybody would want to use the Force parameter to ArgvQuote. What's it good for? When/why would I want to use force?

  3. J. Bohl says:

    Does somebody happen to know whether the percent sign (%) is interpreted somehow by GetCommandLineW? For me, passing something like "image%5d.jpg" (with or without the quotes…) gives "image].jpg" (result from GetCommandLineW) – however the argv passed to main is correct.

  4. Klaus says:

    When I create a file test.cmd with contents

     @echo off

     echo %1

     echo %2

    and then run

     test "malicious-argument^"^&whoami"

    the result is


     Der Befehl "whoami"" ist entweder falsch geschrieben oder

     konnte nicht gefunden werden.

     ECHO ist ausgeschaltet (OFF).

    so it tried to execute whoami and didn't find a second argument.

  5. nanmahi says:


    i have one issue with the command substitution in Createprocess(). i have a sample command here,

    cd &unixpath2win /dev/fs/C/Users/

    i want this command to work like, it should convert the path into "C:User" and the directory should be changed to that path.

    but when i run this it displays like,

    C:\Users\Chairman>cd &unixpath2win /dev/fs/C/Users




    It is not changing the directory.

    i don't understand what's going wrong here.

    Anyone suggest the solution

  6. Peter says:

    These lines are suspicious when NumberBackslashes is greater than one:

    // Backslashes aren't special here.

    CommandLine.append (NumberBackslashes, L'\\');

    CommandLine.push_back (*It);

    For example, there are two backslashes (NumberBackslashes == 2). This code will output three backslashes. Is it the correct escaping? Should not two backslashes be transformed into four backslashes? Thank you.

  7. extrarius says:

    For CommandLineToArgvW, a backslash only needs to be escaped if it is followed by another backslash or a double quote (because those are the two recognized escape sequences – any other character preceded by a backslash is just a backslash and the other character. The backslash is not a general escape character like it is inside many programming language string literals.)

  8. CyberShadow says:

    Unfortunately your advice does not work when the path to the child program to be executed contains spaces. There is no way to escape those, or the surrounding quotes, in a way that both bypasses cmd's special quote treatment and succeeds at starting the program.

  9. Adrian says:

    I think Peter is right on the last clause – the behavior of \ and \x should be identical, except that in the \x case the builder could *optionally* omit doubling of at most *one* of the slashes. In other words, legal output for \x can be \\\x (6) or \\x (5).

    As for CyberShadow's comment, CreateProcess lets you specify the application name and command line separately. So presumably, that's how you'd approach that problem.

  10. ExGen says:

    Hey guys, the only thing you need to do to handle quoted strings is to double each double quotes!

    If you want to pass the string "Test" (with quotes) you just do it like this: example.exe """Test"""

    And for nested quotes, you make it in steps:

    test"test => example.exe "test""test" => example.exe "example.exe ""test""""test"""

  11. Noel J Grandin says:

    Surely the correct answer is simply to add a ArgvToCommandLineW function to the Shell API?

    How is it that Microsoft people do not seem to understand that they need to evolve and improve their API over time?

  12. Why doesn't MS just create a phreaking exec(argv[])? says:

    This all quoting and unquoting is insane and inconsistent.

  13. Oh wow, I had no idea cmd's quote handling was so broken!