Why does the CreateProcess function modify its input command line?


One of the nasty gotchas of the CreateProcess function is that the lpCommandLine parameter must be a mutable pointer. If you pass a pointer to memory that cannot be written to (for example, a pointer to a page that is marked PAGE_READONLY), then you might crash. Commenter Ritchie wonders why this parameter is so weird.

Basically, somebody back in the 1980s wanted to avoid allocating memory. (Another way of interpreting this is that somebody tried to be a bit too clever.)

The CreateProcess function temporarily modifies the string you pass as lpCommandLine in its attempt to figure out where the program name ends and the command line arguments begin. Now, it could have made a copy of the string and made its temporary modifications to the copy, but hey, if you modify the input string directly, then you save yourself an expensive memory allocation operation. Back in the old days, avoiding memory allocations was a real concern, so people applied this class of micro-optimization as a matter of course. Nowadays, it seems rather antiquated.
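To make the trick concrete, here is a minimal sketch in portable C of a "modify in place, then restore" parse. It is not the real CreateProcess logic, which also has to cope with quoted paths and multiple candidate split points. The helper name first_token_in_place and its signature are hypothetical, invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the real CreateProcess code: copies the first
   space- or tab-delimited token of cmdline into name (capacity name_size).
   It temporarily writes a '\0' into the caller's buffer, so cmdline must
   point to writable memory. Returns the token length, or (size_t)-1 if
   name is too small to hold the token. */
size_t first_token_in_place(char *cmdline, char *name, size_t name_size)
{
    assert(cmdline != NULL && name != NULL);

    size_t len = strcspn(cmdline, " \t"); /* index of first space/tab, or of the terminator */
    char saved = cmdline[len];            /* remember the character about to be overwritten */
    size_t result = (size_t)-1;

    cmdline[len] = '\0';                  /* this write is what faults on read-only memory */
    if (len < name_size) {
        strcpy(name, cmdline);            /* cmdline now reads as just the program name */
        result = len;
    }
    cmdline[len] = saved;                 /* restore, so the caller sees the original string */
    return result;
}
```

No allocation is needed, but the input must be writable even though it looks unchanged afterward. A caller that starts from a string literal would first copy it into a writable array, for example `char cmd[] = "notepad.exe readme.txt";`, and pass that array.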

Now, there may also be good technical reasons (as opposed to merely performance considerations) for avoiding allocating memory on the heap. When a program crashes, the just-in-time debugger is launched with the CreateProcess function, and you don’t want to allocate memory on the heap if the reason the program crashed is that the heap is corrupted. Otherwise, you can get yourself into a recursive crash loop: While trying to launch the debugger, you crash, which means you try to launch the debugger to debug the new crash, which again crashes, and so on. The original authors of the CreateProcess function were careful to avoid allocating memory off the heap, so that in the case the function is being asked to launch the debugger, it won’t get waylaid by a corrupted heap.

Whether these concerns are still valid today I am not sure, but it was those concerns that influenced the original design and therefore the interface.

Why is it that only the Unicode version is affected? Well, because the ANSI version of the function just converts its strings to Unicode and then calls the Unicode version of the function. Consequently, the ANSI version of the function happens to implement the workaround as a side effect of its original goal: The string passed to the Unicode version of the function is a temporary string!
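The wrapper relationship can be sketched in portable C. The names launch_w and launch_a are hypothetical stand-ins, and the conversion uses the standard mbstowcs function rather than whatever code-page conversion the real ANSI wrapper performs. The point is only that the function which writes to its argument never sees the caller's original string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Stand-in for the Unicode function: temporarily writes to its argument. */
void launch_w(wchar_t *cmdline)
{
    size_t len = wcscspn(cmdline, L" ");
    wchar_t saved = cmdline[len];
    cmdline[len] = L'\0';   /* temporary in-place modification */
    cmdline[len] = saved;   /* restored before returning */
}

/* Stand-in for the ANSI wrapper: converts to a temporary wide string and
   forwards it. Returns 0 on success, -1 on conversion or allocation failure. */
int launch_a(const char *cmdline)
{
    assert(cmdline != NULL);

    /* A wide string never has more characters than the multibyte
       string has bytes, so this capacity is always sufficient. */
    size_t cap = strlen(cmdline) + 1;
    wchar_t *wide = malloc(cap * sizeof *wide);
    if (wide == NULL)
        return -1;
    if (mbstowcs(wide, cmdline, cap) == (size_t)-1) {
        free(wide);
        return -1;
    }

    launch_w(wide);         /* only the temporary copy is ever written to */
    free(wide);
    return 0;
}
```

In this sketch, passing a string literal to launch_a is safe because the write lands in the temporary buffer, whereas passing a wide string literal directly to launch_w would attempt to write to memory that may be read-only.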

Exercise: Why is it okay for the ANSI version of the CreateProcess function to allocate a temporary string from the heap when the Unicode version cannot?

Comments (27)
  1. jMarkP says:

    My psychic Microsoft compatibility hack sense is tingling and I’m guessing the answer has something to do with a program which relies on the Unicode function modifying its parameter somehow.

    Am I warm?

  2. I’d guess the ANSI version can do a string allocate for two reasons:

    1 – It’s a wrapper – the allocation is done ‘before’ calling into the real CreateProcess.

    2 – The debugger will be launching the Unicode version directly.

  3. Jack Mathews says:

    Since the ANSI version is essentially a shim utility library on top of the UNICODE kernel, a crash in ANSI functions will not cause the recursive crash you talk about.

  4. Mark says:

    It looks like CreateProcessW does allocate from Peb->ProcessHeap to sort out the path.  Not sure how long this has been the case, though.

  5. Alexandre Grigoriev says:

    I’ll chime in.

    First, CreateProcessA doesn’t need a mutable string, because it will first be converted to an UNICODE string and then passed to CreateProcessW.

    Second, you could allocate temporary storage on stack (using _alloca), or use VirtualAlloc which doesn’t depend on any usermode locks.

    Third, the whole idea of running JIT launcher in a context of a failed process is flawed. At least, I hope it’s done in context of a dedicated thread (in which case you’ve got a meg of stack reserved, too, and can use _alloca).

  6. W says:

    Would a 64kB allocation on stack break something?

  7. Nick says:

    Simple, really.  Nobody uses Unicode so there’s no reason to spend time fixing it!

    I kid, I kid (well, kinda :)…

  8. waleri says:

    I suspect the length of the argument of CreateProcessA may not exceed MAX_PATH.

  9. Gabe says:

    Is there any reason CreateProcessW couldn’t trap the write-fail exception and only then allocate memory? That would solve the problem of attempting to call it with a string from a page marked PAGE_READONLY, so at least you don’t get arbitrary failures.

  10. MadQ says:

    Some programs also play around with the input command line. verclsid.exe seems to do this.

  11. NickPick says:

    Exercise:

    Explain why a design decision of CreateProcess function was probably not done by "somebody back in the 1980s"

  12. waleri says:

    While we’re at it, what is the reason *not* to fix the contract for CreateProcess? Even if argument is changed from LPTSTR to LPCTSTR this should be backward compatible. Unless of course CreateProcess *still* modifies that memory…

    [Changing the function prototype breaks existing source code. Try it:
    typedef BOOL (WINAPI *CREATEPROCESSFN)(
     LPCTSTR lpApplicationName,
     LPTSTR lpCommandLine,
     LPSECURITY_ATTRIBUTES lpProcessAttributes,
     LPSECURITY_ATTRIBUTES lpThreadAttributes,
     BOOL bInheritHandles,
     DWORD dwCreationFlags,
     LPVOID lpEnvironment,
     LPCTSTR lpCurrentDirectory,
     LPSTARTUPINFO lpStartupInfo,
     LPPROCESS_INFORMATION lpProcessInformation);
    
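    // Forward declaration so that the pointer below can be initialized.
    BOOL WINAPI MyCreateProcess(LPCTSTR, LPTSTR, LPSECURITY_ATTRIBUTES,
     LPSECURITY_ATTRIBUTES, BOOL, DWORD, LPVOID, LPCTSTR, LPSTARTUPINFO,
     LPPROCESS_INFORMATION);
    
    CREATEPROCESSFN DoCreateProcess = MyCreateProcess;
    
    BOOL WINAPI MyCreateProcess(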
     LPCTSTR lpApplicationName,
     LPTSTR lpCommandLine,
     LPSECURITY_ATTRIBUTES lpProcessAttributes,
     LPSECURITY_ATTRIBUTES lpThreadAttributes,
     BOOL bInheritHandles,
     DWORD dwCreationFlags,
     LPVOID lpEnvironment,
     LPCTSTR lpCurrentDirectory,
     LPSTARTUPINFO lpStartupInfo,
     LPPROCESS_INFORMATION lpProcessInformation)
    {
     if (!RedirectCreateProcess) {
      // disable trapping for future callers
      DoCreateProcess = CreateProcess;
      return CreateProcess(…);
     }
     …
    }
    

    I learned this lesson the hard way. -Raymond]

  13. waleri says:

    >> Some programs also play around with the input command line. verclsid.exe seems to do this.

    Isn’t the command line a copy? After all, the child process should access it in its own address space.

  14. dave says:

    “Is there any reason CreateProcessW couldn’t trap the write-fail exception and only then allocate memory? That would solve the problem of attempting to call it with a string from a page marked PAGE_READONLY, so at least you don’t get arbitrary failures.”

    Well, I’d say that no-one gets arbitrary failures today, anyway.  People who fail to read the documentation get exactly the failures that the documentation implied they’d get. ‘Unexpected by programmer X’ != ‘arbitrary’.

    And in any case, why futz around with an API that has managed to work adequately well for the past 16 years?  Sure, it’s a little odd the first time you bump into it, but aren’t there better things to work on?

    But if we are going to futz with CreateProcess, can we remove the mess that allows you to specify the image name EITHER through a separate parameter or as part of the command line?

    (Joking… it’s way too late for that).

  15. peterchen says:

    @Gabe – that would be worse: a function that promises not to modify memory, but does. (And if it doesn’t make that promise, there’s no point in fixing it).

    waleri – interesting thought! MSDN doesn’t say anything about a limit, though.

    Now, gotta chew through that sample…

  16. Brad says:

    Since people seem to have danced around the answer, is it:

    The JIT debugger is always launched with CreateProcessW? (so that case can never allocate from the heap).

  17. Alexey Borzenkov says:

    Why is it okay for the ANSI version of the CreateProcess to allocate a temporary string

    I’m not 100% sure, but from what I have seen during numerous debugging sessions, I think that all path-related ANSI functions use the same thread-local buffer, so it is either already preallocated or not allocated at all.

  18. Someone says:

    “When a program crashes, the just in time debugger is launched with the CreateProcess function, and you don’t want to allocate memory on the heap if the reason the program crashed is that the heap is corrupted.”

    I do not follow that argument. It would only make sense if that command line memory were the only allocation done by CreateProcess. If so, where would CreateProcess get the process stack or the memory for the debugger image from?

    Or is that debugger running in some zombie state all the time, and was CreateProcess hacked to not really launch it, but revive it from the dead? If so, why would one add that code to CreateProcess, rather than creating a new function “StartJITDebugger”?

    [Um, the debugger is a separate process. Heap corruption is process-local. -Raymond]
  19. Mike Dimmick says:

    @Someone: when dealing with CreateProcess, you really have to think in terms of the calling, parent, process, and the child process it creates.

    The article is talking about a buffer that CreateProcess is (not) creating in the context of the calling process. The process’s initial thread stack is not directly created by CreateProcess but by the process startup code in the kernel, in the context of the new process. The virtual memory for the program code is a shared-memory section backed by the program image file, which is paged in on demand.

    I’m surprised that Kernel32’s default unhandled-exception code is creating the debugger process directly. I would have thought it would instead send a message to the Win32 subsystem to invoke a debugger on its behalf.

    A debugger is just a standard Win32 program which calls DebugActiveProcess to debug another process, or creates a process with one of the DEBUG_PROCESS or DEBUG_ONLY_THIS_PROCESS flags. The debugger then calls WaitForDebugEvent to have Windows tell it about new threads, loaded or unloaded DLLs, debugging messages, thread exits, and exceptions. Breakpoints are set by modifying the code of the target process (with WriteProcessMemory) to place a breakpoint instruction at that location instead; that instruction causes an exception (code STATUS_BREAKPOINT, 0x80000003) to occur in the program, which is reported through WaitForDebugEvent.

    To restart, the debugger replaces the breakpoint with the code that should have been at that location, sets the thread’s processor context to single-step (execute one instruction), then restarts with ContinueDebugEvent. A STATUS_SINGLE_STEP exception is then raised in the program, the debugger rewrites the breakpoint, and again continues execution to continue at full speed.

  20. Jamie Anderson says:

    I would hazard a guess that the ANSI wrapper functions in the kernel have their own private heap that they use to allocate memory. That would avoid the matter because any memory allocation will happen on a different heap to the one that’s been corrupted.

    [What if the private heap is corrupted? -Raymond]
  21. One of the nasty gotchas of the CreateProcess function is that the lpCommandLine parameter must

  22. Someone says:

    @Mike Dimmick: thanks for the explanation. I knew most of what you said, except for what surprised you, too: that, apparently, the ‘crashed’ program launches the debugger (or, probably more precisely: the kernel calls CreateProcess to start the debugger while in the context of the crashed application). But even with that, I do not see why one would have to modify the string you pass as the lpCommandLine to figure out where the program name ends and the command line arguments begin.

    The first thing I can think of is that it is not the figuring out, but the use of the program name as an argument to some other system call (FindFile?) that requires the modification. I guess the change would be to overwrite a space in the command line with a NULL. I have seen people do the reverse on Unix systems: creating a program’s command line by replacing the NULLs at the end of all but the last of the argv entries by a space. I doubt that that is guaranteed to work, but it did on the one system where I saw somebody do that.

    A second possibility could be that CreateProcess does some Unicode normalization in-place. That would also explain why the ANSI version never modifies that argument.

  23. GregM says:

    "I guess the change would be to overwrite a space in the command line with a NULL."

    Yes, that’s what the documentation for CreateProcess says.

  24. Random832 says:

    Wait… what if there is no space between the program name and arguments – like "cmd/?" (or any /flags generally) – where does it put the null then?

  25. Falcon says:

    @Random832: Try typing "cmd/?" into the Run dialog text box and see what happens!

    (Tested on Windows XP, can’t speak for Vista or any other versions…)

  26. teo says:

    While CreateProcess is peculiar, the real fun is in ShellExecuteEx. Yesterday, I had to code a self-elevating program. Basically it boils down to:

    if(!IsUserAnAdmin())

     ShellExecute("runas", my-own-command-line)

    Well, doh! It turns out that Windows (whose mantra used to be "Easy stuff should be easy and hard stuff should be possible") considers this the hard stuff…

    My first try:

    ShellExecute(0, L"runas", argv[0], GetCommandLine() … ), nope it won’t budge

    2nd try:

    ShellExecute(0, L"runas", 0, GetCommandLine() …)

    And so on and so forth. At the end I had to write a helper function which goes through the command line of the current process, builds a specially escaped, massaged, tangerine-flavoured, painted-white-and-pink … command line which after digested by the ShellExecute monstrosity will produce the one I started with. Btw, whoever designed the whole "runas" hack obviously decided that console applications are so out-of-fashion. What window handle a console app is supposed to feed it?

    But that is just a minor hack, I fixed it in less than 15 mins. If you want to have a taste of Hell, go compile a program which should run on XP with Vista/2k8 Platform SDK and imports CreateVssBackupComponents() function. Now that is one fine mess. Where to start? The wrong documentation? The fact that on XP it’s exported as extern "C++", which makes it exported with C++ mangled name, which obviously differs between 32 and 64 bit compilers and is not natively consumable by any other language in existence, be it another C++ compiler, Delphi or what not? The most sickening part is the "fix" – in the form of an INLINE FUNCTION in the platform sdk headers; making it completely impossible to compile a program targeting XP using the official function name … Microsoft, Microsoft, what have I done to bring thy wrath upon me?
