How do I pass a lot of data to a process when it starts up?


As we discussed yesterday, if you need to pass more than 32767 characters of information to a child process, you’ll have to use something other than the command line.

One method is to wait for the child process to go input idle, then FindWindow for some agreed-upon window and send it a WM_COPYDATA message. This method has a few problems:

  • You have to come up with some way of knowing that the child process has created its window so you can start looking for it. (WaitForInputIdle is helpful here.)

  • You have to make sure the window you found belongs to the child process and isn’t just some other window which happens to have the same name by coincidence. Or, perhaps, not by coincidence: If there is more than one instance of the child process running, you will need to make sure you’re talking to the right one. (GetWindowThreadProcessId is helpful here.)

  • You have to hope that nobody else manages to find the window and send it the WM_COPYDATA before you do. (If they do, then they have effectively taken over your child process.)

  • The child process needs to be alert for the possibility of a rogue process sending bogus WM_COPYDATA messages in an attempt to confuse it.

The method I prefer is to use anonymous shared memory. The idea is to create a shared memory block and fill it with goodies. Mark the handle as inheritable, then spawn the child process, passing the numeric value of the handle on the command line. The child process parses the handle out of its command line and maps the shared memory block to see what’s in it.

Remarks about this method:

  • You need to be careful to validate the handle, in case somebody tries to be sneaky and pass you something bogus on your command line.

  • In order to mess with your command line, a rogue process needs PROCESS_VM_WRITE permission, and in order to mess with your handle table, it needs PROCESS_DUP_HANDLE permission. These are securable access masks, so proper choice of ACLs will protect you. (And the default ACLs are usually what you want anyway.)

  • There are no names that can be squatted or values that can be spoofed (assuming you’ve protected the process against PROCESS_VM_WRITE and PROCESS_DUP_HANDLE).

  • Since you’re using a shared memory block, nothing actually is copied between the two processes; it is just remapped. This is more efficient for large blocks of data.
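
To make the "validate the handle" remark concrete, here is a minimal sketch of a strict syntax check on the command line argument. It is a portable stand-in that uses strtoll rather than the shlwapi function in the sample program below, and the name ParseHandleValue is hypothetical. It is deliberately picky: the whole string must be a single positive decimal number. Passing this check says nothing about whether the number is a valid handle; that still has to be established by trying to map the view.

```cpp
#include <errno.h>
#include <stdlib.h>

// Hypothetical helper: accept the argument only if the entire string
// is one positive decimal number that fits in a long long.
bool ParseHandleValue(const char *psz, long long *pllValue)
{
    // Reject NULL, empty strings, signs, and leading whitespace up front.
    if (!psz || *psz < '0' || *psz > '9') return false;

    char *pszEnd;
    errno = 0;
    long long ll = strtoll(psz, &pszEnd, 10);

    if (errno == ERANGE) return false;   // value does not fit
    if (*pszEnd != '\0') return false;   // trailing junk after the digits
    if (ll == 0) return false;           // zero is never a usable handle value

    *pllValue = ll;
    return true;
}
```

Being strict here is a design choice, not a requirement. A lenient parser works too, as long as the mapped block is validated afterward.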

Here’s a sample program to illustrate the shared memory technique.

#include <windows.h>
#include <shlwapi.h>
#include <strsafe.h>

struct STARTUPPARAMS {
    int iMagic;     // just one thing
};

In principle, the STARTUPPARAMS can be arbitrarily complicated, but for illustrative purposes, I’m just going to pass a single integer.
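
If you do go for something more complicated, keep in mind that the block may be mapped at a different address in each process, so the data should refer to itself by offsets rather than by pointers. Here is a rough sketch of one way that could look. The STARTUPPARAMSEX layout and the two helper names are made up for illustration, and the code works on a plain memory buffer so it does not depend on any Win32 call.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical variable-length layout: a fixed header followed by string bytes.
struct STARTUPPARAMSEX {
    unsigned int cbTotal;    // bytes used in the block, header included
    unsigned int ibMessage;  // offset of the string from the start of the block
    unsigned int cchMessage; // string length, not counting the terminator
};

// Writer side: pack a string into the block. Fails if it does not fit.
bool PackStartupParams(void *pvBlock, size_t cbBlock, const char *pszMessage)
{
    size_t cch = strlen(pszMessage);
    size_t cbNeeded = sizeof(STARTUPPARAMSEX) + cch + 1;
    if (cbNeeded > cbBlock) return false;

    STARTUPPARAMSEX *psp = (STARTUPPARAMSEX *)pvBlock;
    psp->cbTotal = (unsigned int)cbNeeded;
    psp->ibMessage = (unsigned int)sizeof(STARTUPPARAMSEX);
    psp->cchMessage = (unsigned int)cch;
    memcpy((char *)pvBlock + psp->ibMessage, pszMessage, cch + 1);
    return true;
}

// Reader side: check every offset against the block size before trusting it.
const char *UnpackMessage(const void *pvBlock, size_t cbBlock)
{
    if (cbBlock < sizeof(STARTUPPARAMSEX)) return NULL;
    const STARTUPPARAMSEX *psp = (const STARTUPPARAMSEX *)pvBlock;

    if (psp->cbTotal > cbBlock) return NULL;
    if (psp->ibMessage < sizeof(STARTUPPARAMSEX)) return NULL;
    if (psp->ibMessage >= psp->cbTotal) return NULL;
    if (psp->cchMessage >= psp->cbTotal - psp->ibMessage) return NULL;

    const char *psz = (const char *)pvBlock + psp->ibMessage;
    if (psz[psp->cchMessage] != '\0') return NULL;  // must be terminated
    return psz;
}
```

The rest of this article sticks with the single-integer STARTUPPARAMS.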

STARTUPPARAMS *CreateStartupParams(HANDLE *phMapping)
{
    STARTUPPARAMS *psp = NULL;
    SECURITY_ATTRIBUTES sa;
    sa.nLength = sizeof(sa);
    sa.lpSecurityDescriptor = NULL;
    sa.bInheritHandle = TRUE;
    HANDLE hMapping = CreateFileMapping(INVALID_HANDLE_VALUE,
                &sa, PAGE_READWRITE, 0,
                sizeof(STARTUPPARAMS), NULL);
    if (hMapping) {
        psp = (STARTUPPARAMS *)
                    MapViewOfFile(hMapping, FILE_MAP_WRITE,
                        0, 0, 0);
        if (!psp) {
            CloseHandle(hMapping);
            hMapping = NULL;
        }
    }

    *phMapping = hMapping;
    return psp;
}

The CreateStartupParams function creates a STARTUPPARAMS structure in an inheritable shared memory block. First, we fill out a SECURITY_ATTRIBUTES structure so we can mark the handle as inheritable by child processes. Setting the lpSecurityDescriptor to NULL means that we will use the default security descriptor, which is fine for us. We then create a shared memory object of the appropriate size, map it into memory, and return both the handle and the mapped address.

STARTUPPARAMS *GetStartupParams(LPSTR pszCmdLine, HANDLE *phMapping)
{
    STARTUPPARAMS *psp = NULL;
    LONGLONG llHandle;
    if (StrToInt64ExA(pszCmdLine, STIF_DEFAULT, &llHandle)) {
        *phMapping = (HANDLE)(INT_PTR)llHandle;
        psp = (STARTUPPARAMS *)
                MapViewOfFile(*phMapping, FILE_MAP_READ, 0, 0, 0);
        if (psp) {
            //  Now that we've mapped it, do some validation
            MEMORY_BASIC_INFORMATION mbi;
            if (VirtualQuery(psp, &mbi, sizeof(mbi)) >= sizeof(mbi) &&
                mbi.State == MEM_COMMIT &&
                mbi.BaseAddress == psp &&
                mbi.RegionSize >= sizeof(STARTUPPARAMS)) {
                // Success!
            } else {
                // Memory block was invalid - toss it
                UnmapViewOfFile(psp);
                psp = NULL;
            }
        }
    }
    return psp;
}

The GetStartupParams function is the counterpart to CreateStartupParams. It parses a handle from the command line and attempts to map a view. If the handle isn’t a file mapping handle, the call to MapViewOfFile will fail, so we get that part of the parameter validation for free. We use VirtualQuery to validate the size of the memory block. (We can’t use a strict equality test since the value we get back will be rounded up to the nearest page multiple.)
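
The rounding is easy to see with a little arithmetic. The sketch below assumes a 4096-byte page, which is an assumption for illustration only; real code would ask the system for the page size (for example, from the dwPageSize member that GetSystemInfo fills in). RoundUpToPage is a hypothetical helper, not something the sample program uses.

```cpp
#include <stddef.h>

// Hypothetical helper: round a byte count up to the next multiple of the page size.
size_t RoundUpToPage(size_t cb, size_t cbPage)
{
    return (cb + cbPage - 1) / cbPage * cbPage;
}

// A 4-byte STARTUPPARAMS on a 4096-byte page occupies a full page:
//   RoundUpToPage(4, 4096)    is 4096
//   RoundUpToPage(4096, 4096) is 4096
//   RoundUpToPage(4097, 4096) is 8192
// so a test for RegionSize == sizeof(STARTUPPARAMS) would reject a
// perfectly good block, whereas RegionSize >= sizeof(STARTUPPARAMS) accepts it.
```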

void FreeStartupParams(STARTUPPARAMS *psp, HANDLE hMapping)
{
    UnmapViewOfFile(psp);
    CloseHandle(hMapping);
}

After we’re done with the startup parameters (either on the creation side or the consumption side), we need to free them to avoid a memory leak. That’s what FreeStartupParams is for.

void PassNumberViaSharedMemory(HANDLE hMapping)
{
    TCHAR szModule[MAX_PATH];
    TCHAR szCommand[MAX_PATH * 2];
    DWORD cch = GetModuleFileName(NULL, szModule, MAX_PATH);
    if (cch > 0 && cch < MAX_PATH &&
        SUCCEEDED(StringCchPrintf(szCommand, MAX_PATH * 2,
                  TEXT("\"%s\" %I64d"), szModule,
                  (INT64)(INT_PTR)hMapping))) {
        STARTUPINFO si = { sizeof(si) };
        PROCESS_INFORMATION pi;
        if (CreateProcess(szModule, szCommand, NULL, NULL,
                          TRUE,
                          0, NULL, NULL, &si, &pi)) {
            CloseHandle(pi.hProcess);
            CloseHandle(pi.hThread);
        }
    }
}

Most of the work here is just building the command line. We run ourselves (using the GetModuleFileName(NULL) trick), passing the numerical value of the handle on the command line, and passing TRUE to CreateProcess to indicate that we want inheritable handles to be inherited. Note the extra quotation marks in case our program’s name contains a space.

int CALLBACK
WinMain(HINSTANCE hinst, HINSTANCE hinstPrev,
        LPSTR pszCmdLine, int nShowCmd)
{
    HANDLE hMapping;
    STARTUPPARAMS *psp;
    if (pszCmdLine[0]) {
        psp = GetStartupParams(pszCmdLine, &hMapping);
        if (psp) {
            TCHAR sz[64];
            StringCchPrintf(sz, 64, TEXT("%d"), psp->iMagic);
            MessageBox(NULL, sz, TEXT("The Value"), MB_OK);
            FreeStartupParams(psp, hMapping);
        }
    } else {
        psp = CreateStartupParams(&hMapping);
        if (psp) {
            psp->iMagic = 42;
            PassNumberViaSharedMemory(hMapping);
            FreeStartupParams(psp, hMapping);
        }
    }
    return 0;
}

At last we put it all together. If we have a command line parameter, then this means that we are the child process, so we convert it into a STARTUPPARAMS and display the number that was passed. If we don’t have a command line parameter, then this means that we are the parent process, so we create a STARTUPPARAMS, stuff the magic number into it (42, of course), and pass it to the child process.

So there you have it: Passing a “large” (well, okay, small in this example, but it could have been megabytes if you wanted) amount of data securely to a child process.

Comments (18)
  1. Mike Dimmick says:

    There’s a race condition there, isn’t there? What if the parent manages to execute FreeStartupParams before the child got to MapViewOfFile? The mapping object will disappear.

    You also need an unnamed event object (manual reset, initially unsignalled) created by the parent process, the handle to it passed on the command line in the same way as the mapping object; the child must signal it after calling GetStartupParams, and the parent must wait on it before calling FreeStartupParams.

  2. Raymond Chen says:

    No race condition here: The handle was duplicated into the child process when you passed bInheritHandles = TRUE to CreateProcess. So now there are two handles to the shared memory block, one in the parent and one in the child. A shared memory object is not destroyed until all handles are closed. If the parent closes its handle first, no problem – the child still has an open handle. Only after both the parent and the child call FreeStartupParams (in whichever order) does the object get destroyed.

    So it’s good that no synchronization is needed, because adding synchronization would create another potential for a bug: What if somebody else manages to sneak in and signal the event before the child gets it? What if the child gets confused and never signals the handle? (For example, it may have run into a critical error in its startup and never got off the ground.) This method is a safe "fire and forget" – you push the data into the child and now it’s all the child’s problem. The parent’s hands are clean.

  3. Catatonic says:

    Excellent article! One question – I gather this technique is not suitable if I want to continue sending data back & forth after startup (I might use pipes for that).

  4. Raymond Chen says:

    You can use a similar trick to pass a pair of anonymous pipe handles on the command line – then you can talk back and forth over the pipes. Note that if you do this, the parent needs to watch out for the child ends of the pipes getting wedged or dying spontaneously – you don’t want to leave zombies in the parent waiting around for nonexistent children.

  5. Mike Dimmick says:

    Thanks for clearing that up, Raymond. For some reason I was thinking an inherited handle was only counted once, not twice.

  6. Anonymous says:

    I much prefer the anonymous pipe method. Simpler code, and easily portable to Unix if the need arises.

  7. Anonymous says:

    *cough*cough* sockets *cough*cough*

  8. Diego says:

    Why not just put the data you want to pass into a temp file (in the temp dir?) and just pass its location as the first param to the child process?

  9. Jordan Russell says:

    This line caught my eye:

    if (VirtualQuery(psp, &mbi, sizeof(mbi)) >= sizeof(mbi) &&

    Is the ">=" a typo, or is there really a case where VirtualQuery will return a value greater than that specified in the dwLength parameter?

  10. Raymond Chen says:

    1. *cough* sockets – fine, go ahead and use sockets. I won’t stop you. I didn’t say this was the only way of doing things. It’s just that sockets require the parent to babysit the child, whereas anonymous shared memory is fire-and-forget.

    2. Temporary file: Now you have to worry about when to delete the temporary file. The parent needs to delete the temporary file if the child process somehow failed to get off the ground; detecting this is tricky.

    3. Return value of VirtualQuery: Equality would work here as well. I just had inequality on the brain due to the later size test.

  11. Anonymous says:

    Err, that socket comment was in response to anonymous pipes. I’d personally use shared memory for passing data in a real app or temp files in dummy throw away apps.

  12. Raymond Chen says:

    Sockets: Anonymous pipes have the advantage of avoiding the network stack. You don’t need to load winsock, negotiate a version via WSAStartup, etc.

  13. Eugene Gershnik says:

    Just use _popen() conveniently provided by VC. This avoids the pre-calculating shared memory size, manually setting up pipes, dealing with sockets, caring about passing and validating the handle values and whole lot of other issues.

  14. Raymond Chen says:

    Notice however that _popen works only for console apps. "The _popen function returns an invalid file handle, if used in a Windows program, that will cause the program to hang indefinitely."

    Note also that if you do this, the child process won’t be able to accept input from the user via stdin or stdout since they got hooked up to the parent instead. _popen is more for running a child process in the background and feeding it input / capturing its output.

    You can still use it if you like. Just be aware of the limitations.

  15. Julian Rozentur says:

    How about simply passing in the data as standard input for the child process? The STARTUPINFO structure has hStdInput member where you pass a handle to a file with data (possibly, memory-mapped).

  16. Raymond Chen says:

    Of course this means you now have to worry about who will delete the file…

  17. IgorF says:

    I recently cooked up a very similar mechanism for doing this, and I’m glad to see it validated as a design approach.

    There’s one thing that I’m unclear about though–what exactly are the child process’s responsibilities regarding closing inherited handles? In your example, you explicitly close the inherited handle. Is that strictly necessary? What if the child process barfs early at startup for some reason and bails before even looking at its args? Or what if a child process is created with inheritable handles that it doesn’t even know to look for? Are those handles then leaked? And does it really matter if the child is short-lived, or does process shutdown clean things up to satisfaction?

    As far as I can tell the docs are silent on this.

  18. Raymond Chen says:

    The child process’s responsibilities are exactly the same as with handles it opened itself.

    http://msdn.microsoft.com/library/en-us/dllproc/base/terminating_a_process.asp

    "Open handles to files or other resources are closed automatically when a process terminates. However, the objects themselves exist until all open handles to them are closed. This means that an object remains valid after a process closes, if another process has a handle to it."
