Be careful when redirecting both a process’s stdin and stdout to pipes, for you can easily deadlock


A common problem when people create a process and redirect both stdin and stdout to pipes is that they fail to keep the pipes flowing. Once a pipe clogs, the disturbance propagates backward until everything clogs up.

Here is a common error, in pseudocode:

// Create pipes for stdin and stdout
CreatePipe(&hReadStdin, &hWriteStdin, NULL, 0);
CreatePipe(&hReadStdout, &hWriteStdout, NULL, 0);

// hook them up to the process we're about to create
startup_info.hStdOutput = hWriteStdout;
startup_info.hStdInput = hReadStdin;

// create the process
CreateProcess(...);

// write something to the process's stdin
WriteFile(hWriteStdin, ...);

// read the result from the process's stdout
ReadFile(hReadStdout, ...);

You see code like this all over the place. I want to generate some input to a program and capture the output, so I pump the input as the process's stdin and read the output from the process's stdout. What could possibly go wrong?

This problem is well-known to Unix programmers, but it seems that the knowledge hasn't migrated to Win32 programmers. (Or .NET programmers, who also encounter this problem.)

Recall how anonymous pipes work. (Actually, this description is oversimplified, but it gets the point across.) A pipe is a marketplace for a single commodity: Bytes in the pipe. If there is somebody selling bytes (WriteFile), the seller waits until there is a buyer (ReadFile). If there is somebody looking to buy bytes, then the buyer waits until there is a seller.

In other words, when somebody writes to a pipe, the call to WriteFile waits until somebody issues a ReadFile. Conversely, when somebody reads from a pipe, the call to ReadFile waits until somebody calls WriteFile. When there is a matching read and write, the bytes are transferred from the writer's buffer to the reader's buffer. If the reader asks for fewer bytes than the writer provided, then the writer continues waiting until all the bytes have been read. (On the other hand, if the writer provides fewer bytes than the reader requested, the reader is given a partial read. Yes, there's asymmetry there.)

Okay, so where's the deadlock in the above code fragment? We write some data into one pipe (connected to a process's stdin) and then read from another pipe (connected to a process's stdout). For example, the program might take some input, do some transformation on it, and print the result to stdout. Consider:

You                                     | Helper
WriteFile(stdin, "AB")                  |
(waits for reader)                      |
                                        | ReadFile(stdin, ch)
                                        | reads A
(still waiting since not all data read) |
                                        | encounters errors
                                        | WriteFile(stdout, "Error: Widget unavailable\r\n")
                                        | (waits for reader)

And now we're deadlocked. Your process is waiting for the helper process to finish reading all the data you wrote (specifically, waiting for it to read B), and the helper process is waiting for your process to finish reading the data it wrote to its stdout (specifically, waiting for you to read the error message).

There's a feature of pipes that can mask this problem for a long time: Buffering.

The pipe manager might decide that when somebody offers some bytes for sale, instead of making the writer wait for a reader to arrive, the pipe manager will be a market-maker and buy the bytes himself. The writer is then unblocked and permitted to continue execution. Meanwhile, when a reader finally arrives, the request is satisfied from the stash of bytes the pipe manager had previously bought. (But the pipe manager doesn't take a 10% cut.)

Therefore, the error case above happens to work, because the buffering has masked the problem:

You                            | Helper
WriteFile(stdin, "AB")         |
pipe manager accepts the write |
ReadFile(stdout, result)       |
(waits for read)               |
                               | ReadFile(stdin, ch)
                               | reads A
                               | encounters errors
                               | WriteFile(stdout, "Error: Widget unavailable\r\n")
Read completes                 |

As long as the amount of unread data in the pipe is within the budget of the pipe manager, the deadlock is temporarily avoided. Of course, that just means it will show up later under harder-to-debug situations. (For example, if the program you are driving prints a prompt for each line of input, then the problem won't show up until you give the program a large input data set: For small data sets, all the prompts will fit in the pipe buffer, but once you hit the magic number, the program hangs because the pipe is waiting for you to drain all those prompts.)

To avoid this problem, your program needs to keep reading from stdout while it's writing to stdin, so that neither will block the other. The easiest way to do this is to perform the two operations on separate threads.
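
Here is a minimal sketch of the two-thread approach. It is written in Python rather than Win32 so that it fits on one screen. The names run_filter, pump_stdin, and CHILD_SOURCE are invented for illustration, and the helper is a stand-in that upper-cases whatever it reads. A writer thread pumps stdin while the calling thread drains stdout, so a full pipe in one direction cannot stall the other.

```python
import subprocess
import sys
import threading

# Hypothetical helper program, for illustration only: it upper-cases each
# line of stdin and writes it to stdout right away. Because it produces
# output before it has consumed all of its input, a parent that writes
# everything first and reads afterwards can deadlock on large inputs.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin.buffer:\n"
    "    sys.stdout.buffer.write(line.upper())\n"
)


def run_filter(data: bytes) -> bytes:
    """Send data to the helper's stdin while draining its stdout."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )

    def pump_stdin() -> None:
        # Runs on its own thread, so blocking here cannot stop the reader.
        try:
            child.stdin.write(data)
        except OSError:
            pass  # the helper went away before reading everything
        try:
            child.stdin.close()  # tells the helper there is no more input
        except OSError:
            pass

    writer = threading.Thread(target=pump_stdin)
    writer.start()

    # Meanwhile, keep the output pipe flowing until the helper closes it.
    output = child.stdout.read()

    writer.join()
    child.stdout.close()
    child.wait()
    return output


if __name__ == "__main__":
    big_input = b"widget\n" * 200_000  # well beyond a typical pipe buffer
    print(len(run_filter(big_input)))
```

The same shape carries over to Win32: one thread calls WriteFile on hWriteStdin and then closes the handle, while another loops on ReadFile from hReadStdout until the read reports that the pipe has been closed. If all you need is "send this, collect that," Python's own Popen.communicate already does this pumping for you.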

Next time, another common problem with pipes.

Exercise: A customer reported that this function would sometimes hang waiting for the process to exit. Discuss.

int RunCommand(string command, string commandParams)
{
 var info = new ProcessStartInfo(command, commandParams);
 info.UseShellExecute = false;
 info.RedirectStandardOutput = true;
 info.RedirectStandardError = true;
 var process = Process.Start(info);
 while (!process.HasExited) Thread.Sleep(1000);
 return process.ExitCode;
}

Exercise: Based on your answer to the previous exercise, the customer responds, "I added the following code, but the problem persists." Discuss.

int RunCommand(string command, string commandParams)
{
 var info = new ProcessStartInfo(command, commandParams);
 info.UseShellExecute = false;
 info.RedirectStandardOutput = true;
 info.RedirectStandardError = true;
 var process = Process.Start(info);
 var reader = process.StandardOutput;
 var results = new StringBuilder();
 string lineOut;
 while ((lineOut = reader.ReadLine()) != null) {
  results.AppendLine("STDOUT: " + lineOut);
 }
 reader = process.StandardError;
 while ((lineOut = reader.ReadLine()) != null) {
  results.AppendLine("STDERR: " + lineOut);
 }
 while (!process.HasExited) Thread.Sleep(1000);
 return process.ExitCode;
}
Comments (18)
  1. henke37 says:

    The exercise answer is simple:

    There is no rule that the writes will all first be to one pipe and then to the other. Disregarding the buffer, an example situation is like this:

    Write to stdout.

    Stdout is read.

    Write to stderr.

    Stderr is read.

    Write to stdout.

    No more reading of any of the pipes, deadlock.

  2. Adam Rosenfield says:

    Simultaneously reading from a child process's stdout and stderr without the possibility of deadlock is tricky.  One option is to use multiple threads.  Another option is to use a named pipe opened with FILE_FLAG_OVERLAPPED (overlapped operations are not supported by anonymous pipes) and use asynchronous ReadFile calls along with WaitForMultipleObjects.  Neither option is trivial.

    Unix programmers would use select(2)/poll(2).

  3. Torben Nehmer says:

    You should be able to avoid a deadlock without threads by using PeekNamedPipe, which according to MSDN allows a peek into anonymous pipes as well (despite the name). Haven't tested it yet though. However, in combination with writing to the process's stdin it is quite tricky to get things right without a deadlock (if not impossible, as you cannot peek into the pipe's buffer as far as I know). I am currently dealing with a situation like this in Axapta 3.0, which doesn't support sane multi-threading, making things – well – interesting to say the least…

  4. Ben says:

    On Unix this is solved with SIGALRM, select or poll.

    It could also be solved easily with some sort of ReadFileTimeout function or WaitForSingleObject/WaitForMultipleObjects.

  5. Brian says:

    This may be too simple an answer which may well be mocked.

    Exercise 1 – it's hanging because the process is waiting for input.  And since you've redirected stdout/err, you never know it is waiting… you just get a blinky cursor.  So you should see what is actually being generated and process it accordingly.

    -customer attempts to follow the advice-

    Exercise 2 – The bytes in stdout are emptied and the bytes in stderr are emptied before the condition occurs that requires attention.  All you've done is, well, nothing.  And you fall back into the same condition in Exercise 1.

    -customer realizes they're going about the problem the wrong way and cleans their act up-

    Maybe that's just wishful speculation though.

  6. problem solved says:

    Solution: Increase buffer size until bugs go away.

  7. Ian Kelly says:

    Exercise #1: If the program being created writes enough data to the standard output/error to exceed the pipe buffers then there will be a deadlock. The helper program will be waiting for the spawning program to read the data, while the spawning program will be waiting for the helper program to exit.

    Exercise #2: MSDN points out this specific problem. If the helper program writes enough to the standard error stream to exceed the pipe buffers then a deadlock will occur. The helper program will be stuck waiting for the spawning program to read from standard error, but the spawning program will be waiting for the program to write to standard output or close the pipes/exit.

    MSDN points out that you can use the asynchronous methods BeginOutputReadLine or BeginErrorReadLine of the Process class to solve this issue, by asynchronously reading one stream and synchronously reading the other. (You can also read both asynchronously, or read both synchronously on separate threads.)

  8. Timothy Byrd says:

    Raymond – thanks for the stock market analogy. It made me smile.

  9. GSerg says:

    CreatePipe(&hReadStdin, &hWriteStdin, NULL, 0);

    Surely, SECURITY_ATTRIBUTES cannot be null? Otherwise the pipe handle will not be inherited by the child process.

  10. Tim says:

    For once I can solve the exercises, but it looks like Ian got the correct answers first.

  11. Medinoc says:

    Stupid software apparently ate my comment, since I got no confirmation message. And of course, clicking "back" gives you only a blank field. Anyway:

    The worst part about polling is that pipes are not listed as supporting wait functions ( msdn.microsoft.com/…/ms687032(VS.85).aspx ).

    At least now we can always use PeekNamedPipe(), since Win9x is more or less gone and even named pipes are implemented with anonymous pipes.

  12. Neil says:

    If he doesn't want the output of the process, why doesn't he redirect its output to NUL and/or start the process in a detached console?

  13. Leo Davidson says:

    PeekNamedPipe is virtually useless.

    How do you know when to call it?

    If you use a separate event (or other synchronization object), you might as well use that to signal when it's time to do a normal read of the pipe.

    If you don't, you end up polling, which is terrible.

    It really should be possible to pass pipe handles to the wait functions. Overlapped I/O is ridiculously difficult to get right if you work through all the possible cases (as Raymond has discussed in the past), unless you cheat and only read one character at a time or something.

  14. Adam Rosenfield says:

    @Medinoc: Actually, it's the other way around.  According to the docs for CreatePipe, "Anonymous pipes are implemented using a named pipe with a unique name. Therefore, you can often pass a handle to an anonymous pipe to a function that requires a handle to a named pipe."

    And yes, it's incredibly annoying that the blog software eats comments if you don't submit them within X seconds of loading the page, for some absurdly small value of X.  I sincerely hope somebody's informed the MSDN blogs folks about this.

  15. Medinoc says:

    @Adam Rosenfield: Damn, I have dyslexia today. You're right, absolutely right, it's what I was trying to say in the first place.

    About the comment eating part, I think it's linked to that captcha thing that used to be visible on the page: It's invisible now, but I guess it's that part that expires.

  16. Gabe says:

    The list of objects that MSDN says WaitForSingleObject can wait on is not an exhaustive list. For example, file handles are waitable yet not on the list. So it's quite likely that named pipes are perfectly acceptable to wait on too.

  17. Leo Davidson says:

    @Gabe: File handles are only waitable in very specific, limited situations involving overlapped I/O, where you can already tie an event to the operation anyway.

    stackoverflow.com/…/waitforsingleobject-on-a-file-handle

    As for pipes, if they even are waitable handles, I would not use them in a wait call without knowing what their wait semantics are (i.e. when they become signalled/unsignalled). You'd have to make an awful lot of assumptions to use them, and it'd be behaviour which could change.

  18. mixedup says:

    Do the customers really bother you with such simple things?!? In my project it took me a second to think "gosh that'll deadlock for sure" and think about asynchrony.
