How can SIGINT be safely delivered on the main thread?


Commenter AnotherMatt wonders why Win32 console programs deliver console notifications on a different thread. Why doesn’t it deliver them on the main thread?

Actually, my question is the reverse. Why does unix deliver it on the main thread? It makes it nearly impossible to do anything of consequence inside the signal handler. The main thread might be inside the heap manager (holding the heap critical section) when the signal is raised. If the signal handler tried to access the heap, it would deadlock with itself if you’re lucky, or just corrupt the heap if you aren’t.

For example, consider this signal handler:

void catch_int(int sig_num)
{
    /* re-set the signal handler again to catch_int, for next time */
    signal(SIGINT, catch_int);
    /* and print the message */
    printf("Don't do that");
    fflush(stdout);
}

What happens if the signal is raised while the main program is executing its own fflush, say after it had already flushed half the buffer? If two threads called fflush, the second caller would wait for the first to complete. But here, it’s all coming from within the same thread; the second caller can’t wait for the first caller to return, since the first caller can’t run until the second caller returns!

(Note also that this signal handler potentially modifies errno, which can lead to “impossible” bugs in the main program.)
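
For contrast, here is a minimal sketch of a handler that stays within what POSIX permits. It assumes a POSIX system; the names catch_int_safe and install_catch_int_safe are made up for illustration. It replaces printf and fflush with a raw write(), and it saves and restores errno so the main program never sees the handler's side effects.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler restricted to async-signal-safe calls: no stdio, no heap. */
static void catch_int_safe(int sig_num)
{
    static const char msg[] = "Don't do that\n";
    int saved_errno = errno;     /* write() may overwrite errno */
    ssize_t n;

    (void)sig_num;
    /* write() is on the POSIX async-signal-safe list; printf() is not. */
    n = write(STDOUT_FILENO, msg, sizeof msg - 1);
    (void)n;                     /* nothing useful to do on failure here */

    errno = saved_errno;         /* hide the handler from the main program */
}

/* Hypothetical helper: installs the handler with sigaction(), which keeps
   it installed, so the handler need not call signal() again. */
static int install_catch_int_safe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = catch_int_safe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;    /* restart interrupted system calls */
    return sigaction(SIGINT, &sa, NULL);
}
```

The message still bypasses the stdio buffer, so it can appear out of order relative to output the main program has buffered but not yet flushed.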

Win32 doesn’t believe in interrupting user-mode code with other user-mode code asynchronously because it makes it impossible to reason about the state of the process. Delivering the console notification on a second thread means that if the second thread tries to access the heap while the first thread is inside the heap manager, the second thread will dutifully wait for the heap to stabilize before it goes ahead and starts mucking with it.

Comments (20)
  1. Gabe says:

    This is the same reason you had to be careful about what calls you made in TSRs.

  2. Karellen says:

    "Why does unix deliver it on the main thread?"

    History. IIRC, threads and the paraphernalia surrounding them weren’t standardised for quite a while, but POSIX needed to define a way of working with signals. So setting a sig_atomic_t, which can be tested on the main (single) thread later, was defined as The Right Thing. When POSIX later standardised threads, it needed to do so in a stable backwards-compatible manner. Sending signals to the main thread where the old Right Thing still worked is fine, as it means that you don’t have to fiddle with unrelated parts of your program just because one part of it (or a library you now want to use) wants to add threading support.
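
The sig_atomic_t approach described here might look like the following sketch. The names got_sigint, note_sigint and consume_sigint are invented for illustration; the handler does nothing but record the event, and the main loop picks it up when it is in a known state.

```c
#include <signal.h>

/* The one object type the C standard lets a handler write to. */
static volatile sig_atomic_t got_sigint = 0;

static void note_sigint(int sig_num)
{
    (void)sig_num;
    got_sigint = 1;   /* record the event and return; no other work here */
}

/* Hypothetical helper: returns 1 if SIGINT arrived since the last call.
   Signals that arrive close together are coalesced into one report. */
static int consume_sigint(void)
{
    if (!got_sigint)
        return 0;
    got_sigint = 0;
    return 1;
}
```

A program would install the handler with signal(SIGINT, note_sigint) and call consume_sigint() once per iteration of its main loop.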

    Fortunately, with signalfd()[0] in Linux 2.6.22/glibc 2.8 signal delivery becomes part of the "everything is a file" unix philosophy and can be waited on with select()/poll()/epoll()/etc… which is a lot more elegant than asynchronous syscall-interrupting delivery.
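
A minimal Linux-only sketch of the signalfd() approach follows. The helper names open_sigint_fd and read_signal are assumptions made for this example. The signal has to be blocked first; after that it arrives as data on a descriptor and no handler runs at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical helper: block SIGINT and return a descriptor that becomes
   readable when SIGINT is pending. Returns -1 on failure. */
static int open_sigint_fd(void)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    /* The signal must be blocked, or it is still delivered the usual way. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Hypothetical helper: read one pending signal from the descriptor.
   Returns the signal number, or -1 on error. Blocks until one arrives. */
static int read_signal(int sfd)
{
    struct signalfd_siginfo info;

    if (read(sfd, &info, sizeof info) != (ssize_t)sizeof info)
        return -1;
    return (int)info.ssi_signo;
}
```

The returned descriptor can be handed to select(), poll() or epoll alongside sockets and pipes.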

    With timerfd()[1] from Linux 2.6.25 things get tidier still.

    (Hopefully, in the distant future, these might get ported to other unixen, and then even further down the line standardised by POSIX as an alternate way to handle signals.)

    I haven’t looked into it closely, but I suspect that Plan9[2] will have already been doing this for a while… :)

    [0] http://kerneltrap.org/man/linux/man2/signalfd.2

    [1] http://kerneltrap.org/man/linux/man2/timerfd_create.2

    [2] http://en.wikipedia.org/wiki/Plan_9_from_Bell_Labs

  3. Koro says:

    Makes me think, I’d love if Unix had WSAEventSelect/WaitForMultipleObjects. Makes all threaded socket code SOOOO much easier. Just wait on both the socket event and a "cancel" event.

    In fact, just having a WaitForMultipleObjects equivalent would be more than enough. It seems everything in Unix has their own function to wait on them which are separate.

  4. Karellen says:

    Koro > If you supply your cancel event via a file descriptor (e.g. a pipe(2)) then you can wait on it and a socket with select()/poll()/epoll().

    If you’re on Linux (and some other unices, but you’ll have to check the documentation to know which) then you can reliably write() to a file descriptor created with pipe() in a signal handler.

    A search for "self-pipe trick"[0] will give you more info.

    [0] http://www.google.com/search?q=%22self-pipe+trick%22
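
The self-pipe trick can be sketched roughly as below, assuming a POSIX system. The names wake_pipe, wake_on_signal, setup_self_pipe and wait_for_signal are hypothetical. The handler only writes one byte to a pipe; the main loop waits on the read end with poll(), where it could wait on sockets at the same time.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };   /* [0] = read end, [1] = write end */

static void wake_on_signal(int sig_num)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)sig_num;
    ssize_t n = write(wake_pipe[1], &byte, 1);   /* async-signal-safe */

    (void)n;   /* if the pipe is full, a wake-up is already queued */
    errno = saved_errno;
}

/* Hypothetical helper: create the pipe and route SIGINT into it. */
static int setup_self_pipe(void)
{
    struct sigaction sa;

    if (pipe(wake_pipe) == -1)
        return -1;
    /* Non-blocking, so the handler can never stall on a full pipe. */
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(wake_pipe[i], F_GETFL);
        if (flags == -1 ||
            fcntl(wake_pipe[i], F_SETFL, flags | O_NONBLOCK) == -1)
            return -1;
    }
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wake_on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGINT, &sa, NULL);
}

/* Hypothetical helper: wait up to timeout_ms. Returns the signal number,
   0 on timeout, or -1 on error. More pollfd entries (sockets, say) could
   be added to wait on them in the same call. */
static int wait_for_signal(int timeout_ms)
{
    struct pollfd pfd = { .fd = wake_pipe[0], .events = POLLIN };
    unsigned char byte;
    int ready;

    do {   /* poll() is not restarted by SA_RESTART, so retry on EINTR */
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready == -1 && errno == EINTR);
    if (ready <= 0)
        return ready;
    return read(wake_pipe[0], &byte, 1) == 1 ? byte : -1;
}
```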

  5. Alexander Grigoriev says:

    "Fortunately, with signalfd()[0] in Linux 2.6.22/glibc 2.8 signal delivery becomes part of the "everything is a file" unix philosophy and can be waited on with select()/poll()/epoll()/etc… which is a lot more elegant than asynchronous syscall-interrupting delivery."

    Which conceptually is no different than PostMessage/MsgWaitMultipleObjects

  6. Koro says:

    Thank you a lot Karellen and Alexander, I will remember that when I port my code to Unix in a few years :)

  7. BryanK says:

    Karellen — the link that Raymond provided (to securecoding.cert.org, SIG31-C) has a link itself to SIG30-C, which says that the only functions that are guaranteed to be safe to call from a signal handler (by the C99 standard) are abort, _Exit, and signal (as long as signal’s first argument is the same signal that’s being caught).

    However, it also says that POSIX adds a few more functions to the list; in particular, write() and read() are both async-safe according to POSIX.

    So I believe that any OS that claims to be POSIX compliant should have a signal-safe write().  Of course, stdio is out of the question, but raw low-level write is possible.

  8. SuperKoko says:

    From Raymond:

    "Why does unix deliver it on the main thread?"

    It doesn’t necessarily. A signal delivered to a process is received by the first thread that doesn’t mask the signal.

    If you wish to receive no signal on the main thread, simply mask all signals on it with sigprocmask.

    Signals specifically sent to a specific thread will remain pending during all the time the signal is masked on the thread.

    So, sigprocmask can be used for critical code sections, where signals aren’t accepted.

    Or, you may mask all signals, and then use sigwait().

  9. Tim Smith says:

    Back in my old fart days with VMS, if you did any type of programming with asynchronous system traps (AST) you had to worry about this sort of stuff.  

    Instead of using things like a mutex to synchronize data access, you would have to disable ASTs.  Given that having to worry about disabling AST or locking a lock boils down to basically the same thing, the question is how well protected from ASTs is the OS and runtime libraries such as CRTL.  *shrug*  It has been too long for me to remember exactly, but off the top of my head, I don’t remember having to worry about the OS calls.  But is this much different from having to worry about a given library being multithread-safe?

    I will say that given the choice between the two, I’ll take the Win32 model.  When working with ASTs, you always did the least amount of work possible in the AST.

  10. Yuhong Bao says:

    Mac OS used completion routines to notify apps when async I/O was complete and these ran at interrupt time. There were some routines that were "interrupt-safe" and thus could be used at interrupt time:

    http://developer.apple.com/technotes/tn/tn1104.html

  11. John says:

    signal() dates back to at least Unix V6, released in 1975.  Threads weren’t widely available on Unix until the 1990’s.  IMHO, Linux didn’t really have usable threading until the 2.6 kernel.

  12. Joel says:

    "Makes me think, I’d love if Unix had WSAEventSelect/WaitForMultipleObjects. Makes all threaded socket code SOOOO much easier. Just wait on both the socket event and a "cancel" event.

    In fact, just having a WaitForMultipleObjects equivalent would be more than enough. It seems everything in Unix has their own function to wait on them which are separate."

    No, that’s not the case.  You can use select()/poll() to wait on file descriptors, not only for input and output, but also for other events on those descriptors.  If you use the self-pipe trick (instead of signalfd), you can also wait on signals with select().  It’s trivial to implement.

  13. Roger says:

    There are two types of signal, asynchronous (SIGINT, SIGQUIT, etc) and synchronous (SIGSEGV, SIGBUS etc). Historically UNIX programs were single threaded, so a thread was not an option. But setting a flag, writing to a pipe or using longjmp() are all good solutions.

    For the MT world, using sigwait() in the main thread and having other threads do the application’s work gets the best of all worlds.

    For the curious, try SIGSEGV on Win32.

  14. Roger says:

    On Win32 you can inject code into the main thread by duplicating the current thread handle and using QueueUserAPC.

    However doing a longjmp() in a handler called during select() really screws winsock up.

  15. Daniel Colascione says:

    THE HARD WAY

    See also pselect, which can keep signals blocked except when waiting for IO, effectively allowing select() to wait on signals.

    Another option is to just keep signals blocked, and use signals for I/O completion.

    I.e., block all signals when the program starts up using sigmask(). (Though if you’re polite, you still want SIGINT and SIGTERM to exit the program immediately.)

    Next, when you open a file descriptor on which you want to block, use fcntl() to make it send a signal. Have it send a realtime signal so the signals queue up.

    Then, when you’re ready to wait for IO, use sigwait(). When that returns, check which signal triggered the return and you’re set.

    THE EASY WAY

    Use the BSD-licensed libevent (http://monkey.org/~provos/libevent/).
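
The pselect idea from THE HARD WAY could be sketched as follows, assuming a POSIX system. The names sigint_seen, mark_sigint and wait_unblocking_sigint are hypothetical. SIGINT stays blocked everywhere except inside the pselect call, which swaps in a mask that admits it for the duration of the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t sigint_seen = 0;

static void mark_sigint(int sig_num)
{
    (void)sig_num;
    sigint_seen = 1;
}

/* Hypothetical helper: keep SIGINT blocked except while inside pselect().
   Returns 1 if SIGINT interrupted the wait, 0 on timeout, -1 on error. */
static int wait_unblocking_sigint(int timeout_sec)
{
    struct sigaction sa;
    sigset_t block, during_wait;
    struct timespec timeout = { .tv_sec = timeout_sec, .tv_nsec = 0 };

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = mark_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    /* during_wait receives the mask that was in effect before blocking. */
    if (sigprocmask(SIG_BLOCK, &block, &during_wait) == -1)
        return -1;
    sigdelset(&during_wait, SIGINT);   /* admit SIGINT only while waiting */

    /* No descriptors in this sketch; a real program would pass fd sets. */
    if (pselect(0, NULL, NULL, NULL, &timeout, &during_wait) == -1)
        return (errno == EINTR && sigint_seen) ? 1 : -1;
    return 0;
}
```

Because the mask swap and the wait happen atomically, a signal cannot slip in between unblocking and waiting, which is the race that plain select() leaves open.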

  16. 640k says:

    WinApis use threads as a golden hammer. They make programs far more complex, which we could observe recently in this article: http://blogs.msdn.com/oldnewthing/archive/2008/07/25/8770548.aspx

  17. In code that conforms to the letter of the C (or C++) standard, there’s not a lot you can do in a signal handler, that is true.

    Windows runs the signal handler in a new thread, and thus allows more functions to be called from a signal handler. On the one hand, this is a good thing: you can do more in a signal handler.

    On the other, this is a bad thing: you now have a separate thread, and have to ensure that it correctly protects against race conditions with the other threads in the application. By allowing the handler to do more, you lead users along the path of wanting to access shared data, and thus exposing them to all the issues regarding protecting shared data from accesses from multiple threads.

    It’s all just swings and roundabouts.

  18. Drew Frezell says:

    Based on the documentation from opengroup, fflush() is not an async signal safe function.  Here is a list of functions that can be safely called from a signal handler.

    http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html#tag_02_04_03

    The basic guideline: if you call a function that may block, you shouldn’t be calling it in a signal handler.

  19. Daniel Colascione says:

    Drew, note that it’s perfectly safe to call socket, connect, read, and write from a signal handler. Even fork and exec are permissible. You can actually do a lot from a signal handler, provided you obey certain restrictions and don’t use non-reentrant *library* functions.

  20. Ian says:

    I don’t know much about UNIX, but I’ve spent quite a while writing code for OS-9 (Radisys’, not the one from Apple).

    The way to deal with signals in OS-9 is to put them into a queue in the signal handler, and then unqueue them in the main event loop and act on them.

    I honestly think this is the best way to deal with signals in any case. If your system treats signals as true interrupts, you have no idea what the system is doing at the time the signal is first processed, whether it is in the main thread or not. A secondary thread that terminates the process on a SIGINT would be just as bad as calling an unsafe function from the main thread.

    When I ported the software to Windows, I used QueueUserAPC() (as Roger suggested above). There is a slight difference in that the ‘quasi-signal’ isn’t processed until the thread enters an alertable wait state, but that didn’t matter for my application.
