Comparing WaitOnAddress with futexes (futexi? futexen?)


Linux has a synchronization primitive called a futex which is similar to Windows's Wait­On­Address in that both let you create a synchronization object out of nothing. (Well, okay, you need to set aside some memory.)

The basic operations on a futex depend on what color glasses you're wearing. If you are wearing syscall-colored glasses, then the basic operations are WAIT and WAKE. But we're going to wear user-mode-colored glasses, in which the basic operations are UP and DOWN. (In futex-speak, "up" and "down" are verbs. And you thought Microspeak was weird.)

Conceptually, a futex is a 32-bit variable that acts like a semaphore with a maximum count of 1.

To initialize a futex, set it to 0 if you want it to be initially unavailable, or 1 if you want it to be initially available.

"Downing" a futex is what you do when you want to wait for the futex to become available. Atomically decrement the futex, and if the result is 0, then you have successfully claimed the futex. Otherwise, call into the kernel to wait for the futex to become available. (This isn't perfectly accurate, but bear with me.)

"Upping" a futex is what you do when you want to make the futex available. Atomically increment the futex, and if the result is 1, then there are no waiters, and you're done. Otherwise, set it to 1 explicitly (indicating that the futex is now free) and call into the kernel to tell it to release the waiting threads. (It is an error to "up" a futex that is already available.)

In theory, the values of a futex correspond to these states:

  • 1 means that the futex is available.
  • 0 means that the futex is unavailable, and there are no waiters.
  • A negative number means that the futex is unavailable, and the number of waiters is the negative of the value. For example, a value of −2 means that there are two waiters.

In practice, however, any negative number means that the futex is unavailable. The actual number of waiting threads is kept in the kernel. The actual rule for "downing" a futex is that if you down it to a negative number, you should set the futex to −1 and then call into the kernel. Forcing the value to −1 avoids underflow if more than 2 billion threads wait on the futex.¹

A futex, therefore, is a lightweight version of an auto-reset event. There is at most one token, and the only operations are to claim the token (waiting as necessary for a token to become available) and to release the token.

An obvious difference between futexes and Wait­On­Address is that a futex mimics an auto-reset event (or a single-token semaphore, which is basically the same thing), whereas Wait­On­Address can be used to mimic a large number of things depending on how you use it.

A notable difference between futexes and Wait­On­Address is that the contents of the futex are not your choice; the contents are controlled by the rules for futexes. If you have other information you want to keep track of, you need to keep it somewhere outside the futex. On the other hand, Wait­On­Address lets you put anything you want in the memory, and it is typically some central bookkeeping information that you would have needed to keep track of anyway.

A big difference is that futexes work across processes (via shared memory), whereas Wait­On­Address works only within a process.

Here's a summary table, because people like tables.

futex Wait­On­Address
Size 4 bytes 1, 2, 4, or 8 bytes (your choice)
Contents Controlled by futex Controlled by you
Access Must use atomic primitives No restrictions
Scope Can be cross-process Limited to a single process
Operations "Down" (wait for object) Wait for value to change
"Up" (release object, wake waiters) Wake waiters
Acts like Event (or single-token semaphore) Whatever you want
Spurious wakes? Yes Yes

Both primitives enter kernel mode only if they need to wait or wake. If there is no contention, then both primitives operate entirely in user mode.

¹ If you can get 2 billion threads to wait on a futex, then I'm both impressed and disappointed. Impressed that you were able to create 2 billion threads in the first place, and disappointed that you have a futex so hot that you managed to get 2 billion threads waiting on it. You should fix that.

Comments (19)
  1. Karellen says:

    As index -> indices, so futex -> futices.

    (For the nitpickers, yes, indexes is a word a word in your dictionary, but no, it’s not the plural of index; it’s a conjugation of “to index”, as in “she indexes”.

    I now eagerly wait for Skitt’s Law to strike…)

    1. laonianren says:

      I have Windows 3.1 (a licensed copy for which I still have the floppies!) installed in DOSBox just so I can run my ancient copy of the NSOED. And it says:

      index n. / Pl. indexes, (esp. technical) indices

      1. Lesson learned: oral language does not play nicely with pedants.

    2. Roman says:

      The “ex” in “futex” isn’t from “index”, though. “Futex” stands for “fast user-space mutex” and “mutex” stands for “mutual exclusion”. So your reasoning doesn’t apply.

      1. Karellen says:

        “Index”, “mutual” and “exclusion” all come from Latin roots (indicō, mūtuus, excludere), so I posit that it is entirely appropriate that “mutex” take a plural form in the same style as “index” (and “appendix”, fwiw).

        1. Dirk Gently says:

          “Mutex” is a portmanteau of latin words, but it’s not a latin word itself, so the unusual latin plural should not apply to it.

          1. Karellen says:

            So what other, more commonly (correctly) used, plural form for English words ending in “-<vowel>x” would you suggest instead?

          2. @karellen: English grammar patterns are more traditionally informed by the etymological roots of the word than the construction of the word itself. When the roots are non-obvious, mixed, or simply non-existent, then different grammatical patterns may be applied across a population.

          3. GregM says:

            Karellen, I’d say we have a couple choices, -es (boxes, foxes, axes, taxes, annexes, equinoxes, fixes, sexes, hexes), and -en (oxen).

        2. Dirk Gently says:

          wiktionary says the plural of mutex is mutexes. This makes sense to me, as “-ices” ending is only applied to latin words. Other online dictionaries I found didn’t have an entry for “mutex”.

    3. alegr1 says:

      Do they speak English or Latin in What?

  2. skSdnW says:

    These atomic and locks post are always good! Were there a lot of kernel changes required to support the new stuff in Vista? And before that, keyed event was the only real change?

  3. Bradley says:

    “Forcing the value to −1 avoids underflow if more than 2 billion threads wait on the futex.”

    Thanks, now everyone is looking at me strange because I burst out laughing in the office.

    1. Joshua says:

      Well considering that requires not less than 16TB RAM …

  4. Did this really deserve a patent? I’m confused what the innovation is here.
    https://www.google.com/patents/US20120144406

    1. I just listed an entire table full of interesting things.

      1. GWO says:

        Besides, patents are not granted based on being interesting, they’re for novelty and non-obviousness. If no-one else thought of using the synchronisation object to store extra state, then it novel. And if a patent examiner then decides its not obvious, it gets a patent.

  5. Hasturkun says:

    I think your description of futex is incorrect.

    As far as I can tell from the documentation of the futex(2) syscall, the only case where the contents of the futex are specified is when using PI futexes, in all other cases the contents are yours to do as you like.

    It seems to me that the user mode documentation is only giving an example usage, as shown in the futex(7) notes:

    This man page illustrates the most common use of the futex(2) primitives; it is by no means the only one.

    The futex itself behaves much like WaitOnAddress does, with the addition of some other operations such as FUTEX_CMP_REQUEUE.

  6. Joshua says:

    Linux “userspace” futexes have this nice property that they can be unlocked when one process crashes out while holding a futex in a shared memory segment. (Yes, the programmer has to bother to set this up to make it work.) I’ve since learned this isn’t as entirely desirable as one would hope as now on acquiring lock the code must check for corruption due to other process being interrupted at any point. But still…

Comments are closed.

Skip to main content