It’s fine to use fibers, but everybody has to be on board with the plan


We saw fibers a long time ago, when I looked at how you can use them as a form of coroutine to simplify the writing of enumerators. A fiber is a handy tool, but it's a tool with very sharp edges.

Since fibers are promiscuous with threads, you have to be careful when running code that cares about what thread it is running on, because that code may discover that its thread changed out from under it.

For example, critical sections and mutexes remember which thread owns them. If you enter a critical section on a fiber, and then you unschedule the fiber, then reschedule it onto a different thread, and then you leave the critical section, your critical section will end up corrupted because you broke the rule that says that a critical section must be exited on the same thread that entered it.
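
To make the failure mode concrete, here is a minimal sketch of that sequence (illustrative names only, not production code): a fiber enters a critical section while hosted on thread A, yields, and is later resumed on thread B, where the matching LeaveCriticalSection runs on the wrong thread.

```c
// Sketch of the broken sequence described above (illustrative names only).
#define _WIN32_WINNT 0x0600
#include <windows.h>
#include <stdio.h>

static CRITICAL_SECTION g_cs;
static LPVOID g_workerFiber;      // the fiber that misbehaves
static LPVOID g_schedulerFiberA;  // thread A's scheduler fiber
static LPVOID g_schedulerFiberB;  // thread B's scheduler fiber

static VOID CALLBACK WorkerFiberProc(LPVOID lpParameter)
{
    (void)lpParameter;

    // First scheduled on thread A.
    EnterCriticalSection(&g_cs);          // ownership is recorded against thread A
    printf("entered on thread %lu\n", GetCurrentThreadId());
    SwitchToFiber(g_schedulerFiberA);     // unscheduled while owning the lock

    // Resumed later on thread B.
    printf("leaving on thread %lu\n", GetCurrentThreadId());
    LeaveCriticalSection(&g_cs);          // wrong thread: ownership is now corrupted
    SwitchToFiber(g_schedulerFiberB);
}

static DWORD WINAPI ThreadB(LPVOID lpParameter)
{
    (void)lpParameter;
    g_schedulerFiberB = ConvertThreadToFiber(NULL);
    SwitchToFiber(g_workerFiber);         // reschedule the fiber on a different thread
    ConvertFiberToThread();
    return 0;
}

int main(void)
{
    InitializeCriticalSection(&g_cs);
    g_schedulerFiberA = ConvertThreadToFiber(NULL);
    g_workerFiber = CreateFiber(0, WorkerFiberProc, NULL);

    SwitchToFiber(g_workerFiber);         // worker enters the CS, then yields back

    HANDLE hThreadB = CreateThread(NULL, 0, ThreadB, NULL, 0, NULL);
    WaitForSingleObject(hThreadB, INFINITE);
    CloseHandle(hThreadB);

    DeleteFiber(g_workerFiber);
    DeleteCriticalSection(&g_cs);
    return 0;
}
```

The LeaveCriticalSection on the second thread typically does not fail loudly; it just leaves the ownership bookkeeping inconsistent, which is exactly the corruption described above.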

Actually, you were already in bad shape once you unscheduled the fiber while it owned a resource: An unscheduled fiber cannot release the resource. Unscheduling a fiber is like suspending a thread: Anybody who later waits for that fiber to do anything will be waiting for an awful long time, because the fiber isn't running at all. The difference, though, is that the fiber is unscheduled at controlled points in its execution, so you at least have a chance at suspending it at a safe time if you understand what the fiber is doing.

For example, suppose you enter a critical section on a fiber, and then unschedule the fiber. Some time later, a thread (either running as a plain thread or a thread which is hosting a fiber) tries to enter the critical section. One of two things can happen (the first case is sketched in code below the list):

  1. The thread happens to be the same one that was hosting the fiber that entered the critical section. Since a thread is permitted to re-enter a critical section it had previously acquired, the attempt to enter the critical section succeeds. You now have two chunks of code both running inside the critical section, which is exactly what your critical section was supposed to prevent. Havoc ensues.
  2. The thread happens to be different from the one that was hosting the fiber that entered the critical section. That thread therefore blocks waiting for the critical section to be released. But in order for that to happen, you have to reschedule the owning fiber on its original thread so it can exit its protected region of code and release the critical section.
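
The first outcome is easy to reproduce, because critical sections permit recursive acquisition by the owning thread. A minimal sketch (illustrative names only, single thread):

```c
// Sketch of outcome 1: the fiber that entered the critical section is
// unscheduled, and the thread that was hosting it then enters the same
// critical section, which succeeds as a recursive acquisition because
// ownership is recorded against the thread, not the fiber.
#include <windows.h>
#include <stdio.h>

static CRITICAL_SECTION g_cs;
static LPVOID g_hostFiber;

static VOID CALLBACK WorkerFiberProc(LPVOID lpParameter)
{
    (void)lpParameter;
    EnterCriticalSection(&g_cs);   // the hosting thread is recorded as the owner
    SwitchToFiber(g_hostFiber);    // unscheduled while "holding" the lock
    // (never resumed in this sketch)
}

int main(void)
{
    InitializeCriticalSection(&g_cs);
    g_hostFiber = ConvertThreadToFiber(NULL);
    LPVOID worker = CreateFiber(0, WorkerFiberProc, NULL);

    SwitchToFiber(worker);         // the worker enters the CS, then yields back

    // Same thread, unrelated code path: this does not block. It succeeds as a
    // recursive acquisition, so two chunks of code are now "inside" the region
    // the critical section was supposed to protect.
    EnterCriticalSection(&g_cs);
    printf("re-entered on thread %lu\n", GetCurrentThreadId());
    LeaveCriticalSection(&g_cs);

    // The critical section is still logically held on behalf of the unscheduled
    // worker fiber, so there is no clean way to tear any of this down.
    DeleteFiber(worker);
    return 0;
}
```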

More generally, if you use an object which has thread affinity on a fiber, you are pretty much committed to keeping that fiber on that thread until the affinity is broken.

This affinity can be subtle, because most code was not written with fibers in mind. Any code which calls TlsGetValue has thread affinity, because thread local storage is a per-thread value, not a per-fiber value. (This also applies to moral equivalents of TlsGetValue, like code which calls GetCurrentThreadId and uses the result as a lookup key in a table.) You need to use FlsGetValue to get values which follow fibers around. But on the other hand, if the code is not running on a fiber, then you can't call FlsGetValue, since there is no fiber to retrieve the value from. This dichotomy means that it's very hard, if not impossible, to write code that is both thread-safe and fiber-aware if it needs to store data externally on a per-thread/per-fiber basis. Even if you manage to detect whether you are running on a thread or a fiber and call the appropriate function, if somebody calls ConvertThreadToFiber or ConvertFiberToThread, then the correct location for storing your data changes behind your back.
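
Here is a sketch of that dichotomy. The helper names are hypothetical, and IsThreadAFiber requires Windows Vista or later; the point is that even a careful guess goes stale the moment somebody converts the thread.

```c
// Hypothetical helpers that try to store a per-"unit of execution" value in
// whichever slot currently applies: FLS when hosted on a fiber, TLS otherwise.
#define _WIN32_WINNT 0x0600
#include <windows.h>
#include <stdio.h>

static DWORD g_tlsIndex;   // per-thread slot (TlsAlloc)
static DWORD g_flsIndex;   // per-fiber slot (FlsAlloc)
static int   g_answer = 42;

static void *GetMyContext(void)
{
    if (IsThreadAFiber()) {
        return FlsGetValue(g_flsIndex);   // value travels with the fiber
    } else {
        return TlsGetValue(g_tlsIndex);   // value stays with the thread
    }
}

static BOOL SetMyContext(void *value)
{
    if (IsThreadAFiber()) {
        return FlsSetValue(g_flsIndex, value);
    } else {
        return TlsSetValue(g_tlsIndex, value);
    }
}

int main(void)
{
    g_tlsIndex = TlsAlloc();
    g_flsIndex = FlsAlloc(NULL);      // no cleanup callback in this sketch

    SetMyContext(&g_answer);          // stored in the TLS slot: not a fiber yet

    ConvertThreadToFiber(NULL);       // surprise! the "correct" slot just changed

    // GetMyContext now consults FLS and finds nothing, even though the value
    // we stored a moment ago is still sitting untouched in the TLS slot.
    printf("context after conversion: %p\n", GetMyContext());

    TlsFree(g_tlsIndex);
    FlsFree(g_flsIndex);
    return 0;
}
```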

If you are calling into code that you do not yourself control, then in the absence of documentation to the contrary, you don't really have enough information to know whether the function is safe to call on a fiber. For example, C runtime functions like strcmp have thread affinity (even though there's nothing obviously threadlike about comparing strings) because they rely on the current thread's locale.

Bottom line (similar to the bottom line from last time): You have to understand the code that runs on your fiber, or you may end up accidentally stabbing yourself in the eyeball.

Bonus chatter: Structured exception handling is fiber-safe since it is stack-based rather than thread-based. Note, however, that when you call ConvertThreadToFiber, any active structured exception handling frames on the thread become part of the fiber.
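
For illustration, here is a minimal sketch (not from the original article) of the stack-based behavior: an exception raised on a fiber is caught by the handler frame sitting on that fiber's own stack.

```c
// A structured exception raised inside a fiber is handled by the __try/__except
// frame on the fiber's stack, which is the sense in which SEH is stack-based
// rather than thread-based.
#include <windows.h>
#include <stdio.h>

static LPVOID g_mainFiber;

static VOID CALLBACK FiberWithHandlerProc(LPVOID lpParameter)
{
    (void)lpParameter;
    __try {
        RaiseException(0xE0000001, 0, 0, NULL);  // arbitrary user-defined code
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        printf("caught on the fiber's own stack\n");
    }
    SwitchToFiber(g_mainFiber);
}

int main(void)
{
    g_mainFiber = ConvertThreadToFiber(NULL);
    LPVOID fiber = CreateFiber(0, FiberWithHandlerProc, NULL);
    SwitchToFiber(fiber);
    DeleteFiber(fiber);
    return 0;
}
```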

Comments (13)
  1. Sunil Joshi says:

    And presumably all this is why UMS will replace fibers.

  2. John says:

    Maybe it’s just me, but I’ve never understood what a fiber is actually useful for.  From the MSDN page it seems like it is just a mechanism to allow your application to keep track of and schedule its own quasi-threads of execution instead of using the built-in operating system threads.  I’m not sure why I would ever want to do this.  I heed all the dire warnings and just stay away from fibers entirely.

    [As Larry Osterman explained in 2005, fibers were created for customers like SQL Server (which calls it “lightweight pooling”.) -Raymond]
  3. Pi says:

    Should I honorably commit suicide for the good of humanity if I’ve been a professional programmer for four years and have never heard of fibers?

    [It’s okay. Fibers are a solution for a niche problem, and changes in technology in the meantime have weakened the advantage they provide. See Larry’s article I linked to in an earlier comment. -Raymond]
  4. Vilx- says:

    I’m sorry to point out that the “fibers are promiscuous with threads” link in the article is displaying Apple-affinity and has changed its protocol to iHTTP.

    [Fixed. iThanks. (In my mind, iX is an Intel thing. They were doing it long before Apple did. They even tried to trademark the letter “i”.) -Raymond]
  5. acq says:

     For example, C runtime functions like strcmp have thread affinity

    Isn’t it "like stricmp"? strcmp is "simpler":

    http://msdn.microsoft.com/en-us/library/e0z9k731(VS.80).aspx

    "The strcmp functions differ from the strcoll functions in that strcmp comparisons are not affected by locale, whereas the manner of strcoll comparisons is determined by the LC_COLLATE category of the current locale"

  6. Gabe says:

    John: Fibers are threads you schedule yourself. If you have lots of calculations that are best implemented with their own stack but don’t want the overhead of kernel threads, you would use fibers.

    For example, let’s say that you’ve got some function that wants to enumerate so many elements of some complicated data structure. You could implement this with a function that recursively walks the data structure, yielding a new element at every step. Since it’s recursive, it needs its own stack, meaning it can’t be called as a function without running to completion. If you need to be able to consume its output before it finishes, you have to run it on a separate thread. Unfortunately that’s going to be slow because every time your consumer needs a new item, it will signal an event indicating that it’s ready for a new item and wait on an event that will indicate when the new item is available. That wait causes the kernel to schedule a new thread on that CPU, which is not necessarily the thread that is producing new items. It’s much simpler to just tell the CPU to switch stacks (and registers) to the producer function for the brief time required to produce the next item. You could probably do this hundreds of times in the amount of time you would otherwise spend waiting for your thread to be scheduled.

    As Raymond said, it’s not common to be in this scenario. Personally, I prefer to get my fibers from popcorn (see http://blogs.msdn.com/oldnewthing/archive/2010/02/26/9969665.aspx#comments).
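
For reference, here is a minimal sketch (illustrative names, single thread, not part of the original comment) of the producer/consumer pattern Gabe describes: a producer fiber keeps its state on its own stack and yields one item at a time to the consumer, with no kernel scheduling in between.

```c
// A producer fiber with its own stack yields items to the consumer (main)
// via SwitchToFiber; no events, no waits, no kernel scheduler involvement.
#define _WIN32_WINNT 0x0600
#include <windows.h>
#include <stdio.h>

static LPVOID g_consumerFiber;   // the fiber the consumer (main) runs on
static LPVOID g_producerFiber;
static int    g_currentItem;
static BOOL   g_done;

// Stand-in for "recursively walks a complicated data structure": here it just
// counts, but it could keep arbitrary state on its own stack between yields.
static VOID CALLBACK ProducerFiberProc(LPVOID lpParameter)
{
    (void)lpParameter;
    for (int i = 1; i <= 5; i++) {
        g_currentItem = i;
        SwitchToFiber(g_consumerFiber);   // "yield" the item to the consumer
    }
    g_done = TRUE;
    SwitchToFiber(g_consumerFiber);       // and never come back
}

int main(void)
{
    g_consumerFiber = ConvertThreadToFiber(NULL);
    g_producerFiber = CreateFiber(0, ProducerFiberProc, NULL);

    for (;;) {
        SwitchToFiber(g_producerFiber);   // resume the producer until it yields
        if (g_done) break;
        printf("got item %d\n", g_currentItem);
    }

    DeleteFiber(g_producerFiber);
    ConvertFiberToThread();
    return 0;
}
```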

  7. rs says:

    Aren’t fibers actually much simpler than threads? With multiple threads you have to be prepared that anything can happen at any time, but with fibers you can explicitly control when to pass access to shared objects.

  8. dal says:

    re: “Actually, you were already in bad shape once you unscheduled the fiber while it owned a resource: An unscheduled fiber cannot release the resource. “

    The concept of ownership is not well defined here.

    My understanding is that fibers themselves do not own a resource. As you point out, those resources which have some thread affinity must be released on the same thread that acquired them, regardless of which fiber was used. For those resources that do not have affinity, presumably one can release them on an arbitrary thread or fiber.

    Fiber resource ownership would have to be implemented by the fiber code itself, not the operating system or other external code. But even then, a sufficiently clever algorithm could set things up so that any fiber could release it even if the one that acquired it was unscheduled or terminated (hey, sounds like a garbage collector fiber).

    [True, the “ownership” is unclear, but I was talking more informally, where you think of a resource being “owned” by the code that acquired it (and therefore needs to be released by some other part of that same code). -Raymond]
  9. Daniel Colascione says:

    Of course, if you ensure that all fibers run to completion, and that a given fiber runs on exactly one thread, you’re in much safer territory.

  10. Joe says:

    @Daniel:

    Doesn’t ensuring that a given fibre runs on exactly one thread give up all the advantages of using fibres?

  11. Leo Davidson says:

    @Joe:

    Not at all, if you’re using them as an abstraction mechanism.

    Fibres can still be useful on a single thread just as threads can still be useful on a single CPU.

  12. SN says:

    A few years back, while debugging an out-of-process COM server, I saw my critical sections behaving strangely: two threads were able to acquire the same critical section without either of them calling LeaveCriticalSection. Now I think the COM server was most probably using fibers to schedule those units of execution.

  13. Joe says:

    @Leo: Do you mean running all fibres on a single thread and using them to take advantage of the separate stacks, rather than taking advantage of the opportunities for concurrency?
