Ok, the light’s at the end of the tunnel. I couldn’t figure out where to slot this in, so it got stuck at the end…
Someone (I’m forgetting who) asked what happens when one thread is producing data to be handed to a pool of threads. The simplest example of this is a network packet handler handing messages to worker functions, where the worker functions run on different threads.
So the packet handler takes requests from the network, validates the message, and sticks it into a worker thread pool. The worker threads pick up the items and process them.
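Here’s a minimal sketch of that arrangement in Python, just to make the moving parts concrete. Everything in it is made up for illustration: the dict-shaped messages, the `validate` check, and the `run_pool` helper are assumptions, not part of any real protocol.

```python
import queue
import threading

def validate(message):
    # Hypothetical validation: a message is a dict that names a verb.
    return isinstance(message, dict) and "verb" in message

def worker(work_queue, results, results_lock):
    # Each worker thread pulls items off the shared queue until it sees None.
    while True:
        message = work_queue.get()
        if message is None:
            break
        with results_lock:
            results.append((threading.current_thread().name, message["verb"]))

def run_pool(incoming, worker_count=4):
    work_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(work_queue, results, results_lock),
                         name=f"worker-{i}")
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    # The "packet handler": validate each message, then hand it to the pool.
    for message in incoming:
        if validate(message):
            work_queue.put(message)
    # One shutdown sentinel per worker, then wait for them all to finish.
    for _ in threads:
        work_queue.put(None)
    for t in threads:
        t.join()
    return results
```

Note that nothing here says which worker picks up which message, or in what order the workers get to run. That’s the whole point of the pool.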
This works great until you run into protocol elements that have ordering requirements. For example, consider an abstract network filesystem protocol. The protocol has verbs like “open”, “read”, “write” and “close”. When an open request is received, the worker thread handler for the “open” verb opens a local file. Similarly, the “read” and “write” verb handlers read from and write to the file handle. For efficiency’s sake, the protocol also supports combining these messages into a single network message (protocols do exist that have these semantics, trust me :)).
In this hypothetical example, when the protocol handler receives one of these combined requests, it breaks the combined high-level message up into low-level messages, and drops them into the respective lower-level protocol handlers.
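As a rough sketch, the splitting step might look something like this. The “&” separator comes from the “Open&Read” style of naming, but the dict layout and the `split_combined` name are assumptions made up for the example.

```python
def split_combined(request):
    """Break a combined request such as "Open&Read" into low-level messages.

    Assumes (hypothetically) that a request is a dict with a "verb" and a
    "path", and that combined verbs are joined with "&".
    """
    verbs = request["verb"].split("&")
    return [{"verb": verb.lower(), "path": request["path"]} for verb in verbs]
```

Each of the resulting low-level messages is then dropped into the pool independently, which is exactly where the trouble starts.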
Now what happens when an “Open&Read” request is received? The open request is dispatched to worker thread 1, and the read request is dispatched to worker thread 2.
And the scheduler decides to schedule thread 2 before thread 1. The read handler then runs before the file has been opened, so there’s no file handle for it to read from.
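To see the failure without depending on the whims of a real scheduler, here’s a tiny single-threaded simulation that simply runs the two handlers in whichever order you ask for. The handle table and the handler names are hypothetical stand-ins.

```python
open_files = {}  # hypothetical handle table: path -> fake handle

def handle_open(path):
    open_files[path] = f"handle-for-{path}"
    return "opened"

def handle_read(path):
    # The read handler can only work if the open handler already ran.
    if path not in open_files:
        return "error: file not open"
    return "read ok"

def run_in_order(handlers, path):
    # Simulate one possible scheduling order of the two worker threads.
    open_files.clear()
    return [handler(path) for handler in handlers]
```

Calling `run_in_order([handle_open, handle_read], "a.txt")` is the lucky schedule; swapping the two handlers is the unlucky one, and the read reports that the file isn’t open.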
So when you’re dealing with this kind of situation, you need to ensure that your architecture handles it reasonably. But this can be tricky, because you don’t want to hurt scalability just because of this one corner case. There are lots of solutions. The first that came to mind is to associate a manual reset event with the file and, when the file is opened, set the event to the signaled state. Before each read or write (or close), wait on the event with an infinite timeout. But this penalizes all read and write operations after the open: they’ve got to make a system call (even if it’s a relatively inexpensive system call) for each operation after the open has succeeded.
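A sketch of the event approach, using Python’s `threading.Event` as a stand-in for a manual reset event (once set, it stays set until someone explicitly clears it). The `RemoteFile` class and its fake file contents are assumptions for illustration only.

```python
import threading

class RemoteFile:
    """Hypothetical per-file state shared by the verb handlers."""

    def __init__(self, path):
        self.path = path
        self.opened = threading.Event()  # starts out unsignaled
        self.data = None

    def handle_open(self):
        self.data = b"hello"  # stand-in for opening a real local file
        self.opened.set()     # signal: the file is now open

    def handle_read(self):
        # Wait with no timeout (i.e. forever) until the open has completed.
        self.opened.wait()
        return self.data
```

With this in place it no longer matters which handler gets scheduled first, but every read still has to go through the wait, even long after the open is done. A real version would also have to signal the event when the open fails, otherwise the waiting readers would hang forever.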
Another solution would be to add the ability to queue requests to other requests and, when a request completes, queue its chained requests to the worker thread pool. This has some real advantages (and would be the solution I’d choose) because it allows chaining of arbitrary operations. For example, if a single packet combined an open, read, and close, you could chain the close to the read, and the read to the open. Or if you had open, read, write, read, you could chain the read and write to the open, and re-queue both the read and write once the open completes.
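Here’s a minimal sketch of the chaining idea. The `Request` class, the `chain` method, and the `run_chained` helper are all hypothetical names, and “processing” a request is reduced to appending its verb to a log so the ordering is visible.

```python
import queue
import threading

class Request:
    def __init__(self, verb):
        self.verb = verb
        self.chained = []  # requests to queue once this one completes

    def chain(self, other):
        self.chained.append(other)
        return other  # returned so chains can be built link by link

def chain_worker(work_queue, log, log_lock):
    while True:
        request = work_queue.get()
        if request is None:  # shutdown sentinel
            break
        with log_lock:
            log.append(request.verb)  # stand-in for the real verb handler
        # Only now that this request is complete do its followers get queued.
        for follower in request.chained:
            work_queue.put(follower)
        work_queue.task_done()

def run_chained(roots, worker_count=4):
    work_queue = queue.Queue()
    log, log_lock = [], threading.Lock()
    threads = [
        threading.Thread(target=chain_worker, args=(work_queue, log, log_lock))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for root in roots:
        work_queue.put(root)
    # Followers are queued before their parent calls task_done(), so join()
    # can't return until every chained request has also been processed.
    work_queue.join()
    for _ in threads:
        work_queue.put(None)
    for t in threads:
        t.join()
    return log
```

For an open, read, close packet you’d build the chain like this and queue only the open:

```python
open_request = Request("open")
read_request = open_request.chain(Request("read"))
read_request.chain(Request("close"))
log = run_chained([open_request])
```

Requests that aren’t chained to anything never pay for this, and requests chained to the same parent are still free to run in parallel once the parent completes.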
Tomorrow, I want to talk about debugging concurrency issues, then it’ll be wrap-up time 🙂