In my articleI discuss the fundamental principles of using locks correctly. In that article I strongly encourage the use of reader-writer locks because these locks create the protection you need (insuring that readers and writers don’t conflict), while potentially allowing significantly more concurrency to take place (by allowing multiple readers to enter the lock simultaneously). I did not have space in that article to discuss reader-writer locks in detail however. I would like to start correcting that here.
The .NET Runtime implementation of a reader-writer lock is System.Threading.ReaderWriterLock. It is really a pretty simple API at its heart (ignore stuff about LockCookies, it is there to support nesting of locks, which you should probably avoid anyway). Here is a sample use
MyReaderWriterLock myLock = new MyReaderWriterLock();
// No writers can be in here (AquireWriterLock will block)
// No readers or other writers can be in here (Aquire*Lock blocks)
It is pretty straightforward to convert a program from using ordinary locks into using reader-writer locks (for those blocks that only read use AcquireReaderLock, otherwise use AquireWriterLock).
This is all very nice. Unfortunately, there is a problem. The current .NET Runtime implementation of ReaderWriterLock is about 8 times slower than System.Monitor (normal locks), in the common case where there is no lock contention. This is quite unfortunate, as it acts as a disincentive for doing the right thing (and using reader-writer locks). Microsoft is likely to fix this in the next release, however what do you do in the mean time?
Implementing a Reader-Writer Lock
The obvious answer is to implement an efficient ‘drop in’ version of ReaderWriterLock that you can use until Microsoft provides an better one. This is what this article is all about. Sadly, if our goal is to be as efficient as System.Montor, we can’t use the standard locking protocol to implement a Reader-Writer Lock since every operation would have to lock and unlock, making it at least twice as slow as System.Monitor. Thus we are forced to look into ‘Low Lock Techniques’, for an answer.
In my article Understand the Impact of Low-Lock Techniques in Multithreaded Apps I list the techniques that if used very carefully, can sometimes be applied to do very efficient multi-threaded synchronization. I also mention that these techniques should only be used when the use of standard locks are problematic (usually because of performance issues). Reader-Writer locks fall into this problematic cataegory.
Possible Low lock techniques to use
In the article the techniques are listed in order of decreasing utility, and are
- Avoid locks on reads
- Spin Locks
- Raw Interlocked operations
Because the important APIs for a reader-writer lock always do an update, technique 1 is not applicable. Thus our best bet is Spin locks (note the later two techniques are not NEARLY as useful as the first two).
Implementing a spin locks
It turns out that spin locks have some very nice properties. Namely they ARE locks, so all the nice properties of locks apply (once you take the lock you have mutual exclusion and reasoning about the program gets MUCH easier). They are also efficient (one interlocked operation per Enter-Exit pair). Thus we should be able to implement a Reader-writer lock with only one interlocked operation per method call, which is the same as System.Monitor (which also has one interlocked operation per method call).
Please refer to the code attached to this post readerWriterDemo.cs. The first class is the spin lock. It has an ‘Enter’ and ‘Exit’ API. For more information on spin locks see my low-lock article or Jeff Richter’s article Concurrent Affairs. The implementation in readerWriterDemo.cs deals with all the subtle details you need to worry about. It might need some tuning on how long it spins, but otherwise it is a very respectable implementation and you should feel free to use it if you need a spin lock yourself. The only caveat I curretly have is that I have commented out calls to Thread.BeginCriticalRegion and Thread.EndCriticalRegion because they are relatively expensive (they make my perf numbers look bad), and is only needed in environments like SQL Server where threads may be aborted at any instant. If a thread is aborted while the spin lock is held, the SQL Server host wants to know so that it can kill the entire appdomain instead of just the thread. If you are not in SQL Server you can get away without this.
Before using the spin lock, we need to confirm that our read-writer lock is a good candidate for its use. To be a good candidate is must be the case that the time the lock is held is very short (measured in at most dozens of instructions). We believe that this is the case since all of our API for the reader-writer lock do little and are perf critical, so the code paths NEED to be short. However after we get our implementation we need to confirm that we don’t have rare paths that might hold the lock for a long time. It turns out with a bit of work we can make certain this is the case.
Implementing a reader-writer lock with the spin lock
With a spin lock in hand, writing the reader-writer lock is not too much code (~ 150 lines, with comments), and hopefully is easy to understand. Every API takes a lock, does work, and leaves. To deal with the case where you must wait, we need OS structures to wait on (eg ManualResetEvent, AutoResetEvent). When a ManualResetEvent is ‘Set’ all threads waiting will wake up. This is appropriate for readers. For writers, it makes more sense to only wake up the first thread, and this is what AutoResetEvent does. The code has a lot of comments, so please take a look. There are actually a lot of subtleties to think about (what happens in various error conditions, what APIs can fail, how efficient the lock is in various scenarios, how long we are holding the spin lock …). I point out some of these subtleties in comments, however the really important point is that the code itself is short and straightforward. This is CRITICAL because the more code you have, the more difficult it is to confirm the code works in all the subtle cases. It pays to keep things simple.
Performance Results of the reader-writer lock
If you run the program, it will measure the performance of an Enter/Exit pair. Here is what I got.
10Meg Enter/Exit 371.6953 msec = 59.61 CPU clocks per Enter/Exit
10Meg RW Locks (.NET library) 3175.8594 msec = 8.5X monitor time
10Meg New RW Locks 428.8234 msec = 1.2X monitor time
As you can see, on my machine, the .NET library is 8.5X slower than System.Monitor, but the new RW locks are only 20% slower. With some more tuning I suspect I can get even closer, however it will make the code more complicated, so I don’t want to do that in this published code (cleanliness is worth something).
As written, the lock has the property that if a writer is waiting for the locks, new readers can’t come in and block it indefinitely. Instead the readers wait until all the other readers drain, and the writer gets its turn. The lock is unfair in the sense that when a lock is released it is not guaranteed that the first waiter gets the lock. Thus starvation is possible (however unlikely). It turns out that fairness causes performance issues (I will discuss in a later blog), so the current unfair protocol has better performance characteristics in almost all cases. It is not too hard to change the implementation to be fair, and I might take that up in a later blog (please request it if you are interested).
The lock as written is also not reentrant. It is relatively easy and efficient to add this (although it is not clear this feature is a good idea), and I show how to do that too (please request it if you are interested).
The code is really worth looking at carefully and I encourage you to do so. It is small enough that you can understand the intent quickly, but there are many scenarios to think about.
What are the take-aways?
- Use reader-writer locks as suggested in my first article. If lock perf is a problem, take the implementation here as a ‘drop in’ replacement for the .NET System.ReaderWriterLock. Feel free to use this implementation until Microsoft fixes the version in its library.
- Spin-locks are a really valuable low lock technique. There were used here to make a clean, simple but efficient reader-writer lock written competely in managed code. This lock is significantlysimpler than most implementations I have seen, yet is clost to optimal. The reason for this is that the use of spin locks GREATLY simplifies the design and analysis of the lock, without sacrificing performance. Feel free to use the Spin lock implementation show here to make other high performance concurrent constructs.
- Even something as simple as a Reader-Writer lock has subtleties about its behavior (especially when you factor in performance concerns). Keeping it simple really pays off here.