Locks are bad for your application's health

For the next couple of posts, I will turn this blog into a message-passing soap-box. If you read my earlier posts, you know that I think the current mainstream methodologies for reasoning about concurrent applications have some issues, to be diplomatic. I will elaborate a bit on that and suggest that other models have fewer such issues, and that ultimately those issues can be addressed with proper tooling.

In Ed Lee's lecture, he pointed to the fact that the current model is one of total non-determinism, one where the state of the universe may change from one moment to the next. Programmers are then asked to rein it in by using mutual exclusion.

In C#, that looks something like this:

lock (obj) { ... critical region ... }

Now, the beauty here is that within that region of code, only one all parallel threads that have agreed to rely on that same object for mutual exclusion can run at once. Not just for this region, but any region of code that uses that same object for protection.

Of course, it's not really code we need to protect, it is data, and the data that is protected by that lock is in practice unknowable, since it is the set of data reached by transitive closure through all operations within the region. Even if you through some miracle of disciplined software (and perhaps some amount of reverse) engineering happen to know what's protected by the region, next year someone will ship a new version of that library you depend on and you will no longer know. Locks is an example of a really simple programming model concept, having minimal syntactic impact on a language, but which turns out to be very hard to use correctly.

Do you know where your pointers have been?

Then again, maybe it doesn't matter, since you may only care about knowing that some data that you depend on is protected or not, so all that transitive closure stuff is irrelevant. We still have the problem of not knowing that the data you care about is always protected -- in fact, the data that is actually protected by the lock is the intersection of all data sets that are accessed only within regions proteced by that same lock. Can you catalogue all accesses to all data in your head (because you will need to) or do you have some tool that will do it for you? I don't, but if you do, you should consider yourself lucky.

The ad hoc nature of the conventions used by programmers to associate locks with the data sets they are intended to protect also make it much harder to build development tools that efficiently and reliably help programmers overcome the problems of understanding their protection scheme and if / when it is violated. There is no single canonical scheme that a tool might try to enforce. This may seem like a second-order problem, but much of the complaints I hear about tool support for concurrency can be reduced to the ad hoc nature of the threads + locks programming model.

Some might say that Java's 'synchronized' methods do associate locks with the data they protect, but that does nothing to solve this problem -- if you need a demonstration, take a look at Ed Lee's lecture (there's so much good stuff in there, I'll keep coming back to it...)

a + b = ??

Even if, by being very disciplined and not letting anyone extend your software without being equally disciplined, you do figure out your own scheme under which a set of data is associated with a particular lock and always accessed under its protection, you still have to deal with the problem that locks don't compose.

What I mean by this, is that if you have a data set A and a data set B with associated locks and all operations on each set is protected by those locks (in other words, you are a very careful programmer), then any operation that involves both A and B need to use a composed lock, somehow. How do you do that? Programmatically, by taking first A then B (and if you are careful and know about deadlocks, you know to always take them in the same order). All this is very ad hoc and usually gets even extremely talented and programmers in trouble sooner or later. One solution is to have a single lock for all data in the entire application, which is fairly safe but also fairly bad for your program, as it typically won't scale. If we think of the amount of data protected by a lock as the locks 'grain size,' coarse-grained locks have scalability issues, while fine-grained locks have usability problems (no composition).

Both schemes fall flat when libraries are used, since you don't know about all the locks that a library may be taking, and the library doesn't know about your locks.

Transactions

One solution to the composability problem of locks is to take another approach to mutual exclusion: transactional memory. As I see things, it is the most serious competitor for 'best thing since sliced bread' status that I've seen in a long time. TM is still experimental, but if / when it can be made to work and perform, life will indeed be so much simpler for everyone. I'm not a TM expert, and it will still be some time before the technology is ready for prime time, but it's very exciting.

With transactional memory, the approach is to protect an operation from external side-effects, regardless of what data is involved in the operation -- the runtime magically takes care of it. This is a slightly different way of thinking about mutual exclusion, but it won't be a topic in this blog for some time.

It hurts when I do this...

Locks are useful for many things, but you have to be willing to live with the pain. As part of an application-level programming model, they are not the best, but don't think I don't realize that they are necessary and good for system-level programming. For example, unless the underlying hardware supports for message-passing (such as the Transputer did), you pretty much need to implement the MP infrastructure using locks.

If locks are problematic, what are we to do? As the doctor told the patient who said "it hurts when I do this..." -- "well, don't do that, then!" Instead of relying on mutual exclusion, what if there's another pattern that can address the need for protection while avoiding some of the pitfalls of locks? One which either introduces fewer new problems than it addresses, or introduces problems that are easier to reason about (and tool)?

So far, I've only been bitching about the problems locks impose on us, which is obviously the easy part. In a later post, I will outline why I think a pattern fitting the description above exists, one that is based on message-passing between isolated components.