Optimistic and Pessimistic Concurrency - A Simple Explanation

I remember that this was a tricky thing for me to understand and remember when I first got started with databases, so I thought I might help someone out by explaining these in a simple way.

Optimistic and pessimistic locking (or concurrency control) are ways of addressing a problem such as the following:

  1. User A reads the row for customer #123
  2. User B reads the row for customer #123
  3. User B updates the row for customer #123
  4. User A updates the row for customer #123 and overwrites User B's changes

The problem here is that one user has changes that conflict with another user's, and unless we do something about it, we'll lose User B's changes without even noticing.

The pessimistic concurrency control approach is to have the database server lock the row on User A's behalf, so User B has to wait until User A has finished its work before proceeded. We effectively solve the problem by not allowing two users to work on the same piece of data at the same time. It just prevents the conflict.

The optimistic concurrency control approach doesn't actually lock anything - instead, it asks User A to remember what the row looked like when he first saw it, and when it's time to update it, the user asks the database to go ahead only if the row still looks like he remembers it. It doesn't prevent a possible conflict, but it can detect it before any damage is done and fail safely.

You can remember which is which by picturing it this way:

  • The pessimistic approach is a user thinking "I'm sure someone will try to update this row at the same time as I will, so I better ask the database server to not let anyone touch it until I'm done with it."
  • The optimistic approach is a user thinking "I'm sure it's all good and there will be no conflicts, so I'm just going to remember this row to double-check later, but I'll let the database server do whatever it wants with the row."

There are other differences in how errors are handled, how smart the clients need to be, what the performance is on client and server for each approach, etc., but hopefully this clearly explains what each approach is all about.

Enjoy!