Concurrency Series: Basics of Transaction Isolation Levels

Author: Sunil Agarwal
Reviewers: Kun Cheng, Chuck Heinzelman, Shaun Tinline-Jones, Sanjay Mishra, Kevin Liu

In this series of blogs, I want to show some common concurrency questions that we get asked by customers. Before we look into common concurrency issues seen by customers, it will be good to review the transaction isolation levels provided in SQL Server.

Isolation Levels: SQL Server supports all four isolation levels as defined in ANSI SQL standard.  The ANSI standard does not dictate how these need to implemented but SQL Server has implemented them using locks with the exception of RCSI/SI that I will discuss later. These isolation levels are:

·         Read Uncommitted (aka dirty read): A transaction T1 executing under this isolation level can access data changed by concurrent transaction(s). For example, if a concurrent transaction T2 updates a row R1, it can still be read under T1 even though T2 can potentially roll back later.  

o   Pros:  No read locks needed to read data (i.e. no reader/writer blocking). Note, T1 still takes transaction duration locks for any data modified.

o   Cons:  Data is not guaranteed to be transactionally consistent.

o   Usage: It is typically used in queries/applications where data inconsistency can be tolerated. For example, computing average salary of employees.

  

·         Read Committed: A transaction T1 executing under this isolation level can only access committed data. For example, if a concurrent transaction T2 updates a row R1, it cannot be accessed under T1, in fact T1 will get blocked until T2 either commits or rolls back. The S (i.e. Share) lock is held when accessing the data and is released once the access is complete.  So for example, if T1 reads R1, R2, and R3 in a SQL statement, T1 acquires/releases S lock on R1 and then acquires/releases lock on R2. In some cases, the ‘S’ lock is not released until the end of the statement but for now we will not discuss that.

o   Pros:  Good compromise between concurrency and consistency.

o   Cons: Locking and blocking. The data can change when accessed multiple times within the same transaction.

o   Usage: Very commonly used isolation level. In fact, this is the default isolation level in SQL Server. 

 

·         Repeatable Read: A transaction T1 executing under this isolation level can only access committed data with an additional guarantee that any data read cannot change (i.e. it is repeatable)  for the duration of the transaction. SQL Server achieves it by holding an S lock for the duration of the transaction. However, it does not protect phantoms, new data rows that qualify the query predicate can appear.

 

Example: Assume there is a table ACCOUNTS that stores customer’s account information along with the money balance.  

Ø  Start a transaction (T1) under repeatable read isolation level. Query  the ACCOUNTS table with predicate (account_balance > 1000). Let us say it returns 10 rows

Ø  Another transaction (T2) inserts a new row in the ACCOUNTS table with account_balance = 1020 and commits.

Ø  Query the ACCOUNTS table again with the same predicate (account_balance > 1000) and it will return 11 rows. The reason is that transaction T1 only locked the 10 qualifying rows but did not lock the predicate range. With the result, the transaction T2 could insert a new row in the same predicate range.

 

o   Pros:  Higher data consistency.

o   Cons: Locking and blocking.  The S locks are held for the duration of the transaction that can lower the concurrency. It does not protect against phantom rows.

o   Usage: Not very common.

 

·         Serializable: A transaction T1 executing under this isolation level provides the highest data consistency including elimination of phantoms but at the cost of reduced concurrency.  It prevents phantoms by taking a range lock or table level lock if range lock can’t be acquired (i.e. no index on the predicate column) for the duration of the transaction.

o   Pros:  Full data consistency including phantom protection. Serializable isolation level guarantees transactions will end up with one possible serial order with an appearance that concurrent transactions did not interfere with each other. For example, if T1, T2, T3 are running under serializable isolation level, the possible serial orders are (T1, T2, T3), (T1, T3, T2), (T2, T1, T3), (T2, T3, T1), (T3, T1, T2), (T3, T2, T1).

o   Cons: Locking and blocking.  The S locks are held for the duration of the transaction that can lower the concurrency.

o   Usage: It is used in cases where data consistency is an absolute requirement.

 

In addition to these isolation level, SQL Server, starting with SQL Server 2005 release, added two changes as follows

·         Read-Committed-Snapshot (RCSI): This is not a new isolation level but a new implementation of read committed isolation level that does not take any S lock on the data. The word snapshot stems from the fact that query under RCSI sees the snapshot of the database as of the beginning of the statement.  It is a better alternative for applications that must access only committed data but without taking locks. SQL Server implements it using row versioning that has some overhead. You can map blocking implementation of read committed isolation level to RCSI by enabling READ_COMMITTED_SNAPSHOT option at the database level without forcing any changes to applications.

o   Pros:  Non-blocking access to transactionally consistent data.

o   Cons: Some overhead of maintaining row versions.

o   Usage: To minimize reader/writer blocking and to replace ‘read uncommitted’ or NOLOCK access to data with transactional consistent data

·         Snapshot Isolation (SI): This is new proprietary isolation level that provides non-blocking access for read operations. The transactions under SI see the snapshot of the database as of the beginning of the transaction.  There is no automatic mapping of transaction isolation levels to SI so you must change your application to access data under SI isolation level.

o   Pros:  Non-blocking access to transactionally consistent data. Higher consistency than repeatable read.

o   Cons: Some overhead of maintaining row versions.

o   Usage:  To get non-blocking access to consistent data across multiple statements within a transaction.

As an aside, you can explicitly set isolation levels by executing ‘set transaction isolation level <isolation>’ or you can override it at an object level in a statement using locking hints.

Thanks

Sunil Agarwal