Hot It Works: SQL Server SuperLatch'ing / Sub-latches

SQL Server Books Online uses the terms super latches and sub-latch to describe them.   For example the SQL Server:Latches performance counter group calls them super latches.   The DVM that exposes the super latches use the term sub-latch, sys.dm_os_sublatches.  They are the same internal structure simply exposed under separate terms.

A latch is a user mode, reader/writer lock structure used by the SQL Sever to protect its internal structures.   An example of a latch used by SQL Server is the trace latch.  The trace latch structure is created and associated with the trace controller class.   Whenever you start or stop a trace the latch must be acquired in EXclusive mode to prevent other threads from accessing the controller data members while the physical state change takes place, providing single threaded, physical access to critical data members.    

A more visible latch is created with each BUF structure.  Generally speaking, the BUF is responsible for the status, hash state and protection of a data page in memory.   The latch protects the physical integrity of the data while a lock protects the transactional integrity.    When a modification to a data page takes place the lock is first acquired.   Then for the short period of the physical modification the latch is acquired, usually in EX mode for a modification, to prevent other threads from modifying the page.   As soon as the physical modification is complete the latch is released.   The page has been modified and can be accessed by other threads while the lock protects the transactional integrity goals of the data.

SQL Server will monitor usage patterns for certain latches.  Among them are the BUF latches.  When a latch achieves a state of very heavy, read-only (SHared) access patterns it can be promoted to a sub-latch.

The following diagram represents a normal latch.  Each latch structure contains state of the latch (held mode), list of waiters and other state information.   Each time a latch is acquired this information is protected using interlocked access patterns.  (Refer to the API InterlockedCompareExchange for more details.)  This means there are a few members of the latch that can become hot access paths and result in reduced performance.

image

To increase performance you always want to remove global state as much as possible.   For a latch the global state is reduced during a latch promotion where by the latch is split into a latch per scheduler (partitioned) and the main latch becomes a proxy redirector as shown in the diagram.  The design is such that SQL Server can dynamically identify a hot spot and remove it by promotion to the sub-latch.

image

For a read-only latch this is a very powerful action.   Instead of maintaining global state where multiple readers are acquiring and releasing the latch in a single location this responsibility remains on the local scheduler partition.   To acquire a shared latch on a promoted latch the worker only needs to acquire a SHared latch on the locally, partitioned latch it aligns with.   The state only needs to be maintained on the local partition, involves local memory and reduces the hot spot.

Conversely a sub-latch is not effective for EXclusive latching and EXclusive latch patterns will quickly cause the sub-latch to be demoted.   When a latch is in sub-latch state (promoted) and an EXclusive (one writer) request arrives the EXclusive latch state must be acquired on EVERY sub-latch.   In the diagram this would require (4) EXclusive state acquires across each partition to prevent reader access.   The release involves the reverse pattern on each of the sub-latch.

Sub/Super latches can be very effective and SQL Server automatically promotes and demotes a latch as needed to provide increased performance. 

Bob Dorr
SQL Server Principal Escalation Engineers