Performing transaction log backups using AlwaysOn Availability Group read-only secondary replicas-Part2

Describe the log backup process on a ROSR

This is the second post in a planned series of articles relating to SQL Server transaction log backups of databases using a read-only secondary replica (ROSR).  Since the introduction of ROSRs in SQL Server 2012, administrators have been able to perform full database (copy only) backups as well as transaction log backups on any of the secondary replicas within the same Availability Group.  This allows the administrator to reduce or eliminate resource contention between production activity and backups. This article describes the steps and communications between the primary and secondary that take place when the secondary performs a backup.

What is the process for taking a backup on an ROSR?

The following is a high-level description showing the sequence of events that must take place on both the secondary and the primary to complete a successful backup. The valid types of backups that can be taken on an ROSR are: FULL and LOG. For FULL backups, the following are supported when using the COPY_ONLY option: DATABASE, FILE, or FILEGROUP. Differential backups are not supported against secondary replicas.

  1. The secondary first checks to see that the BACKUP command is not inside another transaction.
  2. The secondary then acquires a "backup" lock on the database so other sessions on the secondary cannot do a backup at the same time.
  3. The secondary sends a HADR message (HadrMsgTypeBackupSyncMsg) to the primary for it to request a "backup" lock.
  4. The primary acquires a "backup" lock if it can.
  5. The primary sends a HADR message (HadrMsgTypeBackupInfoMsg ) that includes the backup LSN of where to start (the "first_lsn" stored in msdb.dbo.backupset for this backup).
  6. The secondary performs the requested backup.
  7. The secondary sends a HADR message (HadrMsgTypeBackupInfoMsg ) back to the primary signifying backup completion and notifying the primary of the "last_lsn" backed up (the "last_lsn" stored in msdb.dbo.backupset for this backup).
  8. The primary issues a checkpoint.
  9. The primary releases the "backup" lock.
  10. The primary sends a HADR message (HadrMsgTypeBackupSyncMsg) which will signal the secondary the primary has checkpointed and released its lock.
  11. The secondary releases its "backup" lock.
  12. The secondary updates its MSDB database of the backup.

The next screen shot shows an Excel spreadsheet of the Xevents from both servers listed in order and labelled with the same step numbers as above so you can see in one comprehensive view the sequence of events.

Color Key

Description

Blue Signifies the event was captured on the Primary.
Orange Signifies the event was captured on the Secondary.
Yellow Messages sent from Secondary à Primary (along with the corresponding "recv" on the primary.
Light pink Messages sent from Primary à Secondary (along with the corresponding "recv" on the secondary.
Light blue Locks and checkpoints during the process.
Green TSQL Statements issued: "backup" & "insert"(into the msdb backupset table).

What do the XEvents look like for tracking this process?

The next screen shot is from the Extended Events viewer in SSMS for the Secondary. It shows most of the XEvents captured during the conversation. Some were omitted because they are not covered in this article.

And here we have a screen shot of the Xevents from the Primary's perspective. (Again a few have been filtered out for clarity of this article.)

Here let's focus on a subset of the Xevents that show the request from the secondary to the primary that essentially says – "I want to do a backup. Tell me if it's okay and where to start."

  • You can see the secondary "Send" the HadrMsgTypeBackupSyncMsg (step 3 from above) on the left. Correspondingly you can see the "Recv" of that same message in the primary's Xevents on the right.
  • Then in the blue square boxes you can see the primary on the right "Send" the backup LSN "00000028:00000090:0001", and the secondary "Recv" it on the left (Step 5 from above).
  • Finally, after the secondary completes the backup, in the black ovals we see the secondary "Send" the backup LSN "00000028:00000120:0001" to the primary, and the primary "Recv" it. This is Step 7 above.

Okay, I see the backup LSN go back and forth, how can I relate that to my backups and the LSNs I see in MSDB tables? They don't look the same.

In the Xevent screen shots above we saw two backup LSNs sent back and forth. The first was from the primary to the secondary: "00000028:00000090:0001". The second was from the secondary to the primary indicating where it finished: "00000028:00000120:0001". These two LSNs are the same exact LSNs as can be seen in MSDB.dbo.BackupSet for this backup – except the LSN in the Xevent is HEX based and the one in BackupSet is Decimal based.

This is where part 1 of this blog series comes in handy. If we take the various components of the HEX based LSN and convert them to decimal, we'll see they are the same. For example:

To convert 00000028:00000090:0001 to decimal, take each section and convert to decimal.

HEX (00000028) = 40 in decimal

HEX (00000090) = 144 in decimal – then pad left with 0s until 10 digits wide

HEX (0001) = 1 in decimal – then pad left with 0s until 5 digits wide

Concatenating these together we get one version of the decimal format:

40000000014400001

    Sometimes you will also see the decimal LSN in the following format – still with colon separators:

40:0000000144:00001

In Summary

This article has attempted to demonstrate the steps and communications that take place between the secondary and primary replicas when performing a log backup from the secondary replica.

Coming Up: Part 3: Various transaction log backup scenarios

The next post in this series will cover several scenarios when attempting to do log backups from secondary replicas: multiple secondary attempts to back up the transaction log, backing up the transaction log from a replica that is behind, attempting to back up the same LSNs that have already been backed up elsewhere – and others.