Point-in-time restore and knowing where to stop

 

In some disaster recovery scenarios, a database needs to be restored to a point in time before a particular operation has occurred, e.g. before a user mistakenly dropped a table, ran a DELETE or UPDATE with no WHERE clause, etc. If the database is in the full recovery mode and the log backup chain is not broken, then this is can be achieved by using the RESTORE LOG command with the STOPAT clause. The STOPAT clause is used to specify a point in time when the recovery should stop during log restore. In this case, this would be the time immediately before the transaction containing the erroneous DROP TABLE command is committed. The problem is, frequently you do not know the specific point in time when the user made the mistake. On one hand, to minimize data loss, you need to stop the recovery immediately before the transaction commit. On the other hand, if you overshoot that point in time when restoring the log, you need to start the restore sequence from scratch, restoring the latest full backup, differential backups (if any), and a set of transaction log backups. This may have to be repeated several times, until you arrive at the exact STOPAT time by trial and error. Needless to say, this approach may not be realistic, particularly for large databases.

Luckily, in many cases there is a way to determine the precise point where the restore should stop. This approach may be difficult to implement in some cases, particularly on systems with a lot of write activity, but when it does work, it can save you a lot of time during disaster recovery, while minimizing data loss.

I am going to assume that the transaction log containing the transaction with the user error operation has been backed up – if that is not the case, you will need to perform the tail-log backup before starting with the steps below.

The approach is to use the undocumented fn_dump_dblog() table-valued function, which provides a rowset over the contents of a transaction log backup file (I described the function in an earlier blog post). Each row in the function’s result set represents a log record in the log backup file specified as the function argument. The first column in the result set is named Current LSN, and represents the Log Sequence Number of each log record in the backup file. The RESTORE LOG command has the STOPBEFOREMARK clause, which similarly to the STOPAT clause allows the recovery to be stopped immediately before a particular LSN (in SQL Server 2005 and later). Therefore, if we can determine the LSN associated with the commit of the transaction that included the erroneous command, we can specify that value in the STOPBEFOREMARK clause, and thus restore the database to a point in time immediately before the disaster occurred. The difficulty here is that it may not be easy or even possible to find the erroneous command in the large result set returned by the fn_dump_dblog() function, particularly if the database had a lot of write activity when the erroneous command was run.

The sample script below demonstrates the approach.

-- Create test database, switch it to full recovery model

CREATE DATABASE PointInTimeRestore;
GO

USE PointInTimeRestore;
GO

ALTER DATABASE PointInTimeRestore SET RECOVERY FULL;
GO

— Create the initial full backup
BACKUP DATABASE PointInTimeRestore
TO DISK = ‘C:\MSSQLData\MSSQL10.SS2008\MSSQL\Backup\PointInTimeRestore.bak’;
GO

— Create test table, insert sample data

CREATE TABLE Table1
(
Col1 INT PRIMARY KEY
);
GO

INSERT INTO Table1 (Col1)
VALUES (1), (2), (3);
GO

— Simulate user error and drop the table
DROP TABLE Table1;
GO

— Backup the log containing the erroneous command
BACKUP LOG PointInTimeRestore
TO DISK = ‘C:\MSSQLData\MSSQL10.SS2008\MSSQL\Backup\PointInTimeRestore.trn’;
GO

— Change database context to allow restore of the test database
USE tempdb;
GO

— Restore the database, but do not run recovery to be able to restore transaction logs
RESTORE DATABASE PointInTimeRestore
FROM DISK = ‘C:\MSSQLData\MSSQL10.SS2008\MSSQL\Backup\PointInTimeRestore.bak’
WITH REPLACE, NORECOVERY;
GO

— Examine log backup contents to find the LSN corresponding to the DROP TABLE transaction
SELECT *
FROM fn_dump_dblog
(
DEFAULT, DEFAULT, DEFAULT, DEFAULT,
‘C:\MSSQLData\MSSQL10.SS2008\MSSQL\Backup\PointInTimeRestore.trn’,
DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT
);
GO

/*
In this case, the log record corresponding to the commit of the transaction
that dropped the Table1 table can be easily found in the result set – the
Operation column for the row is LOP_COMMIT_XACT, and the value in the
Transaction ID column corresponds to the value in the Transaction ID column for
an earlier row with Operation LOP_BEGIN_XACT and Transaction Name DROPOBJ.
We also happen to know that this is the only table that was dropped at the time
in the End Time column.
The LSN in the Current LSN column is 0000002a:00000074:0036.
*/

/*
Restore log to the point in time just before the table was dropped.
Note that we need to prepend 0x to the LSN value –
will get the “The named mark does not identify a valid LSN” an error otherwise.
*/
RESTORE LOG PointInTimeRestore
FROM DISK = ‘C:\MSSQLData\MSSQL10.SS2008\MSSQL\Backup\PointInTimeRestore.trn’
WITH STOPBEFOREMARK = ‘lsn:0x0000002a:00000074:0036’;
GO

— Confirm that the table is again present in the database
USE PointInTimeRestore;
GO

SELECT *
FROM Table1;

© 2019 Microsoft. All rights reserved.

Reading database transaction log with fn_dump_dblog()

While the format of SQL Server transaction log is not publicly documented, there is a number of ways to view the contents of the log. This is sometimes necessary for troubleshooting and forensic purposes. One way is the DBCC LOG command. Another is fn_dblog() table-valued function. Both are undocumented, however you can easily find unofficial documentation on various web sites and blogs.

If you use Intellisense in SSMS 2008, you may notice another function, called fn_dump_dblog():

This function provides the same rowset as fn_dblog(), but has some interesting functionality that makes it useful is some troubleshooting and recovery scenarios. Specifically, it can read not only transaction log of the current database, but also transaction log backups on either disk or tape.

The first two parameters, @start and @end, can be used to filter the output by a range of LSN values (you need to make sure that the LSN values start with ‘0x’). This is similar to fn_dblog(). If you use default values for all other parameters, the fn_dump_dblog() function behaves just like fn_dblog(), returning a rowset over the log of the current database.

The third parameter, @devtype, is where the differences between fn_dump_dblog() and fn_dblog() start. This parameter determines the type of backup device. You can specify either ‘DISK’ or ‘TAPE’. ‘DISK’ is the default.

The fourth parameter, @seqnum, is an integer that can be used to specify a particular backup within a backup device, when the same device has been used for multiple backups. Most backup devices contain only one backup, so usually this will be 1.

The fifth parameter is the fully-qualified path to the backup file on disk, when backup device type is ‘DISK’. Note that Intellisense displays this parameter simply as @fname1. Note that the SQL Server service account will need read permission on the backup file.

The sixth parameter, displayed as @fname2, can be used to provide the name of a backup file in the default backup directory for the SQL Server instance.

All other parameters appear to be unused (please post a comment if you find otherwise). Update 2012-05-18: Paul Randal just blogged about the fn_dump_dblog() function – the rest of the parameters are used to specify multiple media families in a media set used for a log backup, i.e. a log backup striped over multiple backup files.

The fn_dump_dblog() function exists in SQL Server 2005/2008. Do note that the function is undocumented – use it at your own risk.

Update 2012-02-08: I just found out that the function can also work against database backups, not just log backups. A database backup contains a portion of the log that will be rolled forward on restore to make the restored database transactionally consistent, and that portion of the log can be viewed with the fn_dump_dblog() function. This is potentially useful to discover the LSN and the timestamp of the checkpoint that occurs during the backup – look for LOP_END_CKPT in the Operation column.

© 2019 Microsoft. All rights reserved.