Database Corruption Part 5 :: Dealing with Log File Corruption

Hi Friends,

I am back again after a long time. Last month was the end of yet another financial year, and all of us were very busy with the processes involved with each year-end. Now that all those are completed, I could find some time for my next blog post on database corruption.

Here, I will be targeting issues related to corruption in the database log files. As we all know, the Transaction Log File(s) of a database (*.ldf files) are essential for maintaining transactional consistency in the databases.

In my earlier blog post on Shrinking the Transaction Log files in SQL Server, I have already outlined the usage of the Transaction Log File(s) during normal operation of the database. Again, when a database starts up, it has to go through the recovery process. The recovery process consists of 3 distinct phases – analysis, redo and undo.

During the analysis phase, a background thread reads the Transaction Log File(s) of the database, from the last marked CheckPoint till the end of the file, and analyzes the records in the Transaction Log File(s). If this phase completes successfully, the recovery process moves to the next phase called the redo phase, and later into the undo phase.

A failure in any of these phases can cause the recovery process to fail; there by causing the database to be marked suspect (a database is marked suspect when the database cannot be recovered). More often than not, in case of Log File corruption, the recovery process fails during the analysis phase. Any failure during recovery is recorded in the SQL Server ErrorLog file, and the information in this file can give deep in site into the problem on hand.

Note: Log File corruption is just one of the reasons why recovery might fail for a database. Recovery might also fail when the mdf or the ndf files are corrupt. In these cases, recovery mostly fails during the redo/undo phases.

Next steps when a Log File is found to be corrupt:: Step 1

The first step that you should consider when you find the Log File to be corrupt, is to analyze the Windows Application and System Event Logs for any hardware issues. If you can spot any hardware issue, including any disk issue, please engage your hardware vendor immediately. If a database has become corrupt once due to a hardware issue, it will mostly likely become corrupt again if the hardware issue is not resolved.

Next steps when a Log File is found to be corrupt:: Step 2

The obvious next step is to restore the database from the last good known backup. Restore the last Full Backup, the last Differential Backup taken after the applied Full Backup, and all Transaction log Backups taken after the applied Differential Backup, in order. This will ensure that the transactional consistency of the database is maintained and the database is recovered to a known point in time.

However, if restoring the database from the last good known backup is not an option, then the only other option is to try and rebuild the Log File(s). Since there is no way in which we can repair a corrupt Transaction Log File, our best shot is to rebuild the Log File(s).

Before you rebuild the Transaction Log File, please go through the article When should you rebuild the transaction log? by the SQL Server Storage Engine Team.

To rebuild the log files, follow the following steps:

  1. Detach the database using the sp_detach_db command. You might need to alter the database and change it to emergency mode before you can detach the database. For details on how to change the database state to Emergency, please see the following MSDN article: https://msdn.microsoft.com/en-us/library/ms174269(SQL.90).aspx

    Syntax:

    ALTER DATABASE database_name
    {
    <set_database_options>
    }
    [;]

    <set_database_options>::=
    SET
    {
    { <optionspec> [ ,...n ] [ WITH <termination> ] }
    }

    <optionspec>::=
    {
    <db_state_option>
    }

    <db_state_option> ::=
    { ONLINE | OFFLINE | EMERGENCY }

  2. Now, delete or rename the corrupt Transaction Log File(s). You can, now, use the Create Database with the FOR ATTACH_REBUILD_LOG option to attach the database while rebuilding the Log File(s):

    Syntax:

    CREATE DATABASE database_name
    ON <filespec> [ ,...n ]
    FOR { ATTACH_REBUILD_LOG }
    [;]

    According to the Books OnLine (https://msdn.microsoft.com/en-us/library/ms176061(SQL.90).aspx):

    FOR ATTACH_REBUILD_LOG
    Specifies that the database is created by attaching an existing set of operating system files. This option is limited to read/write databases. If one or more transaction log files are missing, the log file is rebuilt. There must be a <filespec> entry specifying the primary file.

    Note:
    If the log files are available, the Database Engine will use those files instead of rebuilding the log files.

    FOR ATTACH_REBUILD_LOG requires the following:
    => A clean shutdown of the database.
    => All data files (MDF and NDF) must be available.

    Important:
    This operation breaks the log backup chain. We recommend that a full database backup be performed after the operation is completed.

Next steps when a Log File is found to be corrupt:: Step 3

If Step 2 fails, there is one more option that you can give a try. 

  1. You need to alter the database and change it to emergency mode. For details on how to change the database state to Emergency, please see the following MSDN article: https://msdn.microsoft.com/en-us/library/ms174269(SQL.90).aspx

    Syntax:

    ALTER DATABASE database_name
    {
    <set_database_options>
    }
    [;]

    <set_database_options>::=
    SET
    {
    { <optionspec> [ ,...n ] [ WITH <termination> ] }
    }

    <optionspec>::=
    {
    <db_state_option>
    }

    <db_state_option> ::=
    { ONLINE | OFFLINE | EMERGENCY }

  2. Now, execute DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS.

    Syntax:

     DBCC CHECKDB 
    [
        [  (  database_name | database_id |  0
            [  ,  NOINDEX 
            |  ,  { REPAIR_ALLOW_DATA_LOSS | REPAIR_FAST | REPAIR_REBUILD } ]
             )  ]
        [ WITH 
            {
                [ ALL_ERRORMSGS ]
                [  ,  EXTENDED_LOGICAL_CHECKS ] 
                [  ,  NO_INFOMSGS ]
                [  ,  TABLOCK ]
                [  ,  ESTIMATEONLY ]
                [  ,  { PHYSICAL_ONLY | DATA_PURITY } ]
            }
        ]
    ]
    

When a database has been set to emergency mode, DBCC CHECKDB can perform some special repairs on the database if the REPAIR_ALLOW_DATA_LOSS option is specified. This is a fully documented and supported way of repairing your transaction log. For details, please refer to the following:

From Books Online: DBCC CHECKDB (Transact-SQL)

From the SQL Server Storage Engine Blog Link: Ta da! Emergency mode repair

Disclaimer: All information provided here is my personal opinion and is neither verified nor approved by Microsoft before it is published. All information, and code samples, if any, is provided "AS IS" with no warranties and confers no rights.