There are already few blogs and a KB to talk about this problem so I’m trying to tie them together to make sure you get this quickly
When you are restoring a database (in my case a database having 32 files), post 90% complete, all but one threads in sysprocesses went into SUSPENDED state waiting on BACKUPTHREAD.
One thread was in runnable state with a wait time of close to 6 hrs. and the wait type was PREEMPTIVE_OS_WRITEFILEGATHER
A quick search got me http://support.microsoft.com/kb/2091024 which explains the problem which in our case was a 380 GB LDF file and SQL Server was busy in zeroing out the file.
So all we have to do was to wait for it to complete
But how to prevent this situation or what can be done if you are in this situation?
1. Monitor the size of your LDF and %space used within the LDF using DBCC SQLPERF(‘logspace’) to make sure you don’t have a really big LDF file with just 2% space used. You should remember that when you take a backup of this database which has say 300 GB LDF file with 10% used space, still SQL Server has to create a 300 GB LDF file during Restore and zero out 270 GB of disk space so this is going to take lot of time.
2. If you are already in this situation and if you still have access to the source database which you took backup, try shrinking the LDF and take the backup again to perform the restore. This should be a quick process rather than waiting for zeroing to complete
3. Stop any activity (if possible) to free-up the target disk’s header so that it can dedicatedly work on zeroing out the LDF file
FYI: This Instant File Initialization is not applicable to LDF which means SQL Server has to zero out the disk when LDF file is expanded to new offset. But you can prevent zeroing out in MDF file but granting “Perform Volume Maintenance Task” privilege to SQL Server service account. Reference: http://msdn.microsoft.com/en-us/library/ms175935.aspx
To see which files are being zeroed out, you can use the undocumented trace flag 3004 and the undocumented xp_readerrorlog. Undocumented means it is unsupported and may change or be removed with no notice at any time. Trace flag 3004 shows information about backups and file creations. Trace flag 3605 redirects the output to the SQL error log.
WARNING: These trace flags should be used under the guidance of Microsoft SQL Server support. They are used in this post for discussion purposes only and may not be supported in future versions.