I invested some time today investigating what SQL Server does with the initialization of TEMPDB’s log file, because a question was raised last week during Robert Davis session on TEMPDB at SQL PASS Summit in Seattle, and I wasn’t sure about what the answer would be.
The reason why people is afraid the size of their TEMPDB’s transaction log file could affect the boot of the service is because, in practice, you can do nothing against an instance of SQL Server until its TEMPDB database is started. And they have also learned that the transaction log files are zeroed out so that the log manager can find the tail of the log during recovery, even when all the conditions are there so that SQL could instant initialize them.
Now, it happens that SQL Server’s storage engine treats TEMPDB as a special case in many circumstances. And this is one of them. When TEMPDB files zeroing of its transaction log is one of those cases where TEMPDB is not treated as the rest of databases.
TEMPDB transaction log is not zeroed out completely. However, when its files are created, as with any other database, an initial checkpoint is written to the transaction log file. Those are the first records which are written to the transaction log and so what happens before those are written in the case of TEMPDB is that since we hadn’t completely zeroed the log file before this point, we zero out enough of the first virtual log file so that the log manager can find the end of log. That being 122880 bytes only, no matter how large your transaction log file is.
So the conclusion is the initial size of your TEMPDB’s transaction log file wouldn’t negatively affect the performance of your instance’s startup time.
Now, having said this I need to reveal some further information so that my blog doesn’t flood with comments describing different circumstances that demonstrate this is not always true.
When SQL calls CreateFile on any of TEMPDB’s files it first tries to open a pre-existing copy of the file which is tyipically going to be there unless this is the very first time TEMPDB is created or unless they have been removed from disk or renamed, while SQL Server was stopped, by someone/something external to SQL.
When there wasn’t an existing copy of the file on disk that was reused, or when that pre-existing copy of the is smaller than the size of the one that is about to be created, SQL decides to zero initialize the whole file, whether it’s a data or a log file.
On the contrary, when the existing copy of the file exists and is equal or larger than the one about to be created, it assumes that file had already been zero initialized by SQL Server sometime in the past, and it just goes with it the way it is.
Therefore, if you want to indeed force the situation where the way TEMPDB files are initialized ends up affecting your performance negatively, you would have to deliberately do things to screw yourself.
One of those things would consists of:
- Creating a file (see CreateFile),
- Setting its size to some number equal or larger than what SQL is going to use for the file you plan to replace with this one (see SetFilePointer and SetEndOfFile),
- Making sure you don’t write anything, anywhere in that just created file so that the file system doesn’t zero initialize it for you.
- Stop SQL Server service,
- Replace one of your TEMPDB’s data or log files whose size is equal or smaller than the one you just created, with the one you just created,
- Start SQL Server.
What would happened in this case?
Well, if it’s a data file, SQL won’t zero initialize it, but when it initializes the allocation pages (DIFFMAP, GAM, SGAM, PFS, and MLMAP) the file system will progressively zero the file out for you from the beginning up to the point where the currently initialized allocation page is being written. That is typically significantly slower than when SQL zeroes the file out, because SQL will zero it out in much bigger chunks than the file system will do.
If it’s the transaction log file, SQL won’t zero initialize it either, but when it formats the physical log file as a set of virtual log files and the resulting virtual log file headers are written to disk (to their corresponding offset inside the physical file), the file system will have to zero out the portions of the file that have not been zeroed out yet, all the way from the beginning of the file up to the offset where the current write is targeted.
Therefore, one can conclude that everything has been thought, designed, and implemented in SQL Server so that the size of the transaction log doesn’t linearly degrades the performance of the service startup phase as its initial size increases, unless someone has been playing with TEMPDB files in ways which are not supported.