To prepare for the DevDiv TFS2010 upgrade, we had to copy 8TB of SQL backups about 100 miles across a WAN link so that we could restore them on our test system. The link speed was reasonably good and the latency fairly low (5ms), but with files this big the odds are against you, and sneakernet can be a good option. In our case it wasn’t possible, so we had to find the next best solution. In the end we were able to copy all 8TB over 7 days without having to resume or restart once.
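As a rough back-of-envelope check, here is the average rate those numbers imply. This is only a sketch: it assumes decimal units (1 TB = 10**12 bytes) and a full 7 x 24 hours of copying, neither of which the figures above spell out.

```python
# Average throughput implied by copying 8 TB in 7 days.
# Assumptions: decimal terabytes and continuous copying for the whole week.
TOTAL_BYTES = 8 * 10**12
DURATION_SECONDS = 7 * 24 * 60 * 60

bytes_per_second = TOTAL_BYTES / DURATION_SECONDS
megabytes_per_second = bytes_per_second / 10**6
megabits_per_second = bytes_per_second * 8 / 10**6

print(f"{megabytes_per_second:.1f} MB/s, {megabits_per_second:.0f} Mbit/s")
# prints "13.2 MB/s, 106 Mbit/s"
```

In other words, the link had to sustain a little over 100 Mbit/s on average for a week.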
The 8TB of backups were spread across 32 files of 250GB each, which makes them a little easier to deal with. The first problem that you’ll encounter when using a normal Windows file copy, XCopy, RoboCopy or TeraCopy to copy files this large is that the available memory on the source server will start to drop and eventually run out. The next problem you’ll encounter is that the connection will break for some reason and you’ll have to restart or resume the transfer.
Fortunately the EPS Windows Server Performance Team have a blog post on the issue and a great recommendation: Ask the Performance Team : Slow Large File Copy Issues
The problem lies in the way in which the copy is performed - specifically Buffered vs. Unbuffered Input/Output (I/O).
Buffered I/O describes the process by which the file system will buffer reads and writes to and from the disk in the file system cache. Buffered I/O is intended to speed up future reads and writes to the same file but it has an associated overhead cost. It is effective for speeding up access to files that may change periodically or get accessed frequently. There are two buffered I/O functions commonly used in Windows Applications such as Explorer, Copy, Robocopy or XCopy:
- CopyFile() - Copies an existing file to a new file
- CopyFileEx() - This also copies an existing file to a new file, but it can also call a specified callback function each time a portion of the copy operation is completed, thus notifying the application of its progress via the callback function. Additionally, CopyFileEx can be canceled during the copy operation.
So looking at the definition of buffered I/O above, we can see where the perceived performance problems lie - in the file system cache overhead. Unbuffered I/O (or a raw file copy) is preferred when attempting to copy a large file from one location to another when we do not intend to access the source file after the copy is complete. This will avoid the file system cache overhead and prevent the file system cache from being effectively flushed by the large file data. Many applications accomplish this by calling CreateFile() to create an empty destination file, then using the ReadFile() and WriteFile() functions to transfer the data.
- CreateFile() - The CreateFile function creates or opens a file, file stream, directory, physical disk, volume, console buffer, tape drive, communications resource, mailslot, or named pipe. The function returns a handle that can be used to access an object.
- ReadFile() - The ReadFile function reads data from a file, and starts at the position that the file pointer indicates. You can use this function for both synchronous and asynchronous operations.
- WriteFile() - The WriteFile function writes data to a file at the position specified by the file pointer. This function is designed for both synchronous and asynchronous operation.
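To make the open / read / write pattern concrete, here is a minimal Python sketch of a chunked copy loop. It is an illustration only: Python's `open()` still goes through the file system cache, so this does not give you unbuffered I/O. On Windows, bypassing the cache means calling CreateFile() with the FILE_FLAG_NO_BUFFERING flag and using sector-aligned buffers, which is not shown here. The function name and the 1 MiB chunk size are arbitrary choices for the example.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; arbitrary value for illustration


def copy_in_chunks(source_path, dest_path, chunk_size=CHUNK_SIZE):
    """Copy a file with an explicit read/write loop; return the bytes copied.

    NOTE: this shows the shape of the loop only. It still uses the OS file
    cache, unlike a true unbuffered copy.
    """
    copied = 0
    with open(source_path, "rb") as src, open(dest_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)  # analogous to ReadFile()
            if not chunk:                 # empty read means end of file
                break
            dst.write(chunk)              # analogous to WriteFile()
            copied += len(chunk)
    return copied
```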
Which Tool? ESEUTIL
Yes, the tool has some limitations – but in my experience it’s well worth the time investment to get running. See How to Run Eseutil /Y (Copy File)
To get the utility, you need access to an Exchange server or to install Exchange in Administrator-only mode. When you install Exchange in Administrator-only mode, the appropriate binaries are copied to your computer and you can then copy these three files off and use them on another computer:
It does not accept wildcard characters (such as *.* to copy all files), so you have to specify a file name and copy one file at a time. Alternatively, loop over the files with a command like this (inside a batch file, write %%f instead of %f): FOR %f IN (d:\backups\*.BAK) DO ESEUTIL /Y "%f"
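If you would rather script the one-file-at-a-time loop, the following Python sketch builds one command per backup file using the /y and /d switches. The function name, the default executable name, and the example paths are hypothetical; the sketch only builds the argument lists and does not run anything.

```python
from pathlib import PureWindowsPath


def build_eseutil_commands(source_files, dest_dir, eseutil="eseutil"):
    """Return one 'eseutil /y <source> /d <destination>' argument list per file.

    Hypothetical helper for illustration: the destination keeps the source
    file name and places it under dest_dir.
    """
    commands = []
    for source in source_files:
        source_path = PureWindowsPath(source)
        dest_path = PureWindowsPath(dest_dir) / source_path.name
        commands.append([eseutil, "/y", str(source_path), "/d", str(dest_path)])
    return commands


for command in build_eseutil_commands(
    [r"c:\Backups\Backup1.bak", r"c:\Backups\Backup2.bak"],
    r"\\destination\c$\Backups",
):
    print(" ".join(command))
```

Each list could then be passed to `subprocess.run(command, check=True)` on the source server. That relies on eseutil returning a nonzero exit code on failure, which is an assumption worth verifying before depending on it.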
DESCRIPTION: Copies a database or log file.
SYNTAX: D:\BIN\ESEUTIL /y <source file> [options]
PARAMETERS: <source file> - name of file to copy
OPTIONS: zero or more of the following switches, separated by a space:
/d<file> - destination file (default: copy source file to
/o - suppress logo
NOTES: 1) If performed on arbitrary files, this operation may fail
at the end of the file if its size is not sector-aligned.
D:\>d:\bin\eseutil /y c:\Backups\Backup1.bak /d \\destination\c$\Backups\Backup1.bak
Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Copyright (C) Microsoft Corporation. All Rights Reserved.
Initiating COPY FILE mode...
Source File: c:\Backups\Backup1.bak
Destination File: \\destination\c$\Backups\Backup1.bak
Copy Progress (% complete)
0 10 20 30 40 50 60 70 80 90 100
Operation completed successfully in 7.67 seconds.
If you read the comments on the performance team’s blog post, you’ll see that XCopy has a /J option in Windows 7 and Windows Server 2008 R2 that does unbuffered I/O. However, that’s not an option when you haven’t upgraded to R2 yet.
/J Copies using unbuffered I/O. Recommended for very large files.
Through trial and error, we determined that it was much more reliable to run eseutil.exe on the SOURCE server and push the files to the remote share. This seemed to absorb any network blips and required no manual intervention over the 7 days it took us to copy the files.
The third problem you want to avoid is getting the files copied and then finding out that they match in size but the contents are corrupt. You can check for this by generating hashes on both the source and target systems and comparing them after the copy.
One tool for this is Microsoft’s File Checksum Integrity Verifier (fciv.exe). Run it like this on each system:
fciv.exe C:\Backups -type *.bak -r -wp -xml hashes.xml
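The same hash-and-compare idea can be sketched in Python if fciv.exe isn't available. This is a minimal illustration, not a replacement for the command above: the function names are hypothetical, and SHA-256 is chosen for the example and is not necessarily what fciv.exe computes. Files are read in chunks so a 250GB backup never has to fit in memory.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_directory(root, pattern="*.bak"):
    """Map each matching file (path relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob(pattern))
    }


def find_mismatches(source_hashes, target_hashes):
    """Return names whose hashes differ or that exist on only one side."""
    names = set(source_hashes) | set(target_hashes)
    return sorted(
        name for name in names
        if source_hashes.get(name) != target_hashes.get(name)
    )
```

Run `hash_directory` on each system, bring the two dictionaries together (for example by saving them as JSON), and pass them to `find_mismatches`. An empty list means every file matched.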