PCIE SSD Update - Optimizing SQL Backup Performance

In my last blog entry (https://blogs.msdn.com/b/microsoftbob/archive/2010/11/06/the-potential-of-ssd-for-millions-of-iops-on-increasingly-larger-databases.aspx), I shared the throughput numbers backing up from 2 Fusion-IO devices to a 4-spindle 15K SAS Raid-10 device.  I also shared the numbers of backing up to NUL device, and was surprised they weren’t better especially given the advertised throughput of 700 MB/Sec for a standard SLC Fusion-IO IoDrive. 

I did some searching and found out that there are several parameters that can have a profound impact on backup speed, especially related to SSDs.  The following article was especially helpful:

https://henkvandervalk.com/how-to-increase-sql-database-full-backup-speed-using-compression-and-solid-state-disks

Also, there is an article by the SQLCAT team that discusses how to tune backups:

https://sqlcat.com/technicalnotes/archive/2009/02/16/tuning-backup-compression-part-2.aspx

Armed with this, I did some trial and error to find optimal settings and retested both the backup to disk and the backup to NUL.  It turns out that for my scenarios using 6 backup files on the target media (SQL will stripe the backup to multiple files) along with a buffer count of 64 and a Block Size of 65536 (defaults to just 512) almost doubled the backup speed.  On the new test, the backup to disk completed in less than 2 minutes with a throughput of 618 MB/Sec contrasted with the prior 361 MB/Sec.  Changing the MaxTransferSize did not have measurable impact and made things worse when too large.

The Backup to Null device was an interesting experience.  The first time around I left the Compression option on, and achieved 765 GB/Sec with the changed parameters, but noticed that the CPU was bound for most of the backup, pegged at close to 98%.  Normally, Backup Compression improves throughput, but in this case the processors were not able to keep up.  Based on that, I did a Backup to NUL without compression and obtained a whopping 1.42 GB/Sec which matches to what the 2 IoDrives are capable of in tandem.  Here’s a screen snapshot for the skeptics:

image

BACKUP DATABASE [tp_v5] TO  -- DISK = -- 'NUL:'
disk='nul:'

WITH NOFORMAT, NOINIT,  NAME = N'tp_v5-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, NO_COMPRESSION, STATS = 10
-- Magic:
,BUFFERCOUNT = 64
,BLOCKSIZE = 65536

10 percent processed.
20 percent processed.
30 percent processed.
40 percent processed.
50 percent processed.
60 percent processed.
70 percent processed.
80 percent processed.
90 percent processed.
Processed 41488 pages for database 'tp_v5', file 'TradingOptimizer_v2' on file 1.
Processed 160288 pages for database 'tp_v5', file 'History09' on file 1.
Processed 67584 pages for database 'tp_v5', file 'History01' on file 1.
Processed 63872 pages for database 'tp_v5', file 'History02' on file 1.
Processed 68424 pages for database 'tp_v5', file 'History03' on file 1.
Processed 60408 pages for database 'tp_v5', file 'History04' on file 1.
Processed 69936 pages for database 'tp_v5', file 'History05' on file 1.
Processed 85040 pages for database 'tp_v5', file 'History06' on file 1.
Processed 103368 pages for database 'tp_v5', file 'History07' on file 1.
Processed 132088 pages for database 'tp_v5', file 'History08' on file 1.
Processed 1225312 pages for database 'tp_v5', file 'HistoryData2a' on file 1.
Processed 580112 pages for database 'tp_v5', file 'LoadData2' on file 1.
Processed 408120 pages for database 'tp_v5', file 'LoadData01' on file 1.
Processed 288784 pages for database 'tp_v5', file 'MiscData2' on file 1.
Processed 233408 pages for database 'tp_v5', file 'MiscData3' on file 1.
Processed 154032 pages for database 'tp_v5', file 'History10' on file 1.
Processed 64 pages for database 'tp_v5', file 'Load00' on file 1.
Processed 66824 pages for database 'tp_v5', file 'Load01' on file 1.
Processed 173824 pages for database 'tp_v5', file 'History09b' on file 1.
Processed 188768 pages for database 'tp_v5', file 'History09c' on file 1.
Processed 205832 pages for database 'tp_v5', file 'History09d' on file 1.
Processed 195912 pages for database 'tp_v5', file 'History10a' on file 1.
Processed 209480 pages for database 'tp_v5', file 'History10b' on file 1.
Processed 206416 pages for database 'tp_v5', file 'History10c' on file 1.
Processed 100824 pages for database 'tp_v5', file 'History10d' on file 1.
Processed 160 pages for database 'tp_v5', file 'Archive01' on file 1.
Processed 65920 pages for database 'tp_v5', file 'Load02' on file 1.
Processed 77208 pages for database 'tp_v5', file 'Load03' on file 1.
Processed 73488 pages for database 'tp_v5', file 'Load04' on file 1.
Processed 72728 pages for database 'tp_v5', file 'Load05' on file 1.
Processed 80248 pages for database 'tp_v5', file 'Load06' on file 1.
Processed 78896 pages for database 'tp_v5', file 'Load07' on file 1.
Processed 80944 pages for database 'tp_v5', file 'Load08' on file 1.
Processed 90816 pages for database 'tp_v5', file 'Load09' on file 1.
Processed 95600 pages for database 'tp_v5', file 'Load10' on file 1.
Processed 85848 pages for database 'tp_v5', file 'Load11' on file 1.
Processed 96328 pages for database 'tp_v5', file 'Load12' on file 1.
Processed 82880 pages for database 'tp_v5', file 'Load13' on file 1.
Processed 72600 pages for database 'tp_v5', file 'Load14' on file 1.
Processed 89216 pages for database 'tp_v5', file 'Load15' on file 1.
Processed 83320 pages for database 'tp_v5', file 'Load16' on file 1.
Processed 83472 pages for database 'tp_v5', file 'Load17' on file 1.
Processed 85880 pages for database 'tp_v5', file 'Load18' on file 1.
Processed 238520 pages for database 'tp_v5', file 'Load19' on file 1.
Processed 224080 pages for database 'tp_v5', file 'Load20' on file 1.
Processed 169024 pages for database 'tp_v5', file 'Load21' on file 1.
Processed 176960 pages for database 'tp_v5', file 'Load22' on file 1.
Processed 80568 pages for database 'tp_v5', file 'Load23' on file 1.
Processed 16 pages for database 'tp_v5', file 'Load24' on file 1.
Processed 16 pages for database 'tp_v5', file 'Sim42' on file 1.
Processed 340576 pages for database 'tp_v5', file 'Intraday01' on file 1.
Processed 327112 pages for database 'tp_v5', file 'Intraday' on file 1.
100 percent processed.
Processed 1 pages for database 'tp_v5', file 'TradingOptimizer_v2_log' on file 1.
BACKUP DATABASE successfully processed 8042633 pages in 43.967 seconds (1429.096 MB/sec).

Here is what the backup command looked like and the final result for actually going to the Raid-10 devices

BACKUP DATABASE [tp_v5] TO  -- DISK = -- 'NUL:'
disk='h:\sqlbackup\folibackuptest1.bak',
disk='h:\sqlbackup\folibackuptest2.bak',
disk='h:\sqlbackup\folibackuptest3.bak',
disk='h:\sqlbackup\folibackuptest4.bak',
disk='h:\sqlbackup\folibackuptest5.bak',
disk='h:\sqlbackup\folibackuptest6.bak'

WITH NOFORMAT, NOINIT,  NAME = N'tp_v5-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, COMPRESSION, STATS = 10
-- Magic:
,BUFFERCOUNT = 64
,BLOCKSIZE = 65536

And the final result:

BACKUP DATABASE successfully processed 8042633 pages in 101.513 seconds (618.965 MB/sec).

I was actually surprised to see that 4 146 SAS Drives could support this write throughput, but this is probably because the throughput is measured relative to the backup compression.  I.e., the source data was written at 618 MB/Sec, but the target throughput on the disk did not need to be that high as only about 25GB actually had to be written to disk for the backup of the 150GB database.  So, the actual combined write speed on the SAS drives was probably closer to about 225 MB/SEC

So, this makes me fairly certain that 4 Fusion-IO 1.28 TB MLC duo cards could read a SQL Database at a total of 6 GB/Sec and then write at a speed 6 GB/Sec to another server with 4 SLC duo cards (Backup compression typically shrinks by a factor of at least 5 so even a 5 TB database should fit on 4 320GB cards.  In that case, a Infiniband Network of 40Gb between the 2 servers is actually the bottleneck, as that probably won’t allow more than 5GB/Sec.

All this testing and blogging is getting me way behind on my day job and my research work, so you may not hear from me again for a while.  Hopefully you’ve found it useful.

I plan to create another blog for my research work (probably on Microsoft Live) related to my PhD and include that link as some of you have expressed interest in this. 

Also, some of you have wondered why I am only posting info related to the Fusion-io SSDs and not other devices.  The reason is that these are the only ones that I own and have personal experience with other than some Intel 2.5 SSD drives.   If another vendor wants to send me demo versions of their high-end products, I’ll be happy to try them out and let you know what I find (hint, hint…) Smile

Thanks for reading.

Technorati Tags: SSD,PCIE SSD,Fusion-io,IoDrive,Duo,SQL Server Performance,SQL Server Backup,Optimizing SQL Server Backup