Accelerating Oracle -> SQL Server Migrations with Fusion-io ioMemory

Article
06/09/2011

We have recently completed several migrations at customer sites using Fusion-io ioMemory Devices to greatly accelerate throughput and reduce downtime. A correctly designed and configured infrastructure and migration strategy can achieve throughput of between 1TB to 2TB per hour. In our customer tests we achieved several cases of 6TB Oracle systems exported & imported to SQL Server in 7 hours.

What are Fusion-io devices & are they supported with SQL Server?

Fusion-io devices are ultra-high performance Solid-State Devices. Unlike conventional mechanical disks these devices have no moving parts. Fusion-io devices plug directly into the PCIe bus in a server and therefore do not pass through the SAN fabric components such as HBAs, FC switches and frontend controllers on the SAN. A single Fusion-io device IOPS capability is measured in hundreds of thousands of IOPS – the largest/fastest Fusion-io device has more than 1,200,000 IOPS. One standard mechanical disk has between 150-200 IOPS.

Fusion-io SSD disks and selected other SSD brands are fully supported for use with SQL Server. The SQL Server Development Lab has extensively tested and validated Fusion-io devices.

Why are Fusion-io devices needed to speed up OS/DB Migrations? Isn’t SAN disk fast enough?

Over the last 5-6 years the capabilities of Intel hardware, the Windows OS and SQL Server have evolved exponentially. SAP systems larger than 10TB to 15TB were exclusively found on UNIX platform 7-10+ years ago. Modern Intel Westmere EX 4 processor servers benchmark over 75,000 SAPS and close to 14,000 users. 8 processor configurations benchmark more than 130,000 SAPS and 24,000 users. Today there are many SAP customers with DB sizes larger than 10TB. We have a large number of UNIX/Oracle customers that wish to move onto cheaper lower cost commodity hardware.

In response to this requirement Microsoft released traceflag 610 for SQL Server to “shutdown” almost all transaction logging during OS/DB Migrations. Traceflag 610 is documented in Note 1241751 - SQL Server minimal logging extensions. Traceflag 610 is discussed in section 9 of my OS/DB Migration FAQ document and is generally recommended for all migrations to SQL Server. This feature dramatically reduces both the size of the transaction log and the IO throughput to the transaction log file, thereby speeding up the migration process. Traceflag 610 works when the database is in BULK_LOGGED and SIMPLE mode.

There is one case where Traceflag 610 is ineffective and SQL Server will switch to standard SIMPLE logging. Very large 10-15TB Oracle systems are almost always exported using Oracle ROW ID table splitting and with an unsorted export. The SAP on Oracle schema unlike SAP on SQL Server schema does not store data in a clustered (sorted) index. Very large SAP systems typically take a very long time to export if the export is configured to sort the data. Section 2 of my OS/DB Migration FAQ document has more details.

Traceflag 610 is only effective if the import source is already sorted, therefore we have found very high LOGWRITE and LOGBUFFER waits. The reason for high LOGWRITE and LOGBUFFER wait times is that SQL Server is sorting all the data in the clustered index to ensure all rows are in an ascending sequence as per the primary index. As new rows get inserted with R3LOAD, there is a high chance that considerable data movement is occurring as rows are shuffled to remain in ascending sequence. Eventually with large OS/DB migrations this leads to very high utilization on the transaction log which increases LOGWRITE times and a secondary effect of high LOGBUFFER times.

The good news is that Fusion-io and other solid state devices completely mitigate the overhead of the additional logging due to an unsorted import source.

How to detect excessive transaction logging?

SQL Server Wait Statistics can be used to analyse if transaction logging or other factors are impacting performance.

To analyse performance fully review this document run this command. This will purge the current Wait Statistics. It is usual for the LOGWRITE performance to be somewhat lower during a OS/DB migration because the Log Write size will be around 60kb rather than the usual 8kb

DBCC SQLPERF ( "sys.dm_os_wait_stats" , CLEAR )

Note: Do not run this command on a running SAP system as this command will cause serious disruption to the SAP SQL Server performance collector and related statistics. This command should be used only while optimizing a R3LOAD import

Now run the OS/DB migration import. When the import finishes immediately run this command:

SELECT * FROM sys.dm_os_wait_stats

Copy and paste the Wait Statistics from SQL Management Studio into Excel. It is useful to add the columns “Wait/Requests” and to express the Wait Time as a percentage of total Wait Time.

The below example displays 60% of Wait Time was around transaction logging. If WRITELOG and LOGBUFFER are more than 5-10% in total it is likely the performance of the transaction log is a limiting factor.

Wait Type	Requests	Wait Time	Signal Wait Time	Wait/Requests	Time %
LOGBUFFER	107572144	480955987	573028200	4.47	38.63%
WRITELOG	54366633	288127653	146760600	5.30	23.14%
ASYNC_NETWORK_IO	130867116	110600897	336770500	0.85	8.88%
LCK_M_IX	6506	103455497	5509479	15901.55	8.31%
SOS_SCHEDULER_YIELD	27292022	76591533	283617	2.81	6.15%
PAGELATCH_SH	461479	40322038	501463	87.38	3.24%
LAZYWRITER_SLEEP	1634918	40099749	459286	24.53	3.22%
LATCH_EX	13971	36515448	1244014	2613.66	2.93%
PAGEIOLATCH_EX	2696925	7986746	67403	2.96	0.0064
LATCH_SH	19696	6667832	1076536	338.54	0.54%

The example below shows an OS/DB Migration Import and indicates that Network was the largest bottleneck on this system consuming 38.92% of all time. WRITELOG and LOGBUFFER are performing well in this case due to the addition of a Fusion-io card.

Wait Type	Requests	Wait Time	Signal Wait Time	Wait/Requests	Time %
Total	699504300	1753331000	573028200	2.51	100.00%
ASYNC_NETWORK_IO	418040400	682462100	146760600	1.63	38.92%
SOS_SCHEDULER_YIELD	175555400	336949100	336770500	1.92	19.22%
LAZYWRITER_SLEEP	16000280	293505800	5509479	18.34	16.74%
CXPACKET	16429890	101365500	283617	6.17	5.78%
IO_COMPLETION	12262320	43917180	501463	3.58	2.50%
PAGELATCH_UP	463536	6751748	459286	14.57	0.39%
WRITELOG	1800964	6746422	1244014	3.75	0.38%
LOGBUFFER	116672	133657	67403	1.15	0.01%
SLEEP_BPOOL_FLUSH	6078143	5279898	1076536	0.87	0.30%
PAGEIOLATCH_SH	20012	213189	3905	10.65	0.01%
PAGEIOLATCH_UP	8832	195787	6573	22.17	0.01%
MSQL_XP	10077	195254	0	19.38	0.01%
PREEMPTIVE_OS_GET	10077	195027	0	19.35	0.01%

A common scenario while importing during SAP migrations is that the waits on LOGBUFFER wait can become high whereas WRITELOG waits still look fine with less than 5ms latency. The reason for this is due to SQL Server limiting the total number of outstanding Log Write IOs. For details on the limits see this article from SQL CAT Team.

This results in worker threads of SQL Server possibility waiting to write into the log buffers. Since all log buffers might be outstanding, there is a short wait time for a log buffers to be available. Hence speeding up I/O latency against the transaction log volume will reduce the chance of SQL Server hitting the limits of outstanding I/O or log volumes. With that the wait on a free log buffer would be reduced.

Some common Wait conditions during R3LOAD imports and their possible solutions are listed below:

WRITELOG/LOGBUFFER : Put Transaction Log on Fusion-io disk

ASYNC_NETWORK_IO : Use 10G network cards with RSS

CXPACKET : Do not set MAXDOP to 0, rather use a value such as 4

SOS_SCHEDULER_YIELD : Use powerful new Intel/AMD based servers

The advantage of Fusion-io cards in reducing I/O latency is twofold. First is that these devices reduce IO latency by factors of hundreds or thousands compared to mechanical disks. Second advantage is the fact that the cards are connected directly with the PCIe slot without any further controller card or switches to the storage. Therefore the I/O latency is reduced dramatically because no IO travels through the SAN fabric. The entire bandwidth of the SAN fabric & cache is available for writing to the SQL Datafiles.

Windows Performance Monitor Disk Counters will show between 1-5ms per IO READ or WRITE (0.001-0.005 seconds) for mechanical disks on midrange SAN. Fusion-io devices with the same workload can show 0.01-0.001ms per IO READ or WRITE (0.00001-0.000001 seconds).

How to deploy Fusion-io devices?

Fusion-io and other solid-state devices require addition configuration and installation steps to achieve maximum throughput. The exact deployment guidelines are slightly different between different solid-state vendors, therefore this blog will not go into details of drive formatting and configuration. We will release a subsequent blog and whitepaper on the detailed configuration of Fusion-io disks (and other solid-state vendors) specifically for optimizing the OS/DB Migration effort.

Thanks to Amit Jethva

Contributors: Sumeet, Mathew & Joshua from Fusion-io

Accelerating Oracle -> SQL Server Migrations with Fusion-io ioMemory

Additional resources