Accelerating Oracle -> SQL Server Migrations with Fusion-io ioMemory

We have recently completed several migrations at customer sites using Fusion-io ioMemory Devices to greatly accelerate throughput and reduce downtime. A correctly designed and configured infrastructure and migration strategy can achieve throughput of between 1TB to 2TB per hour. In our customer tests we achieved several cases of 6TB Oracle systems exported & imported to SQL Server in 7 hours.

What are Fusion-io devices & are they supported with SQL Server?

Fusion-io devices are ultra-high performance Solid-State Devices. Unlike conventional mechanical disks these devices have no moving parts. Fusion-io devices plug directly into the PCIe bus in a server and therefore do not pass through the SAN fabric components such as HBAs, FC switches and frontend controllers on the SAN. A single Fusion-io device IOPS capability is measured in hundreds of thousands of IOPS – the largest/fastest Fusion-io device has more than 1,200,000 IOPS. One standard mechanical disk has between 150-200 IOPS.

Fusion-io SSD disks and selected other SSD brands are fully supported for use with SQL Server. The SQL Server Development Lab has extensively tested and validated Fusion-io devices.

Why are Fusion-io devices needed to speed up OS/DB Migrations? Isn’t SAN disk fast enough?

Over the last 5-6 years the capabilities of Intel hardware, the Windows OS and SQL Server have evolved exponentially. SAP systems larger than 10TB to 15TB were exclusively found on UNIX platform 7-10+ years ago. Modern Intel Westmere EX 4 processor servers benchmark over 75,000 SAPS and close to 14,000 users. 8 processor configurations benchmark more than 130,000 SAPS and 24,000 users. Today there are many SAP customers with DB sizes larger than 10TB. We have a large number of UNIX/Oracle customers that wish to move onto cheaper lower cost commodity hardware.

In response to this requirement Microsoft released traceflag 610 for SQL Server to “shutdown” almost all transaction logging during OS/DB Migrations. Traceflag 610 is documented in Note 1241751 - SQL Server minimal logging extensions. Traceflag 610 is discussed in section 9 of my OS/DB Migration FAQ document and is generally recommended for all migrations to SQL Server. This feature dramatically reduces both the size of the transaction log and the IO throughput to the transaction log file, thereby speeding up the migration process. Traceflag 610 works when the database is in BULK_LOGGED and SIMPLE mode.

There is one case where Traceflag 610 is ineffective and SQL Server will switch to standard SIMPLE logging. Very large 10-15TB Oracle systems are almost always exported using Oracle ROW ID table splitting and with an unsorted export. The SAP on Oracle schema unlike SAP on SQL Server schema does not store data in a clustered (sorted) index. Very large SAP systems typically take a very long time to export if the export is configured to sort the data. Section 2 of my OS/DB Migration FAQ document has more details.

Traceflag 610 is only effective if the import source is already sorted, therefore we have found very high LOGWRITE and LOGBUFFER waits. The reason for high LOGWRITE and LOGBUFFER wait times is that SQL Server is sorting all the data in the clustered index to ensure all rows are in an ascending sequence as per the primary index. As new rows get inserted with R3LOAD, there is a high chance that considerable data movement is occurring as rows are shuffled to remain in ascending sequence. Eventually with large OS/DB migrations this leads to very high utilization on the transaction log which increases LOGWRITE times and a secondary effect of high LOGBUFFER times.

The good news is that Fusion-io and other solid state devices completely mitigate the overhead of the additional logging due to an unsorted import source.

How to detect excessive transaction logging?

SQL Server Wait Statistics can be used to analyse if transaction logging or other factors are impacting performance.

To analyse performance fully review this document run this command. This will purge the current Wait Statistics. It is usual for the LOGWRITE performance to be somewhat lower during a OS/DB migration because the Log Write size will be around 60kb rather than the usual 8kb

DBCC SQLPERF ( "sys.dm_os_wait_stats" , CLEAR )

Note: Do not run this command on a running SAP system as this command will cause serious disruption to the SAP SQL Server performance collector and related statistics. This command should be used only while optimizing a R3LOAD import

Now run the OS/DB migration import. When the import finishes immediately run this command:

SELECT * FROM sys.dm_os_wait_stats

Copy and paste the Wait Statistics from SQL Management Studio into Excel. It is useful to add the columns “Wait/Requests” and to express the Wait Time as a percentage of total Wait Time.

The below example displays 60% of Wait Time was around transaction logging. If WRITELOG and LOGBUFFER are more than 5-10% in total it is likely the performance of the transaction log is a limiting factor.

Wait Type

Requests

Wait Time

Signal Wait Time

Wait/Requests

Time %

LOGBUFFER

107572144

480955987

573028200

4.47

38.63%

WRITELOG

54366633

288127653

146760600

5.30

23.14%

ASYNC_NETWORK_IO

130867116

110600897

336770500

0.85

8.88%

LCK_M_IX

6506

103455497

5509479

15901.55

8.31%

SOS_SCHEDULER_YIELD

27292022

76591533

283617

2.81

6.15%

PAGELATCH_SH

461479

40322038

501463

87.38

3.24%

LAZYWRITER_SLEEP

1634918

40099749

459286

24.53

3.22%

LATCH_EX

13971

36515448

1244014

2613.66

2.93%

PAGEIOLATCH_EX

2696925

7986746

67403

2.96

0.0064

LATCH_SH

19696

6667832

1076536

338.54

0.54%

The example below shows an OS/DB Migration Import and indicates that Network was the largest bottleneck on this system consuming 38.92% of all time. WRITELOG and LOGBUFFER are performing well in this case due to the addition of a Fusion-io card.

Wait Type

Requests

Wait Time

Signal Wait Time

Wait/Requests

Time %

Total

699504300

1753331000

573028200

2.51

100.00%

ASYNC_NETWORK_IO

418040400

682462100

146760600

1.63

38.92%

SOS_SCHEDULER_YIELD

175555400

336949100

336770500

1.92

19.22%

LAZYWRITER_SLEEP

16000280

293505800

5509479

18.34

16.74%

CXPACKET

16429890

101365500

283617

6.17

5.78%

IO_COMPLETION

12262320

43917180

501463

3.58

2.50%

PAGELATCH_UP

463536

6751748

459286

14.57

0.39%

WRITELOG

1800964

6746422

1244014

3.75

0.38%

LOGBUFFER

116672

133657

67403

1.15

0.01%

SLEEP_BPOOL_FLUSH

6078143

5279898

1076536

0.87

0.30%

PAGEIOLATCH_SH

20012

213189

3905

10.65

0.01%

PAGEIOLATCH_UP

8832

195787

6573

22.17

0.01%

MSQL_XP

10077

195254

0

19.38

0.01%

PREEMPTIVE_OS_GET

10077

195027

0

19.35

0.01%

A common scenario while importing during SAP migrations is that the waits on LOGBUFFER wait can become high whereas WRITELOG waits still look fine with less than 5ms latency. The reason for this is due to SQL Server limiting the total number of outstanding Log Write IOs. For details on the limits see this article from SQL CAT Team.

This results in worker threads of SQL Server possibility waiting to write into the log buffers. Since all log buffers might be outstanding, there is a short wait time for a log buffers to be available. Hence speeding up I/O latency against the transaction log volume will reduce the chance of SQL Server hitting the limits of outstanding I/O or log volumes. With that the wait on a free log buffer would be reduced.

Some common Wait conditions during R3LOAD imports and their possible solutions are listed below:

WRITELOG/LOGBUFFER : Put Transaction Log on Fusion-io disk

ASYNC_NETWORK_IO : Use 10G network cards with RSS

CXPACKET : Do not set MAXDOP to 0, rather use a value such as 4

SOS_SCHEDULER_YIELD : Use powerful new Intel/AMD based servers

The advantage of Fusion-io cards in reducing I/O latency is twofold. First is that these devices reduce IO latency by factors of hundreds or thousands compared to mechanical disks. Second advantage is the fact that the cards are connected directly with the PCIe slot without any further controller card or switches to the storage. Therefore the I/O latency is reduced dramatically because no IO travels through the SAN fabric. The entire bandwidth of the SAN fabric & cache is available for writing to the SQL Datafiles.

Windows Performance Monitor Disk Counters will show between 1-5ms per IO READ or WRITE (0.001-0.005 seconds) for mechanical disks on midrange SAN. Fusion-io devices with the same workload can show 0.01-0.001ms per IO READ or WRITE (0.00001-0.000001 seconds).

How to deploy Fusion-io devices?

Fusion-io and other solid-state devices require addition configuration and installation steps to achieve maximum throughput. The exact deployment guidelines are slightly different between different solid-state vendors, therefore this blog will not go into details of drive formatting and configuration. We will release a subsequent blog and whitepaper on the detailed configuration of Fusion-io disks (and other solid-state vendors) specifically for optimizing the OS/DB Migration effort.

Thanks to Amit Jethva

Contributors: Sumeet, Mathew & Joshua from Fusion-io

Fusion logo_Horiz_sm