Migrating UTF8 data from Oracle to SQL Server using SSMA for Oracle

08/09/2012

Before beginning the data migration, you need a clear understanding of UTF-8, the 8-bit encoding of Unicode.

It is a variable-width encoding and a strict superset of ASCII. This means that each and every character in the ASCII character set is available in UTF-8 with the same code point values. One Unicode character can be 1 byte, 2 bytes, 3 bytes, or 4 bytes in UTF-8 encoding.
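The variable width is easy to check with a few sample characters. The following is a minimal Python sketch, not part of SSMA; the helper name `utf8_length` and the sample characters are chosen here purely for illustration.

```python
# Sample characters written as escapes so the source stays pure ASCII:
# U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (emoji).
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode the given text."""
    return len(ch.encode("utf-8"))


for ch in SAMPLES:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")

# ASCII characters keep the same byte values in UTF-8.
assert "A".encode("utf-8") == "A".encode("ascii")
```

The four samples cover the 1-byte, 2-byte, 3-byte, and 4-byte cases in that order.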

Oracle’s support for Unicode

Oracle started supporting Unicode as a database character set in version 7. Unicode characters can be stored in an Oracle database in two ways: in a Unicode database, or in Unicode data types (NCHAR, NVARCHAR2, and NCLOB).

Unicode database

You can create a Unicode database, which lets you store UTF-8 encoded characters in the SQL CHAR data types (CHAR, VARCHAR2, CLOB, and LONG). To do this, specify a UTF-8 database character set (for example, AL32UTF8) when creating the database.

SQL Server support for Unicode

SQL Server has supported Unicode data since version 7.0. It stores Unicode data in its national character data types (such as nchar and nvarchar) using the UCS-2 character encoding.
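Because the source data is UTF-8 and the target stores 2-byte code units, the same string can occupy a different number of bytes on each side. The sketch below compares the two. Python has no dedicated UCS-2 codec, so it uses `utf-16-le`, which yields the same bytes as UCS-2 for characters in the Basic Multilingual Plane; that substitution and the helper name `encoded_sizes` are assumptions made for this illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Compare the UTF-8 size of a string with its size in 2-byte code units."""
    utf16 = text.encode("utf-16-le")  # stand-in for UCS-2 (assumption)
    return {
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_bytes": len(utf16),
        "utf16_code_units": len(utf16) // 2,
    }


for sample in ["A", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(sample):04X}: {encoded_sizes(sample)}")
```

Characters outside the Basic Multilingual Plane, such as the last sample, need two code units (a surrogate pair), which is worth keeping in mind when sizing target columns.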

Now let's look at how to migrate an Oracle database containing UTF-8 data to SQL Server using SSMA.

Unicode database solution

If the source is a Unicode database, Oracle is configured to store multi-byte strings in VARCHAR2 and CHAR columns or variables. In this case, you can use customized mappings to map the character types to Unicode types in SQL Server.

Oracle	SQL Server 2005/2008
VARCHAR2	nvarchar
CHAR	nchar
LONG, CLOB	nvarchar(max)

Otherwise, non-ASCII strings can be corrupted during data migration.
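To see why non-ASCII strings can be distorted or lost, consider what happens when text is forced into a single-byte code page. The Python sketch below only mimics that effect; it does not reproduce what SSMA or SQL Server does internally. The function names and the choice of code page `cp1252` are assumptions for illustration.

```python
def store_in_code_page(text: str, code_page: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page, as a non-Unicode column might."""
    # Characters the code page cannot represent are replaced with '?'.
    return text.encode(code_page, errors="replace").decode(code_page)


def store_as_unicode(text: str) -> str:
    """Round-trip text through a 2-byte Unicode encoding, as an nvarchar column might."""
    return text.encode("utf-16-le").decode("utf-16-le")


japanese = "\u65e5\u672c\u8a9e"  # three Japanese characters, written as escapes

print(store_in_code_page(japanese))            # the characters are lost
print(store_as_unicode(japanese) == japanese)  # the characters survive
```

Mapping the columns to nvarchar and nchar corresponds to the second function: every character survives the round trip.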

Unicode data type solution

For the Unicode data type solution, source strings declared as national types (NVARCHAR2 and NCHAR) are automatically mapped to nvarchar and nchar. The large object type NCLOB is automatically mapped to nvarchar(max). There is therefore no need to change the default mappings for these data types.
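The mappings from both solutions can be summarized as a simple lookup. This Python sketch is only a way to reason about the mappings, for example when reviewing a schema inventory. It is not an SSMA API, and the names `UNICODE_DATABASE_MAPPING`, `NATIONAL_TYPE_MAPPING`, and `target_type` are hypothetical. It also ignores byte versus character length semantics.

```python
from typing import Optional

# Customized mappings needed when the Oracle database itself is Unicode.
UNICODE_DATABASE_MAPPING = {
    "VARCHAR2": "nvarchar",
    "CHAR": "nchar",
    "LONG": "nvarchar(max)",
    "CLOB": "nvarchar(max)",
}

# Mappings for national types, which need no customization.
NATIONAL_TYPE_MAPPING = {
    "NVARCHAR2": "nvarchar",
    "NCHAR": "nchar",
    "NCLOB": "nvarchar(max)",
}


def target_type(oracle_type: str, length: Optional[int] = None) -> str:
    """Return the SQL Server Unicode type for an Oracle character type."""
    combined = {**UNICODE_DATABASE_MAPPING, **NATIONAL_TYPE_MAPPING}
    base = combined[oracle_type.upper()]  # raises KeyError for unknown types
    if length is not None and not base.endswith("(max)"):
        return f"{base}({length})"
    return base


print(target_type("VARCHAR2", 50))
print(target_type("clob"))
```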

Customizing Data Type Mappings

You can customize type mappings at the project level, object category level (such as all tables), or object level. Settings are inherited from the higher level unless they are overridden at a lower level. For example, if you map smallmoney to money at the project level, all objects in the project will use this mapping unless you customize the mapping at the object or category level.
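The inheritance rule can be pictured as a lookup that checks the most specific level first. The following Python sketch models that behavior under the assumption that each level is a plain dictionary of overrides. It illustrates the rule described above and is not how SSMA stores its settings; `resolve_mapping` and the sample dictionaries are hypothetical.

```python
from typing import Dict, Optional

Mapping = Dict[str, str]


def resolve_mapping(
    source_type: str,
    project: Mapping,
    category: Optional[Mapping] = None,
    obj: Optional[Mapping] = None,
) -> Optional[str]:
    """Return the target type, preferring object, then category, then project settings."""
    for level in (obj, category, project):
        if level and source_type in level:
            return level[source_type]
    return None  # no mapping defined at any level


project_settings = {"VARCHAR2": "nvarchar", "CHAR": "nchar"}
table_overrides = {"VARCHAR2": "varchar"}  # one table deliberately kept non-Unicode

print(resolve_mapping("VARCHAR2", project_settings))                       # inherited
print(resolve_mapping("VARCHAR2", project_settings, obj=table_overrides))  # overridden
```

In this model a project-level mapping applies everywhere unless a category or object supplies its own entry for the same source type.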

To map data types at the project, database, or object level, follow these steps:

To customize data type mapping for the whole project, open the Project Settings dialog box:

a. On the Tools menu, select Project Settings.

b. In the left pane, select Type Mapping. The type mapping chart and buttons appear in the right pane.

Or, to customize data type mapping at the database, table, view, or stored procedure level, select the database, object category, or object in Oracle Metadata Explorer:

c. In Oracle Metadata Explorer, select the folder or object to customize.

d. In the right pane, click the Type Mapping tab.

Author: Ajay (MSFT), SQL Developer Engineer, Microsoft

Reviewed by: Smat (MSFT), SQL Escalation Services, Microsoft
