Bulk insert command fails to insert the correct characters?

Bulk insert command is not inserting correct characters from a flat file into the table. For instance, the character “Ñ” is displayed as “+æ“ after getting imported in the table. In other words the characters from flat file are getting changed when imported to the table using Bulk insert command.

Lets take an example.

Here is a Spanish word desempeñar. Now let’s try to add this word into table from flat file.

Step 1: Create a text file with this word desempeñar in it.

Step 2 : Create a table named test_table. Here is the table definition

CREATE TABLE test_table

(

Name [nvarchar](50) NULL

) ON [PRIMARY]

Step 3: Run this command to bulk insert data from this text file.

BULK INSERT test_table from 'c:\test.txt'

Step 4: Now run a select query. Select name from test_table

clip_image002

Strange behavior . Isn't it. I never intended this to happen.

Step 5 : Lets try to import this data by another method import /export wizard.

clip_image003

clip_image004

clip_image005

This option worked and the word got imported as it is . So what is different in this method. Watch closely and you will see a option Code page. In our case it has value 1252.

So do we have this parameter in Bulk insert command and surprisingly we have

BULK INSERT
[ database_name . [ schema_name ] . | schema_name . ] [ table_name | view_name ]
FROM ' data_file '
[ WITH
(
[ [ , ] BATCHSIZE = batch_size ]
[ [ , ] CHECK_CONSTRAINTS ]
[ [ , ] CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' } ]
[ [ , ] DATAFILETYPE =
{ 'char' | 'native' | 'widechar' | 'widenative' } ]
[ [ , ] FIELDTERMINATOR = ' field_terminator ' ]
[ [ , ] FIRSTROW = first_row ]
[ [ , ] FIRE_TRIGGERS ]
[ [ , ] FORMATFILE = ' format_file_path ' ]
[ [ , ] KEEPIDENTITY ]
[ [ , ] KEEPNULLS ]
[ [ , ] KILOBYTES_PER_BATCH = kilobytes_per_batch ]
[ [ , ] LASTROW = last_row ]
[ [ , ] MAXERRORS = max_errors ]
[ [ , ] ORDER ( { column [ ASC | DESC ] } [ , ...n ] ) ]
[ [ , ] ROWS_PER_BATCH = rows_per_batch ]
[ [ , ] ROWTERMINATOR = ' row_terminator ' ]
[ [ , ] TABLOCK ]
[ [ , ] ERRORFILE = ' file_name ' ]
) ]

Step 6: So lets run the bulk insert command with the following parameter

BULK INSERT test_table from 'c:\test.txt'

With

(

codepage='1252'

)

Bingo it worked .

clip_image007

So the morale of the story is if you are inserting a foreign word into SQL using Bulk insert or Import Export wizard then you have to chose the code page value defined for that particular language.

For further information on the code page please visit this link

https://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx

Written By: - Azad Raosaheb Sale,SE,Microsoft GTSC
Reviewed By: – Pradipta Das,TL,Microsoft GTSC 
                          Karthick Krishnamurthy,TL,Microsoft GTSC