Bulk insert using UNICODE data files


 

Did you ever tried to use bulk insert with unicode instead of ansi data files using a format file as described below:

BULK INSERT MyTestTable
FROM ‘c:\data2.txt’
WITH (
FORMATFILE = ‘c:\format.fmt’,DATAFILETYPE = ‘widechar’
)
go

data2.txt (UNICODE format)

"test1",test2,Praha 5 – Reporyjezøíujc
"test1",test2,Praha 5 – Reporyjezøíujc

format.fmt (ANSI format otherwise SQL Server asks for XML format file type)

9.0
3
1       SQLNCHAR            0       80      ","       1     col1         Latin1_General_CI_AS
2       SQLNCHAR            0       80      ","       2     col2         Latin1_General_CI_AS
3       SQLNCHAR            0       80      "\r\n"  3     col3         Latin1_General_CI_AS

However the following error is generated:

Msg 4863, Level 16, State 1, Line 1
Bulk load data conversion error (truncation) for row 1, column 2 (col2).
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

Solutions:

1)

BULK INSERT MyTestTable
FROM ‘c:\data2.txt’
WITH (FIELDTERMINATOR = ‘,’,
DATAFILETYPE = ‘widechar’
)
go

2)

If you really need to use format file just change as described below:

9.0
3
1       SQLNCHAR            0       80      ",\0"       1     col1         Latin1_General_CI_AS
2       SQLNCHAR            0       80      ",\0"       2     col2         Latin1_General_CI_AS
3       SQLNCHAR            0       80      "\r\0\n\0"  3     col3         Latin1_General_CI_AS

Notes:

Format file needs to be in ANSI however datafile needs to be in UNICODE due special characters, when SQL Server is doing comparison is trying to compare a 1 byte field terminator against 2 byte character data because ANSI needs single byte per character versus UNICODE that needs 2 bytes. So doing a small change on format file to use 2 bytes instead works fine.

3)

http://support.microsoft.com/kb/942660 – FIX: Error message when you run the BULK INSERT statement in Microsoft SQL Server 2005: "The bulk load failed. The column is too long in the data file for row <n>, column <n>"

Hope this will help someone.


Comments (2)

  1. Zac says:

    It's not often I leave feedback, but in this case your post helped me so much that I simply must say thank you for taking the time to make this solution available. It's very well explained and you have saved me many, many hours of frustration.

    Thank you.

  2. Francesc Fernàndez says:

    In my case the problem was produced by another cause. In the format file I specified "rn" as end of line and it worked well, but today I started to receive files with only "n" as end of line.