Bulk insert using UNICODE data files
Did you ever tried to use bulk insert with unicode instead of ansi data files using a format file as described below:
BULK INSERT MyTestTable
FROM 'c:\data2.txt'
WITH (
FORMATFILE = 'c:\format.fmt',DATAFILETYPE = 'widechar'
)
go
data2.txt (UNICODE format)
"test1",test2,Praha 5 - Reporyjezøíujc
"test1",test2,Praha 5 – Reporyjezøíujc
format.fmt (ANSI format otherwise SQL Server asks for XML format file type)
9.0
3
1 SQLNCHAR 0 80 "," 1 col1 Latin1_General_CI_AS
2 SQLNCHAR 0 80 "," 2 col2 Latin1_General_CI_AS
3 SQLNCHAR 0 80 "\r\n" 3 col3 Latin1_General_CI_AS
However the following error is generated:
Msg 4863, Level 16, State 1, Line 1
Bulk load data conversion error (truncation) for row 1, column 2 (col2).
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
Solutions:
1)
BULK INSERT MyTestTable
FROM 'c:\data2.txt'
WITH (FIELDTERMINATOR = ',',
DATAFILETYPE = 'widechar'
)
go
2)
If you really need to use format file just change as described below:
9.0
3
1 SQLNCHAR 0 80 ", \0" 1 col1 Latin1_General_CI_AS
2 SQLNCHAR 0 80 ", \0" 2 col2 Latin1_General_CI_AS
3 SQLNCHAR 0 80 "\r \0\n \0" 3 col3 Latin1_General_CI_AS
Notes:
Format file needs to be in ANSI however datafile needs to be in UNICODE due special characters, when SQL Server is doing comparison is trying to compare a 1 byte field terminator against 2 byte character data because ANSI needs single byte per character versus UNICODE that needs 2 bytes. So doing a small change on format file to use 2 bytes instead works fine.
3)
https://support.microsoft.com/kb/942660 - FIX: Error message when you run the BULK INSERT statement in Microsoft SQL Server 2005: "The bulk load failed. The column is too long in the data file for row <n>, column <n>"
Hope this will help someone.