SYSK 230: Do You Need BizTalk To Load a CSV File?

Consider the following scenario: you get a comma separated values (CSV) file from your business partner via a file upload feature available on your B2B web site. For this file to be processed, you need to convert it to a relational format (e.g. data table). So, what do you do?

 

You could copy the uploaded file to a BizTalk receive folder that’s monitored by BizTalk engine, that’ll start a BizTalk orchestration... You likely would’ve used the BizTalk mapper to convert the CSV schema to the format your application can easily read… When the file arrives, the orchestration will convert the CSV to an XML document, then you could use a BizTalk Adapter for SQL Server to load the data into your SQL Server database…

 

But wait! First, you transfer relational data into XML; then, convert XML back to relational? Doesn’t seem like the most optimum path…

 

Now, don’t take me wrong – I love BizTalk! And if you need enterprise level workflow implementation, with high degree of scalability, fail over support, throttling capabilities, business activity monitoring, and many other features that come with BizTalk server, then the small overhead of using XML message format for communication between components within the orchestration pipeline is a small price to pay, in my opinion.

 

This blog is meant solely to show that you can, relatively easily, convert CSV data to a relational format (e.g. DataTable) without the use of the BizTalk mapper. Below is a snippet of code that should get you started.

 

// Remember to add SHE...

// This is just a sample to demonstrate a way to convert data from

// CSV to relational format

 

string path = @"C:\Temp\CSV2XML";

string fileName = @"test.csv";

string cnString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + path + ";Extensions=asc,csv,tab,txt;Persist Security Info=False";

           

System.Data.Odbc.OdbcConnection cn = new System.Data.Odbc.OdbcConnection(cnString);

cn.Open();

System.Data.Odbc.OdbcCommand cmd = new System.Data.Odbc.OdbcCommand(String.Format("select * from [{0}]", fileName), cn);

System.Data.Odbc.OdbcDataAdapter da = new System.Data.Odbc.OdbcDataAdapter(cmd);

System.Data.DataSet ds = new DataSet();

da.Fill(ds, "YourTableName");

cmd.Dispose();

cn.Dispose();

One more thing, make sure you have the schema.ini file in same folder as your .scv data file. The schema.ini file should contain the following information (see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetschema_ini_file.asp for more detailed information):

- File name

- Whether or not the data file includes column headers as first row

- The delimiter (e.g. CSVDelimited, TabDelimited, FixedLength)

- How many rows to scan to determine data types (if zero is used, the entire file is scanned)

- File character set (e.g. ANSI, OEM)

 

Here is the schema.ini file I used with the code above:

[Test.txt]

ColNameHeader=True

Format=CSVDelimited

MaxScanRows=25

CharacterSet=OEM