Unable to correctly display Chinese (Unicode) characters in Excel when opened through an ASP.NET page

Recently I worked on an issue where one of our customers was streaming data from their web application as CSV-style (tab-delimited) text so that Excel would recognize and open it on the client's end. They were setting the Content-Type and Content-Disposition headers so that the file opens in MS Excel rather than in the browser. Everything would have worked had the data not contained Chinese characters.

Something like this:

Page.Response.Clear()
Page.Response.ContentType = "application/vnd.ms-excel"
Page.Response.ContentEncoding = System.Text.Encoding.UTF8
Page.Response.AddHeader("Content-Disposition", "attachment; filename=ExportData.xls")

Later in the code they built the column headers and each data row as delimited strings, which were then written to the response output.

Something like this:

'Output Column Headers as 

        columnHeaders = "HEADER1" + Chr(9) + "HEADER2"             
        columnHeaders = columnHeaders & Chr(13) & Chr(10)

[Here, Chr(9), Chr(10) and Chr(13) correspond to Tab, Linefeed and Carriage Return characters in ASCII respectively to adhere to CSV format]

Page.Response.Write(columnHeaders)
Page.Response.Write(Chr(10))

and

'Output Column Row as
columnRow = ""

After populating the columns in various strings we do this to adhere to CSV format:

columnRow = column1 & Chr(9) & column2

columnRow = columnRow & Chr(13) & Chr(10)

...............

Page.Response.Write(columnRow)        ' Finally display the data

The code above looks like it should work when the file is opened in Excel. However, when the data is sent as UTF-8 with no encoding marker at the start (say, with Chinese characters in it), Excel does not recognize the encoding and appears to fall back to a default non-Unicode encoding. With plain ASCII data this causes no visible problem, but non-ASCII characters such as Chinese will be displayed wrongly in Excel; you may see "???????" or similar garbage. The same data may display perfectly well in a web page control, say a DataGrid.
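
To see why the bytes matter, here is a minimal standalone sketch (a console module, not part of the customer's page; the module and function names are made up for illustration). It encodes two Chinese characters as UTF-8 and then decodes the same bytes as ASCII, imitating a reader that guesses the wrong encoding. Which encoding Excel actually falls back to depends on the machine, so treat this only as an illustration of the mismatch.

```vbnet
Imports System
Imports System.Text

Module EncodingMismatchDemo
    ' Hypothetical helper: encodes the text as UTF-8, then decodes those
    ' same bytes with a different encoding, as a misguided reader would.
    Function DecodeUtf8BytesAs(text As String, wrongEncoding As Encoding) As String
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(text)
        Return wrongEncoding.GetString(utf8Bytes)
    End Function

    Sub Main()
        ' Two Chinese characters, written as code points so the source
        ' file's own encoding does not matter.
        Dim sample As String = ChrW(&H4E2D) & ChrW(&H6587)

        ' Each of these characters takes three bytes in UTF-8.
        Console.WriteLine(Encoding.UTF8.GetBytes(sample).Length)

        ' Decoding those bytes as ASCII does not give the original text back.
        Console.WriteLine(DecodeUtf8BytesAs(sample, Encoding.ASCII))
    End Sub
End Module
```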

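The resolution to such an issue is to switch the response encoding from UTF-8 to Unicode (System.Text.Encoding.Unicode, which in .NET means UTF-16 little-endian) and to add the Unicode byte leader, better known as the byte order mark (BOM), to the start of the file. For UTF-16 little-endian the byte leader is the two bytes FF FE. Excel recognizes the byte leader as an indication that Unicode data is coming in and reads the file as Unicode. This way characters such as Chinese are preserved when the file is opened in Excel.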
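
If you would rather not hard-code the two bytes, .NET can report them itself. The following is a small standalone sketch (the module and function names are mine, for illustration only) that prints the byte leader of System.Text.Encoding.Unicode using Encoding.GetPreamble:

```vbnet
Imports System
Imports System.Text

Module ByteLeaderDemo
    ' Hypothetical helper: returns an encoding's byte leader (BOM)
    ' as a dash-separated hex string.
    Function PreambleHex(enc As Encoding) As String
        Return BitConverter.ToString(enc.GetPreamble())
    End Function

    Sub Main()
        ' Encoding.Unicode is UTF-16 little-endian; prints FF-FE
        Console.WriteLine(PreambleHex(Encoding.Unicode))
    End Sub
End Module
```

The array returned by Encoding.Unicode.GetPreamble() holds the same two bytes that the hand-built rgByteLeader array in the sample contains.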

Here is something you can try:

Dim rgByteLeader(1) As Byte
        rgByteLeader(0) = &HFF
        rgByteLeader(1) = &HFE

        Page.Response.Clear()
        Page.Response.ContentType = "application/vnd.ms-excel"
        Page.Response.ContentEncoding = System.Text.Encoding.Unicode
        Page.Response.AddHeader("Content-Disposition", "attachment; filename=ExportData.xls")

' Write out the Unicode byte leader (bytes FF FE) so that Excel recognizes the file as Unicode
        Page.Response.BinaryWrite(rgByteLeader)

'Output Column Headers as before

        columnHeaders = "HEADER1" + Chr(9) + "HEADER2"             
        columnHeaders = columnHeaders & Chr(13) & Chr(10)

        Page.Response.Write(columnHeaders)
        Page.Response.Write(Chr(10))

 

'Output Column Rows as before
        columnRow = ""

        .............

       columnRow = column1 & Chr(9) & column2
       columnRow = columnRow & Chr(13) & Chr(10)

       Page.Response.Write(columnRow)

       .....

       Page.Response.End()
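
As a variation on the above, the whole file can be assembled as a byte array first, which makes it easy to check outside of a web page. This is only a sketch under my own assumptions: BuildRow and BuildPayload are hypothetical helper names, and the sample values are invented.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module ExcelExportSketch
    ' Hypothetical helper: joins cell values with tabs (Chr(9)) and ends
    ' the row with carriage return + linefeed (Chr(13) & Chr(10)).
    Function BuildRow(ParamArray cells As String()) As String
        Return String.Join(vbTab, cells) & vbCrLf
    End Function

    ' Hypothetical helper: returns the byte leader (FF FE) followed by
    ' all rows encoded as UTF-16 little-endian.
    Function BuildPayload(rows As IEnumerable(Of String)) As Byte()
        Using output As New MemoryStream()
            Dim leader As Byte() = Encoding.Unicode.GetPreamble()
            output.Write(leader, 0, leader.Length)
            For Each row As String In rows
                Dim rowBytes As Byte() = Encoding.Unicode.GetBytes(row)
                output.Write(rowBytes, 0, rowBytes.Length)
            Next
            Return output.ToArray()
        End Using
    End Function

    Sub Main()
        Dim rows As New List(Of String) From {
            BuildRow("HEADER1", "HEADER2"),
            BuildRow(ChrW(&H4E2D) & ChrW(&H6587), "abc")
        }
        Dim payload As Byte() = BuildPayload(rows)

        ' The first two bytes are the byte leader; prints FF-FE
        Console.WriteLine(BitConverter.ToString(payload, 0, 2))
    End Sub
End Module
```

In the page, an array built this way could be passed to a single Page.Response.BinaryWrite call after the headers are set, instead of writing the byte leader and the strings separately. I have not tested that variant against the customer's scenario.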

 

I am no Globalization/MS-Excel expert, but I had a tough time researching this issue, so I thought I would share it with others. Hope this helps!