2009
10.01

The internet is filled with people trying to understand why their CSV generated by PHP/ASP.NET/whatever which uses the UTF-8 encoding doesn’t get imported correctly into Excel. To be more precise Excel doesn’t seem to be able to handle UTF-8 for CSV files at first glance.  The only reasonable solution Google came up with was to rename the file as .txt file and then select the encoding manually.

Not wanting to switch to some other encoding nor making the users life a living hell I refused to go down the suggested path. Remembering that Notepad/Wordpad had a habit of including the BOM at the beginning of UTF-8 files I decided to give it a shot.

I included the “\xEF\xBB\xBF” (UTF-8 BOM) at the beginning of the CSV to see what would happen. And to my greatest surprise Excel was importing my UTF-8 CSV perfectly. I literally couldn’t believe it.

So, to recap. In order to get Excel play nicely with UTF-8 data in CSV files you need to:

  • Include the UTF-8 BOM at the start of the file,
  • Use “;” as the delimiter.

And that’s all it takes.

The downside of including the BOM is that in OpenOffice, when the first field of the file is “Spaced and Quoted”, the quotes are not removed for some reason, but that’s a small price to pay to get Excel working.

No Comment.

Add Your Comment