Export as Unicode text

Ádám Csillag 2 years ago in Home Portal updated by Jussi Rautio 2 years ago

Hi, we'd like to be able to export views as Unicode text to avoid CSV character corruption issues.



Satisfaction mark by Ádám Csillag 2 years ago

That is set in configuration.

Look for System Defaults and Default Encoding of CSV Files (Export). Go for UTF-8.

Thanks. The only thing I'm missing is a UTF-8 BOM option.

Without a BOM header, Excel doesn't recognize the encoding automatically.

Yes, we can use the text import wizard, but it would be nice to be able to open the file directly.
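
Until such an option exists, one workaround is to prepend the BOM to the exported file yourself. A minimal Python sketch, assuming the export is already valid UTF-8; the function name `add_utf8_bom` and the sample data are made up for illustration and are not part of the product:

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM prepended, unless one is already present."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


# Stand-in for the bytes of a CSV file exported as plain UTF-8 (hypothetical data).
exported = "Név;Város\nÁdám;Győr\n".encode("utf-8")
with_bom = add_utf8_bom(exported)
```

The check at the top makes the function safe to run twice on the same file, so a file that already carries the BOM is left unchanged.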


I found an interesting article on the subject. The author says two things:

  1. The biggest problem is not CSV itself, but that the primary tool used to interact with it is Excel. Excel handles CSV encodings badly.
  2. If one attempts to open a CSV file encoded as UTF-8 without a Byte Order Mark (BOM) as recommended, any non-ASCII characters are again scrambled. [...] If we try it again with a UTF-8 BOM prepended to the file and Excel will read it. This is deceptive because once saved the text will remain correctly encoded UTF-8, but bizarrely the BOM will be stripped causing the file to no longer be correctly readable.

So the question is: should we adjust to Excel or to the general recommendations? :)

I'm guessing that a lot of people will open the CSV files in Excel directly.

If the encoding is UTF-8 with BOM, the content is displayed correctly and you can then save it as an Excel workbook (Excel cannot save to Unicode CSV).


My vote would be to adjust to Excel, since I would think that the majority of users will open CSV files in Excel.

I would vote for adapting it to Excel as well. Is this perhaps already being implemented?


Would it be feasible to have both UTF-8 and UTF-8 with BOM as export options?
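
For illustration, offering both variants can come down to a flag that picks the codec. This is a sketch in Python, not the product's actual export code; `export_csv` and its `bom` parameter are hypothetical names. It relies on Python's `utf-8-sig` codec, which writes the BOM when encoding, while plain `utf-8` does not:

```python
import csv
import io


def export_csv(rows, bom: bool = False) -> bytes:
    """Serialize rows to CSV bytes, as UTF-8 with or without a leading BOM."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # "utf-8-sig" prepends the BOM on encode; "utf-8" leaves it out.
    return buffer.getvalue().encode("utf-8-sig" if bom else "utf-8")


rows = [["Ádám", "Győr"]]
plain = export_csv(rows)               # for tools that follow the general recommendation
for_excel = export_csv(rows, bom=True)  # for opening directly in Excel
```

Apart from the three BOM bytes at the start, the two outputs are identical, so both options can share one export path.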