Monday, August 03, 2009

Fixing Bad Text Encoding

I accidentally imported a UTF-8 text file into Excel as an ANSI file, then did a lot of work on the Excel file. On closer inspection, a lot of characters were messed up. For example, each curly apostrophe (’) had become the three-character run â€™, and a lot of foreign characters were wrong. But I didn't want to re-import the file and redo the work. So, here's what I did:

1) I saved the Excel file as a text file. (Save As, Other Formats, *.csv) This saved the file as a UTF-8 text file.
2) I opened the CSV in Notepad, then saved it as an "ANSI" text file. (File, Save As, select ANSI in the Encoding combo box.) This wrote each character as a single byte in the Windows ANSI code page (typically Windows-1252 on Western systems).
3) I then opened that text file in Notepad, but this time as UTF-8. (File, Open, select UTF-8 in the Encoding combo box.) This read the ANSI file's bytes as if they were UTF-8, which converted the strange two- and three-character runs back into single characters. I saved this file as UTF-8 with a .csv extension, then opened it in Excel.
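
For anyone who would rather script it, the same round trip can be sketched in a few lines of Python. This is only a rough sketch, not what I actually did: it assumes the "ANSI" code page is Windows-1252 (cp1252 in Python), and the function name fix_mojibake is made up for illustration.

```python
def fix_mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Undo UTF-8 text that was mistakenly decoded as `wrong_encoding`.

    Assumption: the bad import used Windows-1252. Encoding back to that
    code page recovers the original bytes (like saving as ANSI), and
    decoding those bytes as UTF-8 restores the intended characters
    (like reopening the file as UTF-8).
    """
    return text.encode(wrong_encoding).decode("utf-8")


# Reproduce the damage: UTF-8 bytes read as Windows-1252.
original = "don\u2019t"  # don’t, with a curly apostrophe
garbled = original.encode("utf-8").decode("cp1252")  # donâ€™t

# Reverse it.
repaired = fix_mojibake(garbled)
print(repaired == original)
```

The encode step raises UnicodeEncodeError if the text contains a character that Windows-1252 cannot represent, which is a useful warning that the text was not garbled in the way assumed here.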

Whew! I really didn't want to redo all that work.
