QuickLook chokes on BOM-less text files

Filed under Apple, Software on November 10th, 2007

Do you know the BOM? Short for Byte Order Mark, it refers to a few bytes at the beginning of a Unicode text file, used to denote the endianness (byte order) of the file. In a UTF-8 file, where byte order is not an issue, it merely marks the file as UTF-8. The BOM is (supposed to be) invisible when opening the file for editing.
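For the curious, here is a rough Python sketch of how one might check whether a file starts with a BOM, and strip the UTF-8 one. The function names `detect_bom` and `strip_utf8_bom` are my own invention for illustration; the byte sequences come from Python's standard `codecs` module.

```python
import codecs

# Byte sequences the BOM (U+FEFF) takes in each encoding.
# UTF-32 is checked before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Reading only the first four bytes of a file and passing them to `detect_bom` is enough for the check.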

It is, in my opinion, an abomination in the eyes of God and a plague upon decent, hard-working people. It messes with your files in subtle ways. It makes them unusable with certain programs. Is it any wonder that the only program I know of that insists on saving UTF-8 files with the BOM is Windows Notepad?

After installing Leopard, I noticed that QuickLook chokes on several foreign-language text files I have, displaying garbage. When I open them with TextWrangler they look fine. What gives? It seems that if you save them without the BOM (like a decent person would), QuickLook either assumes that the BOM is still there (effectively truncating the first few bytes) or picks the wrong endianness (which would be silly, since the files were created, edited and opened on the same Intel OS X machine). [Update: actually it does something equally silly: it assumes the file is in Mac OS Roman.]
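You can reproduce the kind of garbage this produces without QuickLook. The snippet below is only a simulation of the misreading described in the update, not QuickLook's actual code: it encodes text as BOM-less UTF-8 and then decodes the same bytes with Python's built-in `mac_roman` codec. The helper name `misread_as_mac_roman` is made up for this example.

```python
def misread_as_mac_roman(text: str) -> str:
    """Encode text as BOM-less UTF-8, then decode those bytes as Mac OS Roman."""
    raw = text.encode("utf-8")       # what ends up on disk, no BOM
    return raw.decode("mac_roman")   # what a Mac OS Roman reader would show


if __name__ == "__main__":
    for sample in ["café", "שלום"]:
        print(repr(sample), "->", repr(misread_as_mac_roman(sample)))
```

Every byte of a multi-byte UTF-8 character is shown as a separate Mac OS Roman character, so the garbled text comes out longer than the original.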

Of course, none of this applies to Latin-only (plain ASCII) UTF-8 files.
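The reason is that ASCII characters are encoded as the same single bytes in UTF-8 and in Mac OS Roman, so a wrong guess between the two changes nothing. A small sketch for checking whether a given text is safe in this sense (the function name `survives_mac_roman_misread` is hypothetical, chosen for illustration):

```python
def survives_mac_roman_misread(text: str) -> bool:
    """True if the UTF-8 bytes of text read back unchanged as Mac OS Roman."""
    raw = text.encode("utf-8")
    return raw.decode("mac_roman") == text
```

For plain ASCII input this returns `True`; any accented or non-Latin character makes it return `False`.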
