Saving files with different encodings & line endings

Now, this feature could be described as more of a text editor feature than a project
feature, and it’s not owned by my team, but it’s relevant & useful enough that
I think it’s worth mentioning here.

With VS you can save a text file with an arbitrary language encoding (ANSI, Shift-JIS,
Unicode, UTF-8, etc.) and arbitrary line endings (CR/LF, CR only, etc.).

If you’re using VS 2003 (I think these steps are identical for VS 2002),
then open up any scratch text file (e.g. .vb, .cs, .cpp, .txt, whatever).  Then
go to the File menu and pick Save <file> As (or hit alt-F followed by A) to
bring up the Save As dialog.

As with some of the prior tips I’ve mentioned, the magic lies in the obscure little
down-arrow at the right edge of the “Save” button.  Click on this little down-arrow
and click “Save with encoding”.  If you didn’t change the filename you’ll be asked
to confirm that you want to overwrite the current file.  Just say “yes”, assuming
this isn’t a critical file that you don’t want to churn; otherwise choose a different
filename and you won’t get that confirmation msgbox.

You will then reach the “Advanced Save Options” dialog.  You can now tweak the
save operation to change your encoding or your line endings.

Line endings are a simpler issue so let’s start there.  The dropdown lists four
options.  “Current Setting” means “leave your line endings alone”.  The
other three give you a choice between CR/LF (the Windows convention), CR (classic
Mac OS), and LF (Unix) as your newline sequence.  These are useful if you want to
share out your files with someone on a Mac(TM) or Unix(TM) system.
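
If you ever need the same conversion outside the editor, here is a minimal Python sketch of the idea.  This is only my illustration, not how VS does it internally: the function name convert_line_endings is made up for the example, and it assumes a byte-oriented, ASCII-compatible encoding (so not UTF-16).

```python
def convert_line_endings(data: bytes, newline: bytes) -> bytes:
    """Rewrite every line ending in `data` to `newline`.

    `newline` is expected to be b"\r\n" (CR/LF), b"\r" (CR) or b"\n" (LF).
    Assumes an ASCII-compatible, byte-oriented encoding (not UTF-16).
    """
    # Normalize to LF first. CR/LF pairs must be handled before lone CRs,
    # otherwise each CR/LF pair would become two line breaks.
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # Then expand each LF to the requested sequence.
    return normalized.replace(b"\n", newline)


mixed = b"one\r\ntwo\rthree\n"
print(convert_line_endings(mixed, b"\n"))    # b'one\ntwo\nthree\n'
print(convert_line_endings(mixed, b"\r\n"))  # b'one\r\ntwo\r\nthree\r\n'
```

Working on bytes rather than text keeps the sketch independent of the file’s encoding, which is the other half of this dialog.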

Encoding is a thornier issue, and in the dropdown you will see a vast number of choices. 
The first thing I will say is that if you don’t know what you’re doing, don’t change
your encoding on important files because you could permanently lose data.

If you do know what you’re doing, here you will find lots of useful encodings
for your file.  Lots of MBCS encodings, many Unicode variants (UTF-16, UTF-8,
UTF-7, variants for big-endian, no signature, etc.), and lots of OEM and other encodings
whose subtleties I don’t claim to really understand.  Changing your encoding
lets you store international characters that you might not otherwise be able to store
in your file, but be careful not to switch to an encoding that the tools & editors
you want to use with the file can’t understand.  Also understand that if you
have characters in an encoding like Unicode, you could lose data
if you save to another encoding which cannot express all the same characters,
and switching back to the original encoding afterwards will not restore them.
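
To make the data-loss risk concrete, here is a small Python sketch.  It has nothing to do with VS itself; the helper names round_trips and save_lossy and the sample string are invented for the illustration, and cp1252 stands in for a typical “ANSI” code page.

```python
import codecs


def round_trips(text: str, codec: str) -> bool:
    """Return True if `text` survives being encoded as `codec` and decoded back."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        # The codec has no representation for at least one character.
        return False


def save_lossy(text: str, codec: str) -> str:
    """Simulate a forced save: unencodable characters become '?'."""
    return text.encode(codec, errors="replace").decode(codec)


# Hypothetical sample: "cafe" with an accented e, a space, three Japanese characters.
sample = "caf\u00e9 \u65e5\u672c\u8a9e"

for codec in ("utf-8", "utf-16", "cp1252"):
    print(codec, round_trips(sample, codec))

# The Japanese characters are gone for good; only the accented e survives.
print(ascii(save_lossy(sample, "cp1252")))  # 'caf\xe9 ???'

# "With signature" means the file starts with a byte order mark (BOM).
print(sample.encode("utf-8-sig").startswith(codecs.BOM_UTF8))  # True
print(sample.encode("utf-8").startswith(codecs.BOM_UTF8))      # False
```

The Unicode encodings round-trip the sample and cp1252 does not, which is exactly the situation where a careless “Save with encoding” eats your characters.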

A fairly useful page on MSDN about language encoding issues is here.
This page has several useful links at the bottom if you want to learn more.

That’s all for now. -Chris