Command line utilities are the worst


Try this:
Open a command prompt window
Type “netsh” and press enter
Type “help” and press enter
Play around a little bit; jump around to different contexts, check syntax for a couple of commands
Appreciate the amount of power netsh gives you.
Appreciate how much time I’ve spent translating all those syntax messages to Swedish.

As a user, I think it’s great that we keep making it easier to use Windows from the command prompt. As a localizer, I’m not as thrilled… localizing command line utilities isn’t all that straightforward.

One thing to note is that these messages are really two-dimensional objects stored in one single string. This is how one single localizable resource is stored:

\nUsage: %1!s! [name=]<string>\n             [[routing=](enabled|disabled|default)]\n\nParameters:\n\n       Tag              Value\n       name           - Interface name.\n       routing        - Whether to act as a router.\n\nRemarks: Sets 6to4 interface configuration information.\n\nExamples:\n\n       %1!s! "Private" enabled\n\n\n

And this is what it really represents:

Usage: %1!s! [name=]<string>
             [[routing=](enabled|disabled|default)]

Parameters:

       Tag              Value
       name           - Interface name.
       routing        - Whether to act as a router.

Remarks: Sets 6to4 interface configuration information.

Examples:

       %1!s! "Private" enabled

(This is one of the shorter examples. Some are several thousand characters long.)
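
To make the one-string-versus-rows relationship concrete, here is a minimal Python sketch of what an editor has to do before it can show the message properly. It assumes the resource stores each line break as the two literal characters backslash and n; the `to_rows` helper and the `RAW` sample (an excerpt of the string above) are made up for illustration.

```python
# Excerpt of the resource above, with line breaks stored as a literal "\n".
RAW = r'\nParameters:\n\n       Tag              Value\n       name           - Interface name.\n'


def to_rows(resource: str) -> list[str]:
    """Split the one-dimensional resource into the rows a user would see.

    Assumption: every line break is the two-character escape backslash + n.
    """
    return resource.replace("\\n", "\n").split("\n")


if __name__ == "__main__":
    # Print each row with its number so widths and alignment are easy to eyeball.
    for number, row in enumerate(to_rows(RAW), start=1):
        print(f"{number:2}| {row}")
```

Once the message is a list of rows, the per-row checks (width, alignment and so on) become simple loops.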

The markup in the string doesn’t cause me many problems – I have an editor that can show the string as it’s meant to be seen. Still, there are a good few things I need to keep in mind.

1) Row length. One of the constraints we place on ourselves is that no row in a syntax message should be wider than 80 characters (we aim for 78). Since the default width of cmd.exe is 80 characters, a row that’s too long gets wrapped and looks ugly. Because of this constraint, we may need to insert extra rows. It’s typically easy to follow this rule, and it’s fairly straightforward to write a script that scans my translations and reports any violations. Except, of course, when the string contains a placeholder: there are syntax messages with examples that will expand beyond 80 characters at runtime. It’s therefore pretty handy that the whole message is one single resource, as I can add and remove rows as needed.
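
A scanning script along those lines could look roughly like the sketch below. It assumes the message has already been unescaped into real line breaks, and that placeholders look like `%1!s!`; the `long_rows` function and its `placeholder_width` parameter are names I made up. Since the real expansion length is only known at runtime, `placeholder_width` is just a guess you feed in.

```python
import re

MAX_WIDTH = 78  # the width we aim for; 80 is where cmd.exe wraps by default

# Matches inserts such as %1!s! or a bare %2.
PLACEHOLDER = re.compile(r"%\d+(?:![^!]*!)?")


def long_rows(message: str, max_width: int = MAX_WIDTH, placeholder_width: int = 0):
    """Yield (row_number, width, row) for every row wider than max_width.

    placeholder_width is an assumed length for each expanded placeholder;
    0 simply ignores placeholders when measuring.
    """
    for number, row in enumerate(message.split("\n"), start=1):
        # Swap each placeholder for filler of the assumed expanded length.
        width = len(PLACEHOLDER.sub("x" * placeholder_width, row))
        if width > max_width:
            yield number, width, row


if __name__ == "__main__":
    sample = 'Examples:\n\n       %1!s! "Private" enabled'
    for number, width, row in long_rows(sample, placeholder_width=70):
        print(f"row {number} is {width} characters wide: {row}")
```

Note that `len` counts characters, not screen columns, so this is only reliable for languages where the two are the same.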

2) Alignment. Above you have a nice table with two columns. We need to keep it looking nice. This isn’t always easy. If the first column expands in width, the second (which usually contains more text) needs to get narrower, and this can look ugly too. Also, one needs to make sure not to mix spaces and tabs when aligning, as the end result might surprise you. I don’t have a tool that can spot misalignments, but it’s been (way down) on my to-do list for a while.
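
A first stab at such a tool might be as simple as the following sketch. It rests on a heuristic that is purely my assumption: a table row is an indented row containing " - ", and whichever column most table rows put that separator in is the intended one. The `alignment_problems` name is made up, and a message with several differently laid out tables would need something smarter.

```python
from collections import Counter


def alignment_problems(message: str, separator: str = " - ") -> list[tuple[int, str]]:
    """Return (row_number, description) for rows that look misaligned.

    Heuristic (an assumption, not a rule from any tool): a table row is an
    indented row containing the separator, and the separator column used by
    most table rows is the intended one.
    """
    problems = []
    separator_columns = {}
    for number, row in enumerate(message.split("\n"), start=1):
        if "\t" in row:
            problems.append((number, "contains a tab character"))
        column = row.find(separator)
        if row.startswith(" ") and column != -1:
            separator_columns[number] = column
    if separator_columns:
        # The most common separator column is treated as the correct one.
        expected = Counter(separator_columns.values()).most_common(1)[0][0]
        for number, column in separator_columns.items():
            if column != expected:
                problems.append(
                    (number, f"separator at column {column}, expected {expected}")
                )
    return sorted(problems)
```

Flagging every tab is deliberately blunt: it is easier to ban tabs in these strings altogether than to reason about how they expand.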

3) What do I really translate? Looking at the first two rows, only the words “Usage” and “string” should be translated. Anything else needs to be left untouched since the actual options are hard coded. Over-translation here doesn’t have functional implications, but it can confuse the user. Now, this is a pretty straightforward example, but there are cases when it’s not at all clear what parts I shouldn’t touch.
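
For the clear-cut cases, a script can at least catch accidental over-translation by comparing the "hands off" tokens in the source and the translation. The sketch below assumes three kinds of protected tokens: placeholders like `%1!s!`, option names followed by `=`, and choice lists like `(a|b|c)`. Those patterns, the `changed_tokens` helper and the Swedish sample are all my own illustration, and the unclear cases mentioned above would still need a human.

```python
import re
from collections import Counter

# Assumed patterns for things a translator must not touch:
# placeholders (%1!s!), option names (name=), and choice lists ((a|b|c)).
PROTECTED = re.compile(r"%\d+(?:![^!]*!)?|\w+=|\(\w+(?:\|\w+)+\)")


def protected_tokens(message: str) -> Counter:
    """Count every protected token found in the message."""
    return Counter(PROTECTED.findall(message))


def changed_tokens(source: str, translation: str) -> dict[str, list[str]]:
    """Report protected tokens that disappeared from or appeared in the translation."""
    src, dst = protected_tokens(source), protected_tokens(translation)
    return {
        "missing": sorted((src - dst).elements()),
        "unexpected": sorted((dst - src).elements()),
    }


if __name__ == "__main__":
    source = "Usage: %1!s! [name=]\n             [[routing=](enabled|disabled|default)]"
    # Made-up over-translated row: the option name must stay in English.
    translation = "Syntax: %1!s! [namn=]\n             [[routing=](enabled|disabled|default)]"
    print(changed_tokens(source, translation))
```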

4) Recycling & consistency. These messages aren’t recycling-friendly. Since aligning strings here means breaking up sentences over several rows, it’s very hard to recycle part of a string. You often end up re-translating the same thing over and over again, and this leads to inconsistent translations. Not the end of the world, but it sure is a tedious way to waste time. Also on consistency, make sure that the items in the Tag column match whatever comes after “Usage” at the start of the string and the examples at the end.
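
The Tag column check, at least, is easy to script. The sketch below compares the option names in the usage rows with the first column of the table. It assumes options appear as `[name=]` and that table rows are indented and use " - " between the columns; the function names are made up, and the examples at the end of the message are not checked.

```python
import re


def usage_options(message: str) -> set[str]:
    """Option names that appear as [name=] in the usage rows."""
    return set(re.findall(r"\[(\w+)=\]", message))


def table_tags(message: str) -> set[str]:
    """First-column entries of indented table rows such as '   name   - text'."""
    return set(re.findall(r"^[ \t]+(\w+)[ \t]+- ", message, flags=re.MULTILINE))


def tag_mismatches(message: str) -> dict[str, list[str]]:
    """Names that show up in the usage rows or the Tag column, but not both."""
    options, tags = usage_options(message), table_tags(message)
    return {
        "only_in_usage": sorted(options - tags),
        "only_in_table": sorted(tags - options),
    }
```

Two empty lists mean the usage rows and the Tag column agree; anything else points at a tag that was translated or mistyped in one place only.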

5) Spell checking. Spell checking this junk is hard — a lot of elements are left in English, sentences are broken up over rows… not fun.

We struggle with all of these issues for every release. I’ve gotta try and do something about that…

Btw, one good thing about these tools is that they can be automated. So, it’s entirely possible to write a script that calls all kinds of command line utilities, analyzes the output and finds problems.
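
As a rough idea of what that could look like, here is a small Python sketch that runs a command, captures what it prints and flags rows that are too wide. The helper names are made up, the width limit is the 78 characters mentioned earlier, and the exact command line to pass in (for example something that prints a netsh syntax message) is left to you since it depends on the utility.

```python
import subprocess
from typing import Optional


def run_and_capture(command: list[str], timeout: float = 30.0,
                    encoding: Optional[str] = None) -> list[str]:
    """Run a command line utility and return its standard output as rows.

    encoding=None uses the locale default; console tools may need an
    explicit code page here (an assumption to verify on your system).
    """
    result = subprocess.run(
        command, capture_output=True, text=True,
        encoding=encoding, timeout=timeout,
    )
    return result.stdout.splitlines()


def too_wide(rows: list[str], limit: int = 78) -> list[tuple[int, int]]:
    """Return (row_number, width) for every captured row wider than limit."""
    return [(n, len(row)) for n, row in enumerate(rows, start=1) if len(row) > limit]
```

The nice part of checking real output is that placeholders have already been expanded, so the widths are the ones the user actually sees.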

Comments (6)

  1. Would there be any chance of using something like the man pages markup?

    For example, one section of the manpage for Lynx looks like this:

    <pre>

    -cmd_script=FILENAME
    read keystroke commands from the specified file. You can use
    the data written using the -cmd_log option. Lynx will ignore
    other information which the command-logging may have written to
    the logfile. Each line of the command script contains either a
    comment beginning with "#", or a keyword:
    exit causes the script to stop, and forces lynx to exit
    immediately.
    key the character value, in printable form. Cursor and other
    special keys are given as names, e.g., "Down Arrow".
    Printable 7-bit ASCII codes are given as-is, and hexadecimal
    values represent other 8-bit codes.

    </pre>

    Looking at the raw manpage, it’s marked up like this:

    <pre>

    .B -cmd_script\fR=\fIFILENAME
    read keystroke commands from the specified file.
    You can use the data written using the \fB-cmd_log\fR option.
    Lynx will ignore other information which the command-logging may have
    written to the logfile.
    Each line of the command script contains either a comment beginning with "#",
    or a keyword:
    .RS 5
    .TP 5
    exit
    causes the script to stop, and forces lynx to exit immediately.
    .TP 5
    key
    the character value, in printable form.
    Cursor and other special keys are given as names, e.g., "Down Arrow".
    Printable 7-bit ASCII codes are given as-is,
    and hexadecimal values represent other 8-bit codes.
    .TP 5
    set
    followed by a "name=value" allows one to override values set in the
    lynx.cfg file.

    </pre>

    In other words, you wouldn’t need to consider the format when translating the stuff, you’d have a program to do that for you *after* it’s been translated.

  2. I see what you’re saying. That could probably solve some of the immediate problems, but you know what I’d really like to see, though? The content being pulled from the help files instead and formatted at runtime. I mean, this is basically UA, and Help and Support Center already contains more detailed information for most of the contexts. Or if that’s not possible, why not let the netsh helper DLLs contain only this:

    Usage: %1!s! [name=]
                 [[routing=](enabled|disabled|default)]

    Please refer to Help and Support for more information

    That would sure solve the problem for me anyway 🙂

    There are some benefits with the way things are right now though, like: the resources are normal Win32 resources and we’re good at handling those; since the syntax messages and the functionality are bundled together, there’s less risk of them getting out of sync; it keeps me busy 🙂

  3. I really, really hate it when a man page (or help text) is only a reference to another document, which usually can’t be seen in the same application as the one I’ve invoked the help command in. I do understand that it makes things easier for the people who make the documentation, but I still find it rude and it’s enough to make me choose another application/vendor where possible.

    It seems to me that the problem is that the help content is too intimately linked to the formatting. It shouldn’t be. If the help texts were created with any decent markup language, XML or whatever, the text would be completely independent of the formatting. You could then create different formats of the same content without affecting the content itself.

  4. Can’t say I disagree. That’s why I’d like to see the content being pulled from the help files instead.

    Then again, changing this in netshell wouldn’t help anyone but me, really. And there’s probably some value in me having full control over the formatting. So I will probably not see a change in these files.

    That said, if anyone is planning to include lengthy syntax messages in binaries the same way it’s done in netshell, please think about this for a while. It can really hurt your localizers.
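
To illustrate the idea from the comments of keeping the content separate from the formatting, here is a minimal Python sketch that builds the parameter table at runtime from plain (tag, description) pairs. It is not how netsh works; `format_parameters`, the 78-character width and the 7-space indent are assumptions borrowed from the example message in the post.

```python
import textwrap


def format_parameters(params: list[tuple[str, str]], width: int = 78,
                      indent: int = 7) -> str:
    """Lay out (tag, description) pairs as an aligned two-column table.

    The translator only ever sees the descriptions; column widths and
    row breaks are computed here, after translation.
    """
    if not params:
        return ""
    # First column is as wide as the longest tag plus one space.
    tag_width = max(len(tag) for tag, _ in params) + 1
    lines = []
    for tag, description in params:
        first = " " * indent + tag.ljust(tag_width) + "- "
        # Wrap the description into whatever room is left on the row.
        wrapped = textwrap.wrap(description, width=width - len(first)) or [""]
        lines.append(first + wrapped[0])
        lines.extend(" " * len(first) + rest for rest in wrapped[1:])
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_parameters([
        ("name", "Interface name."),
        ("routing", "Whether to act as a router."),
    ]))
```

With something like this, a longer translated description simply wraps onto an extra row, and the row length and alignment rules take care of themselves.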