Now for Something a Little Different: Translating TechNet Wiki Articles into Spanish

I've read quite a few posts on the Wiki Ninjas blog about translating articles from English into other languages, mainly Italian and Brazilian Portuguese. I checked Wiki: Non-English Language Title Guidelines, and there didn't seem to be many Spanish translations of Entity Framework articles (despite the copious translations by Ed Price on other subjects). Since I speak Spanish pretty well, having gone to secondary school in Chile for three years, I decided to try translating the Entity Framework FAQs into Spanish. This is the story of how that journey went.

The first step is to use the Translation Widget on the English wiki page to create an automated translation. Next, you hand-correct it, since the translation quality varies widely. I was initially unclear on the exact mechanics of making this work, but once I figured it out (I may edit the wiki article on how to use the widget), it went smoothly. Really, it was no different from learning any other new technology.

One bit of advice I was given was to write simple, uncomplicated sentences. Since I was translating an existing article, this wasn't an option. I could have rewritten the English version, but you don't know in advance what is going to cause problems, so you are likely to do more work than is needed.

Technical terms are one problem in translation. For example, a common term in the Entity Framework area is "Plain Old CLR Objects", and you can just imagine what the Widget did with this: colloquial terms like "plain old" are hard to translate. Here at Microsoft we have a centralized glossary of technical terms and their translations into a variety of languages. Having this available both sped up my manual translating significantly and ensured consistency. If you don't work at Microsoft and need to do a significant amount of translating, I advise compiling such a glossary.
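If you do compile your own glossary, even a tiny script can help you check a hand-corrected translation against it. What follows is only a rough sketch of that idea, not a description of any Microsoft tool: the glossary entries are illustrative (borrowed from an FAQ sentence discussed in this post), the function name is made up, and the "bad" translation is invented purely to show a mismatch.

```python
import re

# Hypothetical glossary: English term -> preferred Spanish term.
# These entries are illustrative only, not taken from any official glossary.
GLOSSARY = {
    "stored procedure": "procedimiento almacenado",
    "entities": "entidades",
}


def missing_glossary_terms(source, translation, glossary=GLOSSARY):
    """Return the English terms found in the source whose preferred
    Spanish term does not appear anywhere in the translation."""
    missing = []
    for english, spanish in glossary.items():
        # Whole-word, case-insensitive match on the English side.
        pattern = r"\b" + re.escape(english) + r"\b"
        in_source = re.search(pattern, source, re.IGNORECASE) is not None
        in_translation = spanish.lower() in translation.lower()
        if in_source and not in_translation:
            missing.append(english)
    return missing


source = ("For more details, and a full example of reading entities "
          "from a stored procedure, see:")
consistent = ("Para obtener más detalles y un ejemplo completo de entidades "
              "de lectura desde un procedimiento almacenado, consulte:")
# Invented sentence that ignores the glossary, for demonstration only.
inconsistent = "Para obtener más detalles, consulte el procedimiento guardado:"

print(missing_glossary_terms(source, consistent))
print(missing_glossary_terms(source, inconsistent))
```

A check like this only tells you whether the agreed terms were used; it says nothing about grammar or meaning, so the hand-correction pass is still needed.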

Another problem is slang. For example, "good to go" gets translated literally, word for word. As you likely know, slang is specific to a country or region. When I was a high school student in Chile, my friends and I produced some hilarious results in both languages because we didn't realize this at first.

The bottom line is that the Widget saves you a whole lot of time. Despite the occasional poor quality of translation, it is much faster than doing it entirely by hand. And if you have a technical glossary available, things go even faster.

Examples of Translation Problems (Warning: non-English appearing soon)

How good is the automated translation? Here are a few examples of problematic results.

A general problem seems to be the amount of context that the Widget takes into account. It appears to have a narrow window of just a few words surrounding any given word, and so sometimes loses the overall syntax. At times it looks like the Widget has done a word-for-word translation. But often it does a good, or at least adequate, job.

Another problem is ambiguity. Here is a sentence from one of the FAQs:

For more details, and a full example of reading entities from a stored procedure, see:

The word "reading" is clearly a verb, but the translator interprets it as an adjective modifying "entities". So it evidently thinks that there are entities, and then there are special "reading entities". Here's the translation it produces:

Para obtener más detalles y un ejemplo completo de entidades de lectura desde un procedimiento almacenado, consulte:

The sentence also sounds slightly off, so I corrected the translation to this:

Para obtener información más detallada y un ejemplo completo que explica cómo leer entidades usando un procedimiento almacenado, consulte:

And then, every so often, the widget seems to behave almost randomly. Here's a sentence where I can't really deduce what it did.

This is something we considered but in the end rejected

which turns into

Esto es algo que considera pero al final rechazada

For some reason the widget forgot how to conjugate verbs properly, and since the sentence isn't that complex, I can't really guess why.

For some more examples, use the widget on this page to translate this posting into a language you know. That should provide you with a good feel for the quality of what gets produced.