Transliteration Utility freely downloadable

[French version here]

Two colleagues from my group (Nick Cipollone and Andrea Jessee) recently developed a tool called Transliteration Utility, which converts text from one natural language script to another (for example, Serbian Latin to Serbian Cyrillic, or Latin characters to Inuktitut). The tool uses a simple but powerful rule language, and you can also use it to create, edit, debug, and test your own transliteration modules.

It can be used in any of four ways:

   1. Typing text in one script into a field, which the tool converts on the fly;

   2. Copying and pasting text into a field, which the tool converts automatically;

   3. Giving it a whole Unicode text file to convert;

   4. Converting a list of Unicode files through its command-line interface.

A key feature of the tool is its Module Development Console, which allows anyone to author, edit, and/or test new or existing transliteration modules.
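
To give a feel for what a transliteration module does, here is a minimal Python sketch of rule-based conversion from Serbian Latin to Serbian Cyrillic. It is not the tool's own rule language, and the names RULES and transliterate are made up for this illustration. It assumes that the longest matching rule wins at each position (so the digraph "lj" becomes a single Cyrillic letter) and that any character without a rule is copied through unchanged.

```python
# Illustrative rule table (an assumption, not the module shipped with the tool).
# Keys are lowercase Serbian Latin; values are lowercase Serbian Cyrillic.
RULES = {
    "lj": "љ", "nj": "њ", "dž": "џ",
    "a": "а", "b": "б", "c": "ц", "č": "ч", "ć": "ћ", "d": "д", "đ": "ђ",
    "e": "е", "f": "ф", "g": "г", "h": "х", "i": "и", "j": "ј", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р", "s": "с",
    "š": "ш", "t": "т", "u": "у", "v": "в", "z": "з", "ž": "ж",
}


def transliterate(text, rules=RULES):
    """Convert text by applying the longest matching rule at each position."""
    longest = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "lj" beats "l" followed by "j".
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            target = rules.get(chunk.lower())
            if len(chunk) == size and target is not None:
                # Carry over capitalisation from the first source letter.
                out.append(target.upper() if chunk[0].isupper() else target)
                i += size
                break
        else:
            # No rule applies: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("Ljubljana"))
    print(transliterate("This is really a cool tool!"))
```

Because the rules work letter by letter, English text fed through such a table comes out spelled in Cyrillic rather than translated, and a letter like "y", which Serbian Latin does not use, stays as it is.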

Microsoft Transliteration Utility is freely available for public download at

It comes with nine modules ready for use (and you can create your own modules):

Bosnian Cyrillic to Latin

Bosnian Latin to Cyrillic 

Serbian Cyrillic to Latin

Serbian Latin to Cyrillic

Hangul to Romanization

Inuktitut to Romanization

Romanization to Inuktitut

Malayalam to Romanization

Romanization to Malayalam


This is really a cool tool or, to say it in Malayalam script, ഠിസ് ഇസ് രെഅല്ല്യ് ചോല്റ്റോല്‍, or in Cyrillic script: Тхис ис реаллy а цоол тоол!


Thierry Fontenelle

Microsoft Speech & Natural Language


Comments (5)

  1. Leetia Janes says:

    Very good tool indeed…

  2. Regular reader KJK:Hyperion asked in the Suggestion Box:

    …when will Transliteration Utility support…

  3. dennispg says:

    not a big deal or anything.. i just thought it was funny and that id point out.. that the output doesnt really represent what it was input from. cant speak for the cyrillic, but i imagine the same must be true for that too.

    if you sound it out it actually sounds like "this is ray ahlly a chole tole"

    the reason for this is that the module is following some standard such as ITRANS and all the letters have a standard mapping.. for example, c doesnt go to "ക" like in crow it goes to "ച" like in church.

    the word "cool" to be properly transliterated should be input as "kuul"

    anyways just thought it was funny…

    speaking of the module though.. where is that? id sure like to tweak it to produce more natural transliterations in english script than having random capital letters in the middle of words as ITRANS produces.. such a module would probably be nothing like whats used here, but itd sure be nice to have an example to work from none the less…

    what gives, why arent these modules included anywhere? are they embedded as resources in the assembly? actually.. hmm, maybe ill try there next…

  4. [English version here] Two colleagues from my group (Nick Cipollone and Andrea Jessee) have just…

  5. George says:

    Another utility that is easier to use: …/cyrillic.html