スパムフィルターと外国のスパム、第5部 (Spam Filters and Foreign Spam, Part 5) - Japanese

Japanese is a complex language.  It's so complex that I thought I'd dedicate an entire post to a very brief overview of it.  If you speak Japanese and something I say is incorrect, it's because I don't speak the language.  You may correct me if you so choose, and I shall reply "Ah so desu ka."

The Japanese language is written with a combination of three different types of scripts: modified Chinese characters called kanji (漢字), and two syllabic scripts derived from Chinese characters, hiragana (平仮名) and katakana (片仮名).  Japanese written in the Roman alphabet is called romaji.

Kanji

Whereas western languages have an alphabet, and other languages such as Russian have a different alphabet but at least preserve the concept of one, Japanese uses characters.  These characters can represent ideas which have meaning in and of themselves.  Rather than rearranging letters into thousands and thousands of permutations to represent thoughts, character-based languages use thousands of characters, and combinations thereof, to give meaning.

Chinese characters first came to Japan on articles imported from China. These characters, called kanji, were borrowed around the 6th century.  Japanese also has phonetic symbols, each of which stands for a whole syllable, such as ka, ho, or mi; a set of such symbols is called a syllabary. There are two complete syllabaries, that is, two ways of writing the same set of sounds, and together they are called kana. (Separately, they are called hiragana and katakana.)

Between 5,000 and 10,000 Chinese characters, or kanji, are used in written Japanese. In 1981, in an effort to make it easier to read and write Japanese, the Japanese government introduced the jōyō kanji hyō (List of Chinese Characters for General Use), which includes 1,945 regular characters, plus 166 special characters used only for people's names. All government documents, newspapers, textbooks and other publications for non-specialists use only these kanji. Writers of other material are free to use whatever kanji they want.

In modern Japanese, kanji are used to write parts of the language such as nouns, adjective stems and verb stems.

Katakana

The katakana syllabary was derived from abbreviated Chinese characters used by Buddhist monks to indicate the correct pronunciations of Chinese texts in the 9th century.  Katakana are characterized by short, straight strokes and angular corners, and are the simplest of the Japanese scripts.  The syllabary consists of 46 basic characters; it can be told apart from hiragana by its more angular shapes, and from kanji by how much simpler its signs are.  I think (but don't know for sure) that the title of this post is in katakana, or at least part of it is.
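A guess like that can be checked mechanically, since each script occupies its own range of Unicode code points. Here is a minimal sketch in Python. The function names are my own, and the three ranges cover only the main Hiragana, Katakana and CJK Unified Ideographs blocks, so rarer characters (half-width katakana, kanji in the extension blocks) would land in "other".

```python
from collections import Counter

# Main Unicode blocks for the three Japanese scripts (inclusive ranges).
# Assumption: only the basic blocks are checked; extension blocks and
# half-width forms are deliberately ignored in this sketch.
SCRIPT_RANGES = [
    ("hiragana", 0x3040, 0x309F),
    ("katakana", 0x30A0, 0x30FF),
    ("kanji", 0x4E00, 0x9FFF),
]


def classify_char(ch):
    """Return the script name for a single character, or 'other'."""
    code_point = ord(ch)
    for name, low, high in SCRIPT_RANGES:
        if low <= code_point <= high:
            return name
    return "other"


def script_counts(text):
    """Count how many characters of each script appear in text."""
    return Counter(classify_char(ch) for ch in text)


title = "スパムフィルターと外国のスパム、第5部"
counts = script_counts(title)
for script in ("katakana", "hiragana", "kanji", "other"):
    print(script, counts[script])
```

Run against the Japanese part of the title, this counts 11 katakana characters (スパムフィルター and スパム, the borrowed words for "spam filter" and "spam"), 2 hiragana, 4 kanji, and 2 others (the comma and the digit 5). So part of the title is indeed in katakana.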

In modern Japanese, katakana are most often used for transcription of words from foreign languages.  Many of the foreign words are English, and can be recognized by sounding out the katakana. For instance, there is biiru (beer), and aisukuriimu (ice cream).  There are also words from other foreign languages.  For example, abekku means "couple", coming from the French avec (with). Arubaito o suru means "to work one's way through school." This comes from the German verb arbeiten, to work.
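To make "sounding out the katakana" concrete, here is a small sketch that turns katakana into romaji one symbol at a time. The lookup table is a tiny hand-picked subset that I chose to cover just the example words above, not a complete syllabary, and the function name is my own. Two special symbols are handled: the long-vowel mark ー, which repeats the previous vowel, and the small ッ, which doubles the consonant that follows.

```python
# Hypothetical, deliberately incomplete table: only the katakana needed
# for the example words (biiru, aisukuriimu, abekku, arubaito).
KATAKANA_TO_ROMAJI = {
    "ア": "a", "イ": "i", "ス": "su", "ク": "ku", "リ": "ri", "ム": "mu",
    "ビ": "bi", "ル": "ru", "ベ": "be", "バ": "ba", "ト": "to",
}


def katakana_to_romaji(word):
    """Sound out a katakana word syllable by syllable."""
    syllables = []
    double_next = False
    for ch in word:
        if ch == "ー":
            # Long-vowel mark: repeat the vowel of the previous syllable.
            if syllables:
                syllables.append(syllables[-1][-1])
            continue
        if ch == "ッ":
            # Small tsu: the next syllable's consonant is doubled.
            double_next = True
            continue
        syllable = KATAKANA_TO_ROMAJI[ch]  # KeyError for kana outside the table
        if double_next:
            syllable = syllable[0] + syllable
            double_next = False
        syllables.append(syllable)
    return "".join(syllables)


for word in ("ビール", "アイスクリーム", "アベック", "アルバイト"):
    print(word, katakana_to_romaji(word))
```

The four words come out as biiru, aisukuriimu, abekku and arubaito, matching the romaji spellings used in this post.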

Katakana has come to be used for very common Japanese words (not foreign), especially when the kanji is difficult. The idea is that the word is so well-known that the kanji is not needed to distinguish the meaning. For example, sushi is sometimes written in katakana.

Hiragana

The hiragana syllabary consists of 46 symbols and is mainly used to write word endings, known as okurigana in Japanese. Hiragana are also widely used in materials for children, textbooks, animation and comic books; to write Japanese words which are not normally written with kanji, such as adverbs and some nouns and adjectives; and for words whose kanji are obscure or obsolete.

Hiragana characters are often written next to unusual kanji characters to show their pronunciation, much as romaji spellings are given alongside Japanese words in this post. In this case the hiragana characters are referred to as furigana or yomigana. Hiragana is also used to write native Japanese words that have no kanji of their own.

Hiragana syllables developed from Chinese characters. Hiragana were originally called onnade or 'women's hand', as they were used mainly by women; men wrote in kanji and katakana. By the 10th century, hiragana were used by everybody. The word hiragana means "ordinary syllabic script".

In early versions of hiragana there were often many different characters to represent the same syllable. The system was eventually simplified so that there was a one-to-one relationship between spoken and written syllables. The present orthography of hiragana was codified by the Japanese government in 1946.


Japanese really is an interesting language, and by interesting I mean complicated.  Given how many symbols there are in the language and the multiple ways of expressing thoughts, it's somewhat surprising that spammers haven't exploited it more than they already have.