E-mail Order Brideski

Thanks to the Exchange Team, 90-95% of spam never sees the light of my Inbox. The few per day that make it through (crafty Viagra ads, begging sons of wealthy African royalty, and ambitious Russian college grads) get little more than a glance before I hit the Del button.

Today, an email from “Tayana” caught my eye. I’m not in the market for a mail-order Russian bride, but I am a word hound and a Scrabble fiend. This email cracked me up. Is this what happens when you ask a Russian script kiddie to implement an IDictionary interface…and leave it at that?

“Hello my hope!
I am not sure you get this message but if you got I want you to know that I want to travel to your country to work in two weeks and I just want to meet right man. I live in Russia and my goal is to leave this country because it is impossible to live here for young pretty woman. If you have not wife or girlfriend ,maybe we could try to meet? I am  Tayana ,I am  25 years old ,please write to me directly to my mail. See you soon!!!

teethed comet blister bankruptcy enable hesitate handyman tioga droopy moldavia drosophila borax swastika downstate cricket attend leverage parallax defuse canterbury jenkins flaxseed rick icy desecrate cindy biota sophia d’oeuvre recurred thereat alderman handel depend caustic shakespeare admit brian locale deactivate rotary rowley mixup cognizant simmons decompression makeshift throwaway cold congestion behalf vacuous yelp biology elastic antwerp citric ache danubian perihelion decadent sedge dynast burlesque dulse fredrickson envoy textural arroyo pageantry species delivery nihilism degrease sculptor mirage bundy egypt pugh filipino messrs nucleus hear abreact blush cramp unchristian embroidery befuddle.

Befuddle, indeed. Why would a spammer cram all these strange words into an email body?

Comments (8)

  1. EricB says:

    My understanding of why recent spam has lines (sometimes MANY) of unrelated words at the bottom of the message is that they are there to "fool" the Bayesian filters. The theory is that the extra "good" words will outweigh the "bad" words, so the message will get through without being caught as spam.

  2. geoff.appleby says:

    EricB: Damn. That’s what I was going to say.

    Except you said it a lot more eloquently than I could have. 🙂

  3. CRathjen says:

    It’s weighing, but not quite in that sense – each person’s Bayesian filter has a set of words and scores, built up based on their identified good and spam mail. So, if Korby has a filter, terms like "Korby", "Microsoft", "net" etc. might all have strong ‘good’ scores, while terms like ‘viagra’ and ‘Nigerian princess’ would have strong ‘spam’ scores. Only the most frequent X (100 ish?) elements are included in each list (or only words appearing at least X times in the set of scored mails), to keep some confidence in the accuracy of the word being ‘good’ or ‘spammy’. Many filters don’t just use the message body, either – header elements like the sender ID field might get a good score while certain questionable domains or TLDs might get a spam score.

    This is an attempt to ‘get lucky’ and get a word or two that scored very highly, without too much risk of including words that scored low. Most of these words wouldn’t have a score at all — too uncommon — but if you do use any one of them frequently, it’s probably due to ‘good’ email rather than spam.

    Unless, of course, you get too many spam mails using the same set of words…

  4. MSDNArchive says:

    Chris, the breadth of your knowledge and interests never ceases to amaze me. 🙂
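
For the curious, the per-word scoring described in comments 1 and 3 can be sketched in a few lines of Python. This is a toy naive Bayes filter, not how Exchange or any real product works. The names `train`, `spam_probability`, `min_seen`, and the `TRAINING` messages are all made up for illustration, and the "seen at least X times" rule from comment 3 is reduced to a simple count threshold.

```python
import math
from collections import Counter

# Hypothetical training mail: (text, label) pairs the user has already sorted.
TRAINING = [
    ("cheap viagra offer", "spam"),
    ("viagra discount offer", "spam"),
    ("meet russian bride", "spam"),
    ("korby microsoft build notes", "good"),
    ("microsoft net release notes", "good"),
    ("korby scrabble night", "good"),
]


def train(messages):
    """Count, per label, how many messages contain each word."""
    word_counts = {"spam": Counter(), "good": Counter()}
    totals = Counter()
    for text, label in messages:
        totals[label] += 1
        word_counts[label].update(set(text.lower().split()))
    return word_counts, totals


def spam_probability(text, word_counts, totals, min_seen=2):
    """Combine per-word evidence in log-odds form; rare words are ignored."""
    log_odds = math.log(totals["spam"] / totals["good"])
    for word in set(text.lower().split()):
        in_spam = word_counts["spam"][word]
        in_good = word_counts["good"][word]
        if in_spam + in_good < min_seen:
            continue  # too rare to trust, so the word carries no score
        # Add-one smoothing keeps a zero count from producing log(0).
        p_spam = (in_spam + 1) / (totals["spam"] + 2)
        p_good = (in_good + 1) / (totals["good"] + 2)
        log_odds += math.log(p_spam / p_good)
    return 1 / (1 + math.exp(-log_odds))


word_counts, totals = train(TRAINING)
plain = spam_probability("cheap viagra offer", word_counts, totals)
salad = spam_probability("cheap viagra offer teethed comet blister",
                         word_counts, totals)
lucky = spam_probability("cheap viagra offer korby microsoft notes",
                         word_counts, totals)
print(f"plain={plain:.2f} salad={salad:.2f} lucky={lucky:.2f}")
```

In this toy setup, padding with words the filter has never scored leaves the result unchanged, while padding with words that happen to score as "good" for this particular user pulls the spam probability down. That is the "get lucky" gamble: most of the dictionary words do nothing, but one or two hits might be enough.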