When it comes to spam filtering, it's the intangibles that make the difference.

Simply running a single spam filter and hoping that it will catch all of your spam is unrealistic.  After years of experience, I have learned that defense-in-depth is the best approach to filtering.  An IP blacklist helps, but on its own it will not give you the performance you need.  Regular expressions are useful as a content filter, but on their own they will not scale to the level you need because humans have to constantly monitor and update those regexes.
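
As a rough sketch of what layering looks like in practice, the snippet below checks the sender against a blacklist first and only falls through to content rules if that check passes.  The names `BLOCKED_IPS`, `CONTENT_RULES`, and `classify` are hypothetical, and the address and phrase are placeholders rather than real filter data.

```python
import re

# Hypothetical data: a documentation-range IP and a made-up phrase.
BLOCKED_IPS = {"192.0.2.10"}
CONTENT_RULES = [re.compile(r"cheap\s+meds", re.IGNORECASE)]

def classify(sender_ip, body):
    """Run the cheap reputation check first, then the content rules."""
    if sender_ip in BLOCKED_IPS:
        return "spam:ip"
    if any(rule.search(body) for rule in CONTENT_RULES):
        return "spam:content"
    return "ham"
```

Neither layer is sufficient alone: the blacklist misses mail from unlisted senders, and the content rules only catch the phrases someone has written down.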

This comes back to what I call the "intangibles."  I define the intangibles as the little things that make the difference.  Regular expressions are one example: foreign-language mail operates on a different set of rules than English-language mail.  In English, spammers regularly use obfuscation techniques to evade filters.  Viagra becomes \/i@gr@, stock becomes 5t0ck, and so forth.
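
To make the obfuscation idea concrete, here is a minimal sketch that turns a plain word into a regular expression tolerant of common character swaps.  The `LOOKALIKES` table and the `obfuscation_pattern` helper are assumptions made up for illustration; a real filter would need a much larger, hand-maintained table.

```python
import re

# Hypothetical substitution table: each letter maps to the characters
# a spammer might swap in for it.  Illustrative, not exhaustive.
LOOKALIKES = {
    "a": "a@4",
    "e": "e3",
    "i": "i1!|",
    "o": "o0",
    "s": "s5$",
    "t": "t7+",
}

def obfuscation_pattern(word):
    """Build a case-insensitive regex matching obfuscated spellings of word."""
    parts = []
    for ch in word.lower():
        if ch == "v":
            # "v" is sometimes drawn as a backslash followed by a slash.
            parts.append(r"(?:v|\\/)")
        elif ch in LOOKALIKES:
            parts.append("[" + re.escape(LOOKALIKES[ch]) + "]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)
```

With this table, `obfuscation_pattern("viagra")` matches `\/i@gr@` and `obfuscation_pattern("stock")` matches `5t0ck`.  The pattern has no word boundaries, so it will also match inside longer legitimate words, which is exactly the kind of thing a human has to review.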

Foreign languages are different from English.  In English, we use a lot of slang to communicate with one another.  Thus, obfuscated words in English work because readers mentally map each obfuscated character back to the letter it stands for.  However, in foreign languages, the slang is different.  Are figures of speech used the same way?  Does obfuscation work the same way?  Do spammy words in one language collide with legitimate words in another ("slut" in Swedish means "finished" in English)?
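
One way to avoid that kind of collision is to keep a separate rule set per language and apply only the set that matches the message's language.  The sketch below assumes the language code is already known (for example, from a language detector upstream); `RULES` and `matches_language_rules` are hypothetical names, and the phrases are placeholders.

```python
import re

# Hypothetical per-language rule sets.  "gratis pengar" is Swedish for
# "free money"; both entries are placeholders, not real filter rules.
RULES = {
    "en": [re.compile(r"\bslut\b", re.IGNORECASE)],
    "sv": [re.compile(r"\bgratis\s+pengar\b", re.IGNORECASE)],
}

def matches_language_rules(text, lang):
    """Return True if any rule for the given language code matches text."""
    return any(rule.search(text) for rule in RULES.get(lang, []))
```

A Swedish stock notice such as "Slut i lager" trips the English rule but not the Swedish one, so scoping rules by language keeps it out of the spam folder.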

My current position on content filtering is that it should be used to fill in the gaps left by more automated filtering.  My specialty is reputation filtering, but it doesn't catch everything.  That's where content filtering comes in.  Regular expressions written for foreign languages, ones that account for local idiomatic expressions, are what separate the mediocre services from the good ones.  Filling in those gaps requires human analysis.  If you're willing to spend the time to create highly targeted phrases that are specific to certain languages, you can use that to differentiate yourself in the marketplace.

