Spam continues to drop

Below is a chart that shows the amount of inbound mail that we see, both spam and non-spam, over the past three and a half years.  This data also appears in the Microsoft Security Intelligence Report, but the data there is monthly (or half-yearly) whereas this data is weekly:

image

The chart is normalized so that it shows relative scale rather than actual message counts (i.e., the left-hand scale is not 35,000 messages, but 35,000 x some number).  In addition, the spam in red is plotted against the left Y-axis and the good mail in blue is plotted against the right Y-axis.

You can see in the above that the amount of good mail that we see has continued to increase over time.  This is because of an increased customer base, not because the total amount of good mail worldwide has gone up (although it has increased marginally as more and more people start using the Internet).  However, the amount of spam has plummeted from 23,000 in mid-2010 to 5,000 now, a drop of over 75%.  The contrast couldn’t be starker: spammers are not spamming as much anymore.
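As a quick check on that figure, here is a minimal sketch of the arithmetic. The two inputs are the normalized chart values quoted above, not raw message counts, and the helper name `percent_drop` is just something I made up for illustration:

```python
def percent_drop(before: float, after: float) -> float:
    """Return the percentage decrease going from `before` to `after`."""
    return (before - after) / before * 100


# Normalized chart values quoted in the text (not actual message counts).
spam_mid_2010 = 23_000
spam_now = 5_000

drop = percent_drop(spam_mid_2010, spam_now)
print(f"Spam volume dropped by {drop:.1f}%")  # prints "Spam volume dropped by 78.3%"
```

So "over 75%" is, if anything, slightly conservative.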

It looks like the battle against spam is almost over.  What’s still left to do?

Here are a few things that are unique to spam and not other forms of communication:

  1. Generic bulk mail – this is a category of mail that is not quite spam but is definitely not legitimate.  It’s gray and is usually a dark shade of gray.  These are mailers that harvest lists from other places or populate their lists in shady ways (single opt-in, tossing your business card into a bowl at a conference, and so forth).  These are mailers that cannot be blocked across an entire organization because there is some set of users who desire the mail.

    In other words, the mailers that can’t be bothered to be responsible are still problematic.

  2. Foreign language mail – When I say “foreign language” I mean mail in a language other than English.  I see a lot of complaints these days about Chinese spam, Japanese spam, Turkish spam, Portuguese spam and Spanish spam.  I don’t know what it is about spam in those languages, but it is more resistant to IP filtering than English language spam.

    Writing spam rules for this stuff and processing it has been a challenge ever since the day I joined, but I definitely see an uptick in it compared to a year ago at this time.

  3. Spear phishing – I debated putting generic phishing in here, but generic phishing is dealt with using regular antispam techniques (URL filtering, IP filtering, and content and keyword filtering).  But as spammers have moved away from a “throw everything against the wall and see what sticks” approach, they have embraced the “target your prey and slip under the radar” model.  They are better at crafting their spam in order to deceive users, no doubt in part because of the proliferation of the Zeus botnet and malware kit.

    Spear phishing is not something that spam filters are going to be good at the way they are at pharmaceutical spam or stock spam.  Because spear phishers are actively trying to craft their content in order to get around one organization’s filters, a company must use both spam filtering and user education.

Eventually the first two will be handled.  Pesky bulk mailers will see their reputations dwindle down to nothing and they will get added to blocklists along with everyone else.  The second will be handled in the same way – as the spam traps start to attract more and more foreign language spam, blocklists will be populated with the URLs pointing to, say, Portuguese spam sites, and with the IPs sending high volumes of spam.
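To make the spam-trap idea concrete, here is a rough sketch of how trap hits could feed a blocklist. It is an illustration only and not how any particular filter works: the function name `build_blocklist`, the threshold, and the sample IPs (taken from the reserved documentation ranges) are all assumptions, and real reputation systems weigh many more signals than a simple count.

```python
from collections import Counter


def build_blocklist(trap_hits, threshold):
    """Return the sources seen at least `threshold` times in spam-trap mail.

    `trap_hits` is an iterable of source identifiers (sending IPs or
    advertised URLs), one entry per message that landed in a trap.
    """
    counts = Counter(trap_hits)
    return {source for source, hits in counts.items() if hits >= threshold}


# Hypothetical trap data: one sending IP per trapped message.
trap_hits = [
    "192.0.2.10", "192.0.2.10", "198.51.100.7",
    "192.0.2.10", "203.0.113.5", "198.51.100.7",
]

# Hypothetical cutoff: list any source seen in the traps twice or more.
blocklist = build_blocklist(trap_hits, threshold=2)
print(sorted(blocklist))
```

The same counting works whether the entries are IPs or URLs, which is why foreign language spam should eventually get caught the same way: once the traps see enough of it, the sources show up in the counts.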

The third is the most difficult.  Filters will continue to update quickly, but products other than spam filters, such as traffic analysis tools and intrusion detection software, will be required in order to prevent these attacks.  That will open up a whole new niche for security vendors but will likely be plagued by even less collaboration than there is now (would Microsoft want to share its infrastructure layout with Google? I think not, nor vice versa).

That will take some creative thinking and is probably the next big trend in security.