A new(ish) spam in my inbox

I got a couple of spam messages in my Gmail inbox the other day.  Don’t let the fact that I have a Gmail account come as a surprise; I have multiple email accounts.  I have Gmail, Yahoo and Hotmail, as well as my Frontbridge and Microsoft accounts.  I use them all.  Very handy for research and comparison purposes.  I just use Gmail the most because it lets me POP my mail for free.

Anyhow, I did get some spam in my Gmail inbox.  It’s interesting because it is heavily obfuscated:

Dear friend : 

  I would like to introduce you a very good company and its website is (w w w).<something>.(c o m). It can offer you all kinds of electronic products that you may be in need,such as laptops ,gps ,TV LCD,cell  phones,ps3,MP3/4,motorcycles and etc……..

You can take some time to have a check ,there must be something interesting you ‘d like to  purchase .

The contact:  e m a i l  : something( @ ) <something>.( c o m )
              w e b s i t e:  (w w w).<something>.(c o m) 

  Hope you can enjoy yourself in shopping from that company !


First of all, with apologies to South Park, I’m not your friend, buddy. 

Second, this is a heavily obfuscated message, with particular attention paid to hiding the URL.  Rather than inserting it as a hyperlink, the spammer is clearly trying to avoid a URL reputation list.  This is where a spam filter extracts the URLs from a message and checks them against a URL reputation list like URIBL, SURBL or Invaluement.  By hiding the hyperlink and obfuscating it with extra characters, the spammer hopes to avoid the auto-harvesting of a primitive URL extractor that looks for http://…  Of course, a good spam rules engine would not only be able to extract that URL, but also be able to detect that the spammer is obfuscating the URL and assign a spam weight to that characteristic.
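To make that last point concrete, here is a minimal sketch of such a rule in Python.  It is not how any particular filter does it; the function names, the regular expressions and the weight of 2.0 are all my own assumptions for illustration.  It collapses tokens like "(w w w)" and "( c o m )", pulls out anything that then looks like a www hostname, and adds a spam weight for every obfuscated token it had to undo.

```python
import re

# Hypothetical patterns, written for this one style of obfuscation only.
# A run of single letters separated by spaces inside parentheses,
# e.g. "(w w w)" or "( c o m )".
SPACED_TOKEN = re.compile(r"\(\s*((?:[A-Za-z]\s+)+[A-Za-z])\s*\)")
# An at-sign wrapped in parentheses, e.g. "( @ )".
PAREN_AT = re.compile(r"\(\s*@\s*\)")
# A plain "www." hostname, searched for after de-obfuscation.
HOSTNAME = re.compile(r"\bwww\.[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}\b", re.I)


def deobfuscate(text):
    """Undo the obfuscation; return (cleaned_text, number_of_tokens_undone)."""
    count = 0

    def collapse(match):
        nonlocal count
        count += 1
        # "w w w" -> "www"
        return re.sub(r"\s+", "", match.group(1))

    text = SPACED_TOKEN.sub(collapse, text)
    text, at_signs = PAREN_AT.subn("@", text)
    return text, count + at_signs


def score_obfuscated_urls(text, weight=2.0):
    """Return (hostnames found after cleaning, spam weight for the obfuscation).

    The weight is an arbitrary assumption: a fixed amount per undone token.
    """
    cleaned, hits = deobfuscate(text)
    return HOSTNAME.findall(cleaned), weight * hits
```

Calling score_obfuscated_urls on a line like "its website is (w w w).example.(c o m)." (with "example" standing in for the real domain) recovers the hostname www.example.com, which can then go to the URL reputation lookup, and reports a weight for two obfuscated tokens.  A real rule would need more care, since this one would also collapse an innocent "(a b)".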

The last bit about this piece of spam is that it highlights the necessity, still, for content filtering.  Even though reputation filtering can knock out 90% of your inbound spam, there is still a requirement for content filtering.  The reason is that reputation filtering could not have blocked this message (well, not without a ton of false positives).

This message came from an account in Hotmail.  The spammer (or rather, the IP, which is almost guaranteed to be a bot) was located in China and accessed the account.  They then sent spam to me.  Because Gmail sees Hotmail’s IP, they realize that they cannot block it straight up because that would also block far too many legitimate Hotmail-to-Gmail email conversations.  Neither DKIM nor a blocklist would help in this regard.  The only way to block it is for a content filter to look over the contents of the message and attempt to detect that it is spam.

The one caveat to the above is that it still may have been possible to use reputation filtering.  Hotmail does not hide the accessing IP of the sender, that is, it records where they logged in from.  This IP was logged in the following header:

X-Originating-IP: []

If you take this IP and use your handy-dandy geolocation tool, you’ll know that it is located in Beijing, China.  Furthermore, you know that this IP is currently on Spamhaus’ PBL.  So, in theory, you could look for and extract this IP and block based on the fact that it is listed on a public DNSBL.

The weakness is, of course, that you need to have that esoteric knowledge that Hotmail puts the accessing IP in that header.  In addition, in this case, you also need to know that you are checking an IP that is not the gateway IP against the PBL.  PBL states that you should only check the gateway IP (the one that actually connects to your mail server) against the PBL, and this is not the gateway IP; it is the IP that accessed Hotmail.  So, there is a risk in using it that way.
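For the curious, here is roughly what that extraction would look like, as a sketch in Python using only the standard library.  The function names are mine, and the default zone name pbl.spamhaus.org is an assumption about how you would query the PBL on its own.  The code only builds the DNSBL query name (the IP's octets reversed, followed by the zone); it deliberately does not perform the lookup, given the risk described above.

```python
import ipaddress
import re
from email import message_from_string

# Hotmail wraps the address in square brackets, e.g. "[192.0.2.45]".
BRACKETED_IP = re.compile(r"\[?\s*(\d{1,3}(?:\.\d{1,3}){3})\s*\]?")


def originating_ip(raw_message):
    """Return the IPv4 address from X-Originating-IP, or None if absent/invalid."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Originating-IP")
    if not value:
        return None
    match = BRACKETED_IP.search(value)
    if not match:
        return None
    try:
        # Reject things like 999.1.1.1 that merely look like an address.
        return str(ipaddress.IPv4Address(match.group(1)))
    except ValueError:
        return None


def dnsbl_query_name(ip, zone="pbl.spamhaus.org"):
    """Build the hostname a DNSBL lookup would resolve: reversed octets + zone."""
    reversed_octets = ".".join(reversed(ip.split(".")))
    return f"{reversed_octets}.{zone}"
```

With a header of X-Originating-IP: [192.0.2.45] (a documentation address, not the one from my message), the query name comes out as 45.2.0.192.pbl.spamhaus.org.  In the usual DNSBL convention, an A record coming back for that name means the IP is listed, and no such name means it is not.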

In my opinion, content analysis is the best way to handle a situation like this.

Comments (4)

  1. James says:

    Hang on — the PBL tries to list all consumer IP space. That means that almost all normal email sent from a home Internet connection using the web interface will have an X-Originating-IP address on the PBL.

    Which makes the combination of that header and that DNSBL virtually useless for stopping spam.

    For what it’s worth, SpamAssassin knows about that X-Originating-IP header and checks it against a limited set of DNSBLs.

  2. tzink says:

    Correct, James.  That was my point.  It is virtually useless, although you could use it that way if you wanted to do geolocation and if the IP is based in a certain country, do a PBL lookup.

  3. Nicolas Perrin says:

    With a Yahoo account, the first SMTP relay IP is virtual and corresponds to the sender’s IP.

    I think some sort of reverse reputation filter would be a good way to improve the catch rate.

    The Message-ID header is also a good way to identify spam from webmail.