Referral Spam and Movable Type Blacklist

Well, just in time for a wave of referral spam that is hitting my blog (mostly from …), I spent part of today writing a class that can consume the Movable Type Blacklist. The class lets you download the blacklist file from the server periodically (no more than once a day). I have written it so that anyone can integrate it into their .NET blogging package, or any other .NET program. I just checked this into the dasBlog 1.7 tree. The nice thing about this is that the Blacklist is maintained in real time, so you won’t have to rely just on content filtering (the stuff that Scott did); you’ll also get a pretty long and decent blacklist of bad sites. So far, in the past few hours, I’ve caught 100% of the referral spam with no false positives…
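The "no more than once a day" rule is easy to get wrong, so here is a minimal sketch of the idea in Python. This is not the .NET class itself; `BlacklistCache`, `fetch`, and `clock` are names made up for illustration, and the stub below stands in for the real download.

```python
import time


class BlacklistCache:
    """Holds the blacklist text and refetches it at most once per interval."""

    def __init__(self, fetch, min_interval_seconds=24 * 60 * 60, clock=time.time):
        self._fetch = fetch                  # hypothetical callable returning the list text
        self._min_interval = min_interval_seconds
        self._clock = clock                  # injectable so the timing can be tested
        self._text = None
        self._fetched_at = None

    def get(self):
        """Return the cached text, downloading again only when it is stale."""
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._min_interval
        if stale:
            self._text = self._fetch()
            self._fetched_at = now
        return self._text


# Demo with a stub instead of a real download, so it runs offline.
downloads = []

def fake_fetch():
    downloads.append(1)
    return "spam-example\\.test\n"

cache = BlacklistCache(fake_fetch)
cache.get()
cache.get()
print(len(downloads))  # 1: the second call is served from memory
```

In a real program `fetch` would wrap an HTTP request for the blacklist file, and a failed download would probably want to fall back to the previous copy; both are left out here to keep the sketch short.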

We are a few days away from releasing the final version of dasBlog 1.7. A very small number of folks have been running the bits over the weekend and as a result we’ve fixed a few bugs. A couple more days and we’ll post the bits to SourceForge.

When that happens I’ll post the MovableTypeBlacklist class. I’ve also considered writing an HttpModule to send these guys 404s, but didn’t really think that was appropriate. The list is basically loaded into one long string, delimited by “|”, and passed into a Regex to match a URL. Interestingly enough, when I tried to compile the Regex, my little console app ballooned to 150 MB and never quite finished running. Using a static Regex with the long static string, I was able to execute matches in 0 to 10 milliseconds.
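Until the class is posted, the matching idea can be sketched roughly like this in Python. It is an illustration, not the MovableTypeBlacklist class: the function names are made up, the sample entries are invented, and it assumes each blacklist line is a regex fragment with an optional `#` comment. Note that Python's `re.compile` only builds a pattern object, so it is not the same thing as the compiled .NET Regex that ballooned in memory.

```python
import re


def parse_blacklist(text):
    """Return the usable patterns from blacklist text, one entry per line."""
    patterns = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # assumed comment syntax
        if not entry:
            continue
        try:
            re.compile(entry)  # one bad entry would otherwise break the whole pattern
        except re.error:
            continue
        patterns.append(entry)
    return patterns


def build_matcher(patterns):
    """Join the patterns with "|" into a single case-insensitive regex."""
    if not patterns:
        return None
    return re.compile("|".join("(?:%s)" % p for p in patterns), re.IGNORECASE)


def is_blacklisted(matcher, url):
    """True if any blacklist pattern occurs anywhere in the URL."""
    return matcher is not None and matcher.search(url) is not None


# Hypothetical entries, not taken from the real list.
SAMPLE = """
# comment line
spam-example\\.test
cheap-pills\\.example   # trailing comment
broken[entry
"""

matcher = build_matcher(parse_blacklist(SAMPLE))
print(is_blacklisted(matcher, "http://www.spam-example.test/page"))  # True
print(is_blacklisted(matcher, "http://example.org/"))                # False
```

Building the matcher once and reusing it for every referrer mirrors the static Regex approach; rebuilding it per request would throw away most of the speed.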

Here is a dump of the class:

… : True
Executed in : 20 milliseconds
… : False
Executed in : 0 milliseconds
… : False
Executed in : 0 milliseconds
… : True
Executed in : 0 milliseconds
… : True
Executed in : 0 milliseconds
… : True
Executed in : 10 milliseconds

Comments (3)

  1. You may also want to look at utilising the SURBL blacklist. Originally set up for SpamAssassin and other email tools, it’s a list of URLs commonly seen in email spam. Like other DNS-based real-time blacklists, you query it via DNS lookups. You may of course want to build a local results cache, and not just rely on the caching the DNS client provides.

    If we can get bloggers reporting to that central resource, as well as the people fighting email spam, then it becomes even more useful. Currently none of the URLs you mentioned above are listed.

  2. tom sherman says:


    I recognize referrer spam as a growing annoyance, and I’ve written a proposal for how to stop it. I’d love to hear your input. Thanks.

  3. jotsheet says:

    Referrer (or referer) spam has become a serious problem in the blogosphere. We need an intelligent way to eliminate this growing nuisance. I’ve thought about this for the past few days, and below I offer a proposal for a technological solution to this problem. It requires programming, and I am not a programmer, so I welcome suggestions, corrections, and improvements to this proposal….