Following up on yesterday's post, I thought I would take a look at how spammy each particular TLD is. At the moment, I only track eight TLDs: .cn, .ru, .com, .net, .org, .info, .biz and .name. To see which one is the spammiest, I took all of our post-IP-blocked mail and determined how many messages contained each TLD, and how many of those messages were marked as spam. This marking occurs before a message is bifurcated into multiple recipients; counting after bifurcation could skew the results, because the amount of mail marked as spam by our content filter prior to bifurcation is about 1/3 of the email stream.
Anyhow, here are the results for how often a message containing a URL in a particular TLD is marked as spam (I omitted .name):
Looking at the numbers this way, .ru is by far the spammiest TLD, as nearly every single message with a .ru in it is marked as spam. .cn has cleaned up its act this year but is still having problems. The .com domain is way below that, in last place. Now, this does not mean that every message with a .com domain is clean, but rather that most of the mail containing a .com had characteristics that made it more likely to be non-spam than spam. Note that we only count an occurrence of a domain once per message, so if there are multiple .coms in a message, we only count it once. Looking at it this way, it is clear that the .com TLD is actually one of the cleanest TLDs, the opposite of what McAfee’s report found.
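To make the counting rule concrete, here is a minimal sketch of how a per-TLD spam rate could be tallied with each TLD counted at most once per message. It is not our actual pipeline: the function names, the `(urls, is_spam)` record shape, and the sample data are all assumptions made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# The eight TLDs tracked in this post.
TRACKED_TLDS = {"cn", "ru", "com", "net", "org", "info", "biz", "name"}

def tlds_in_message(urls):
    """Return the set of tracked TLDs in a message's URLs (each at most once)."""
    found = set()
    for url in urls:
        host = urlparse(url).hostname
        if not host:
            continue
        tld = host.rsplit(".", 1)[-1].lower()
        if tld in TRACKED_TLDS:
            found.add(tld)
    return found

def spam_rate_by_tld(messages):
    """messages: iterable of (urls, is_spam). Returns {tld: fraction marked spam}."""
    total = Counter()
    spam = Counter()
    for urls, is_spam in messages:
        # Using a set means several .com URLs in one message count once.
        for tld in tlds_in_message(urls):
            total[tld] += 1
            if is_spam:
                spam[tld] += 1
    return {tld: spam[tld] / total[tld] for tld in total}

# Hypothetical sample data, for illustration only.
sample = [
    (["http://a.example.com/x", "http://b.example.com/y"], False),
    (["http://pills.example.ru/"], True),
    (["http://shop.example.com/", "http://pills.example.ru/"], True),
]
rates = spam_rate_by_tld(sample)
```

In this toy sample the first message has two .com URLs but contributes a single .com observation, which is the behavior described above.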
However, this is not the best way to measure how risky a TLD is. We should also measure prevalence. To do that, I counted up the total occurrences of each TLD (i.e., its absolute count). I then multiplied that count by the TLD's % spam and normalized the results. The result is a Riskiness rating, with the table outlined below:
The way to interpret this table is that for every 1 message marked as spam that contained a .biz, 187 messages marked as spam contained a .com, 106 contained a .ru, and so forth. Going by this, .com shoots straight to the top: while the proportion of .com mail that is abusive is smaller, the sheer volume of .com use by all kinds of spammers is very large. This chart illustrates that the .cn domain is still abused (lots of spammers pick it compared to non-spammers), but it just isn’t seen in the wild in spam nearly as much as the .com domain. To put this another way, given a particular email message marked as spam that contains a domain, there is a 40% chance that the domain is a .com, and a 23% chance that it is a .ru (assuming we only pick from these seven TLDs).
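The arithmetic behind the Riskiness rating and the percentages can be sketched roughly as follows. This assumes that "normalized" means dividing by the smallest spam-weighted count, which is what the .biz = 1 baseline suggests; the function names and the input numbers are hypothetical and are not the figures from our data.

```python
def riskiness(total_counts, spam_rates):
    """Weight each TLD's count by its spam rate, then scale so the smallest is 1.

    Assumes every TLD has a nonzero spam-weighted count.
    """
    spam_volume = {tld: total_counts[tld] * spam_rates[tld] for tld in total_counts}
    baseline = min(spam_volume.values())
    return {tld: volume / baseline for tld, volume in spam_volume.items()}

def spam_share(scores):
    """Convert riskiness scores into each TLD's share of the spam containing a TLD."""
    total = sum(scores.values())
    return {tld: score / total for tld, score in scores.items()}

# Hypothetical inputs, for illustration only.
counts = {"com": 1000, "ru": 100, "biz": 20}
rates = {"com": 0.2, "ru": 0.9, "biz": 0.5}

scores = riskiness(counts, rates)   # .biz becomes the baseline of 1
shares = spam_share(scores)         # fractions that sum to 1
```

The same conversion ties the quoted figures together: if a rating of 187 corresponds to 40%, the seven ratings sum to roughly 467, and 106 out of 467 is about 23%.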
From this perspective, the .com domain remains the most abused TLD, but primarily because of its popularity with the general public, not necessarily because its security is lax. Lots of people use .com for legitimate purposes, whereas, at least in the mail we see, almost nobody uses .ru for legitimate purposes.