The flip side of outbound spam control

Over the past few years, I have written numerous blog posts about controlling outbound spam.  Here’s a summary of what we do:

  • We look for mailers who send high volumes of mail that are marked as spam.
  • We look for mailers who send sudden bursts of traffic.
  • We do not permit outbound commercial bulk mailing.

Because of this, we manage to keep outbound spamming under control.  Automated algorithms look for suspicious traffic, block it, and send alerts.  I see these alerts, but for the most part there is little involvement from the engineering team.  We occasionally block accounts that we detect manually, but that is unusual.
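To make the first two checks concrete, here is a minimal sketch of how such automated detection might look. It is not our production logic: the `SenderWindow` structure, the `evaluate` function, and every threshold below are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SenderWindow:
    """Hourly counters for one sender (hypothetical structure)."""
    sent: int            # messages sent in the current hour
    spam: int            # of those, how many the filter marked as spam
    history: list[int]   # messages sent in each of the previous hours


# Illustrative thresholds only; none of these values come from a real system.
MIN_VOLUME = 500        # ignore senders below this hourly volume
MAX_SPAM_RATIO = 0.5    # share of spam-marked mail that triggers a flag
BURST_FACTOR = 10       # current hour compared with the trailing average


def evaluate(window: SenderWindow) -> list[str]:
    """Return the reasons, if any, to block this sender and raise an alert."""
    reasons = []
    if window.sent < MIN_VOLUME:
        return reasons  # low-volume senders are not evaluated here

    # Check 1: a high volume of mail that is marked as spam.
    if window.spam / window.sent >= MAX_SPAM_RATIO:
        reasons.append("high_spam_volume")

    # Check 2: a sudden burst relative to the sender's own recent history.
    baseline = mean(window.history) if window.history else 0
    if window.sent >= BURST_FACTOR * max(baseline, 1):
        reasons.append("sudden_burst")

    return reasons
```

A sender whose hourly volume jumps from a few dozen messages to several thousand would come back with `sudden_burst` even if the spam filter caught none of the messages, which is the case that matters for zero-day spam.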

The problem is implementing volume throttles.

Blocking large volumes of mail results in false positives.  We throttle on volume anyway because sometimes we see zero-day spam that the spam filter does not catch, and by the time we react to it, the damage has been done (spam runs used to last about three hours; some still do, but they now vary in length from several minutes to a couple of hours).  For years, I resisted throttles based purely upon volume because I was concerned that we would endlessly manage a bunch of one-off exceptions.  One sender has extenuating circumstances: it’s a university that has to send communications to its students.  Another sender has a mail tracking campaign and sends out messages but has done everything right (double opt-in, builds lists properly, doesn’t spam, etc.).  Yet another mailer has been sending large volumes of mail for years and suddenly has the rug pulled out from under them.  In my view, it is unfair to change the rules on existing customers, although new customers are fair game, as are high-risk senders who get compromised all the time.

The point is that there are lots of reasons to make exceptions, because these are legitimate mail scenarios that do not involve commercial bulk mailing.  Instead, these situations occur because people have to communicate with large audiences from time to time about directly relevant information.  If you work for a big company, sometimes your CEO sends mail to everyone in the company.  And so forth.

And just as I predicted, because of our expanding customer base (moving into the edu space), we are now managing more and more one-off exceptions.  This is exactly what I thought would happen.
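A minimal sketch shows why the exceptions pile up. It assumes a simple daily per-sender cap; the `VolumeThrottle` class, its field names, and the limits are hypothetical and do not describe our actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeThrottle:
    """Daily per-sender recipient cap with manually maintained exceptions."""
    default_limit: int = 10_000  # illustrative default, not a real value
    # One-off exceptions: sender -> higher limit, added by hand.
    exceptions: dict[str, int] = field(default_factory=dict)
    # Running count of recipients each sender has mailed today.
    sent_today: dict[str, int] = field(default_factory=dict)

    def limit_for(self, sender: str) -> int:
        """Return the exception limit if one exists, else the default."""
        return self.exceptions.get(sender, self.default_limit)

    def allow(self, sender: str, recipients: int) -> bool:
        """Accept the message only if it keeps the sender under the cap."""
        used = self.sent_today.get(sender, 0)
        if used + recipients > self.limit_for(sender):
            return False  # throttled: possibly a false positive
        self.sent_today[sender] = used + recipients
        return True


# Hypothetical usage: a university registrar gets a manual exception.
throttle = VolumeThrottle(
    default_limit=100,
    exceptions={"registrar@university.example": 1_000},
)
```

Every legitimate large sender that trips the default cap becomes another hand-written entry in `exceptions`, and that table is the part that never stops growing.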

Here are our two options:

  • Relax the restrictions and run the risk of sending outbound spam, thereby reducing deliverability.

  • Tighten up the restrictions and forever generate a whitelist of special users who need to send mail (and risk alienating other customers who do not fall into our special exceptions).

Looking at both of these, it is more work to get off a blocklist than it is to run a whitelist.  On the other hand, I am conflicted because the process is manual and a step in the wrong direction.  I want to go from manual to automatic, not from automatic to manual.  I am loath to automate the whitelisting process because when you automate a solution, spammers abuse it.  That’s exactly what we don’t want to happen: rate controls that constrict the legitimate senders who obey them while the spammers automate around them.

In the end, we can’t make everyone happy.  Perhaps that is the sign of a successful solution.