A bit more on the spam chief interview

Following on from my previous post on the interview with the spam chief at Yahoo, I thought I'd respond to a couple more things that Mark Risher said.

bartonas: What is the effect, if any, other than putting it back in my in-box, of me selecting "not spam" for an email in the spam folder?

Mark: We've got some incredibly sophisticated systems trying to analyze the messages our users mark as "spam" and "not spam." We're constantly analyzing the feedback from users like yourself to figure out how we can improve.  The effect of clicking "not spam" on a message is that it sends a powerful signal to our systems that we've made a mistake. That's one of the best ways we can learn, both to ensure that we don't block messages from that sender in the future, and that our systems shouldn't block similar messages next time.

Here in Exchange Hosted Services, we have a couple of ways to access your spam.  One is through the Spam Quarantine web interface: if you click "Not Spam," the message is released to your inbox and a copy goes to the spam team for analysis.  There is some automated filtering and sorting on the back end, but our techniques are not quite as sophisticated as Yahoo's.  We rely more on human analysis to make decisions because, in my opinion, human analysis of false positives is more accurate.

Humans look at messages and adjust spam rules, but they also determine whether a message is actually spam.  Most submissions sent to the false positive alias are in fact spam, so a great deal of pre-processing is required to separate the wheat from the chaff before any rules are adjusted.  After that, the spam analyst decides whether to release the message and updates the spam rules or reputation filters accordingly.

opher: SMTP requires a confirmed IP address between the sending and receiving servers. That means spammers can spoof the NAME of the sending server, but not the IP address. Since Yahoo knows the IP address of all of their mail servers, why not validate the IP address and when it does not match, drop the spoofed email?

Mark: Yahoo! has been a pioneer in advancing e-mail authentication — the ability to conclusively identify that a message that says it comes from somebody really comes from that somebody — and was the inventor of the open source DomainKeys and DKIM technologies.  As we see the adoption of these technologies continue to take off, we’re exploring ways to take action against messages that “spoof” a Yahoo! origin. You're right that IP address is one of the few, truly trustworthy parts of an inbound spam message, and it's a major factor in our determination of whether a message is spam.

What opher is asking is this: if Yahoo knows its own IP addresses, and an email claims to come from Yahoo but was not sent from any of them, why not reject the message?  This is basically SPF/SenderID, something that Yahoo does not do (or if they do, you certainly couldn't tell, and they don't publish SPF records either).
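To make the idea concrete, here is a minimal sketch of the kind of check opher is describing, written in Python. It is not a real SPF implementation: it assumes the record text has already been fetched from DNS, it only understands plain ip4:/ip6: mechanisms and a trailing "all" term, and it ignores include, a, mx, redirect and qualifiers on individual mechanisms. The function name and the example record are made up for illustration and use documentation address ranges, not anything Yahoo publishes.

```python
import ipaddress


def check_ip_against_record(record: str, connecting_ip: str) -> str:
    """Evaluate a simplified SPF-style record against the connecting IP.

    Returns 'pass', 'fail', 'softfail', 'neutral', or 'none'.
    Only ip4:/ip6: mechanisms and a final 'all' term are understood.
    """
    ip = ipaddress.ip_address(connecting_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # no usable record published

    all_results = {"-all": "fail", "~all": "softfail", "?all": "neutral"}
    for term in terms[1:]:
        lower = term.lower()
        if lower.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == network.version and ip in network:
                return "pass"  # sender is on the domain's published list
        elif lower in ("-all", "~all", "?all", "+all", "all"):
            # Nothing matched before the catch-all term.
            return all_results.get(lower, "pass")
    return "neutral"  # record ended without a catch-all


# Hypothetical record for an imaginary domain (documentation IP ranges).
example_record = "v=spf1 ip4:192.0.2.0/24 ip4:203.0.113.0/24 -all"
print(check_ip_against_record(example_record, "192.0.2.10"))
print(check_ip_against_record(example_record, "198.51.100.7"))
```

With a record that ends in "-all", a message claiming to be from the domain but arriving from an unlisted address evaluates to "fail", which is exactly the case where a receiver could choose to reject.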

Far be it from me to criticize another antispam company, but I think that this is a flaw in Yahoo's spam filtering service.  SPF is a pretty basic way to filter spam.  DomainKeys and DKIM don't cut it because both only say what to do when an email is authenticated; they say nothing about what to do if a message fails a DomainKeys/DKIM check, and nothing about whether a message should have been signed in the first place.  In fact, the guidance is to treat a message that fails the check as unsigned mail (i.e., neither confirm nor deny).  I seriously doubt spammers would take the time to DKIM sign their mail, so using those two technologies to fight spam would have minimal impact unless you did a custom DomainKeys/DKIM implementation.
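The gap is easier to see written out. The Python sketch below assumes a signature check has already happened elsewhere and produced one of three outcomes ("pass", "fail", or "none" for no signature); the function name and the return labels are mine, not part of any DKIM library.

```python
def dkim_disposition(result: str) -> str:
    """Map a DKIM verification outcome to how the receiver treats the message.

    result is assumed to be 'pass', 'fail', or 'none' (no signature present).
    """
    if result == "pass":
        # The signing domain vouches for this message.
        return "authenticated"
    # A failed signature is handled exactly like a missing one, so it
    # gives the filter no grounds to reject the message.
    return "unsigned"


for outcome in ("pass", "fail", "none"):
    print(outcome, "->", dkim_disposition(outcome))
```

A forged message with a broken signature and a forged message with no signature at all both come out as "unsigned", the same bucket as ordinary legitimate mail that was never signed. Doing anything stricter means adding your own policy on top.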

In my view, Risher's comment is a diplomatic way of saying "Yes, we should use SPF but we don't."  SPF is better for fighting spam than DKIM at the moment.

Comments (5)
  1. adamo says:

    You write:  "SPF is a pretty basic way to filter spam."

    No.  And I copy from the openspf.org FAQ:

    "It is about giving domain owners a way to say which mail sources are legitimate for their domain and which ones aren’t. While not all spam is forged, virtually all forgeries are spam. SPF is not anti-spam in the same way that flour is not food: it is part of the solution."

    A spammer (defined as spammer by the receivers of the messages) can have legitimate SPF records.

    You also write:  "I seriously doubt spammers would take the time to DKIM sign their mail"

    Which is why, if you have DKIM-signed incoming messages, you may apply fewer filters to them, saving CPU load on your systems.

    FWIW, I use both on my installations.  I think that Yahoo! does not use SPF because they want to push DKIM.

  2. Frank says:

    adamo wrote: "I think that Yahoo! does not use SPF because they want to push DKIM." It would miss the point, because as you said these techniques are orthogonal and can be combined.

    The focus of DKIM if combined with a not yet ready "signing practices" add-on is anti-phishing, the focus of SPF FAIL is anti-backscatter. SenderID (PRA) also tried to tackle phishing, that could be at the core of this old dispute.

    In theory SPF allows you to guess the outbound IPs of Yahoo!, and as receiver you are free to consider your own guess as good enough to reject mails from other IPs.

    At some point in time legit senders will hopefully see that inspiring guesswork is not in their interest.

  3. tzink says:


    When I say SPF is a basic way to fight spam, I don’t mean that’s what it is designed for.  What I mean is that if you’re going to fight spam, doing SPF checks is one of the basic techniques that you should be using in your filters.

    Second, I have never been an advocate of doing less filtering on authenticated mail.  You can only do that by combining it with a reputation filter; that is, if you authenticate that mail comes from a trusted sender/domain, *then* you can apply fewer filters.

  4. Norman Diamond says:

    "I seriously doubt spammers would take the time to DKIM sign their mail"

    Why do you doubt that?  Surely whatever burden the spammers have to place on random victims’ computers that joined their botnets, they will do.

Comments are closed.