Sender authentication part 8: Best-Guess SPF

I've had a document sitting on my shelf (i.e., the window-sill 10 feet away from my desk) for about 6 months now just waiting to be read.  It's entitled Sender Reputation in a Large Webmail Service.  It's by Bradley Taylor, at Google, and is available among the past papers from the Conference on Email and Anti-Spam (CEAS), 2006.

Anyways, I finally got around to reading it this week.  Like everyone else, Gmail uses a lot of sender authentication in its filtering.  One of the ways they authenticate mail is with SPF.  For domains without SPF they use an algorithm they dub "Best-Guess SPF", which is meant to be a temporary measure until more domains come onboard and start publishing SPF records.  They readily admit that the technique isn't perfect, but it's not bad, either.  Basically, it works in the following manner:

1. Check the domain of the envelope sender.   If it doesn't publish SPF records, then check the MX-records and A-records of the sender's domain.  If the transmitting IP falls within the same range of IPs as the MX-record or A-record, then the sender has been authenticated.

Example 1 (using fictitious numbers)

Transmitting IP = 4.8.15.16
Envelope sender = me@lost.com
A-record of lost.com = 4.8.15.11
MX-record of lost.com = 4.8.15.0/27 (4.8.15.0 - 4.8.15.31)

Since the transmitting IP is within the range of the MX-records (an abnormally large MX record, but hey, this example is fictitious), we have an authentication.
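
For the code-minded, here's a minimal sketch of the check in step 1, using the fictitious numbers from Example 1.  The function name and the idea of handing it a ready-made list of addresses and ranges are my own assumptions for illustration; a real implementation would first have to look up the domain's MX and A records in DNS, and the paper doesn't spell out exactly how Gmail compares ranges.

```python
import ipaddress

def ip_in_published_ranges(transmitting_ip, ranges):
    """Return True if transmitting_ip falls inside any of the given ranges.

    `ranges` is a list of addresses or CIDR blocks already derived from the
    sender domain's A and MX records (hypothetical input for this sketch).
    """
    ip = ipaddress.ip_address(transmitting_ip)
    # A bare address such as "4.8.15.11" is treated as a single-host network.
    return any(ip in ipaddress.ip_network(r, strict=False) for r in ranges)

# Example 1: A-record 4.8.15.11, MX range 4.8.15.0/27
lost_com_ranges = ["4.8.15.11", "4.8.15.0/27"]
print(ip_in_published_ranges("4.8.15.16", lost_com_ranges))
```

With the numbers above the check succeeds, because 4.8.15.16 sits inside 4.8.15.0 - 4.8.15.31.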

2. If that doesn't work, get the reverse DNS of the sending IP.   If it matches the domain of the envelope sender, then the sender has been authenticated.

Example 2

Transmitting IP = 4.8.15.16
Envelope sender = me@lost.com
Reverse DNS of 4.8.15.16 = lost.com

The reverse DNS name matches the name of the domain in the envelope sender, so the sender is authenticated.

Example 3

Transmitting IP = 16.23.42.108
Envelope sender = me@others.com
Reverse DNS of 16.23.42.108 = island.com

The reverse DNS name does not match the envelope sender, therefore, no sender authentication.
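
Step 2 boils down to a name comparison, which can be sketched as follows.  This is only an illustration under my own assumptions: the function names are made up, the reverse DNS name is passed in as a plain string rather than looked up, and I've assumed the comparison ignores case and a trailing dot.

```python
def sender_domain(envelope_sender):
    """Extract the domain part of an envelope sender such as me@lost.com."""
    return envelope_sender.rsplit("@", 1)[-1].rstrip(".").lower()

def rdns_matches_sender(ptr_name, envelope_sender):
    """Return True if the reverse DNS name equals the envelope sender's domain.

    `ptr_name` is the name the transmitting IP's PTR record points to; in
    practice it would come from a reverse DNS lookup, which this sketch skips.
    """
    return ptr_name.rstrip(".").lower() == sender_domain(envelope_sender)

# Example 2: authenticated
print(rdns_matches_sender("lost.com", "me@lost.com"))
# Example 3: not authenticated
print(rdns_matches_sender("island.com", "me@others.com"))
```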

3. If that doesn't work, use a technique that is referred to as PTR zone.   If the sender is a subdomain of the DNS PTR's zone, then it is authenticated as if the sender comes from the zone itself.  The example given in the document where I discovered this seems a bit backwards, so I'm going to clean it up a bit in order to conform to the description given.

Example 4

Transmitting IP = 16.23.42.108
Envelope sender = me@island.others.com
Reverse DNS of 16.23.42.108 = domain in PTR zone = airplane.others.com

This is close, but not an authentication.  The envelope sender (island.others.com) is not a subdomain of the domain in the PTR zone (airplane.others.com).

Example 5

Transmitting IP = 16.23.42.108
Envelope sender = me@island.others.com
Reverse DNS of 16.23.42.108 = domain in PTR zone = others.com

The domain of the sender (island.others.com) is a subdomain of others.com, and therefore we have an authentication.
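
Here's a rough sketch of the PTR zone check as I've described it, run against Examples 4 and 5.  The function name is hypothetical, and comparing whole DNS labels (rather than doing a raw string suffix match) is my own choice so that something like notothers.com is never mistaken for a subdomain of others.com.

```python
def is_subdomain_of_ptr_zone(sender_dom, ptr_name):
    """Return True if sender_dom is a strict subdomain of ptr_name."""
    sender_labels = sender_dom.rstrip(".").lower().split(".")
    ptr_labels = ptr_name.rstrip(".").lower().split(".")
    # A strict subdomain has more labels than the zone and ends with all of
    # the zone's labels. An exact match is already covered by step 2.
    return (len(sender_labels) > len(ptr_labels)
            and sender_labels[-len(ptr_labels):] == ptr_labels)

# Example 4: sibling subdomains, no authentication
print(is_subdomain_of_ptr_zone("island.others.com", "airplane.others.com"))
# Example 5: island.others.com is under others.com, authenticated
print(is_subdomain_of_ptr_zone("island.others.com", "others.com"))
```

A real implementation would presumably also want to refuse overly broad PTR names (a bare top-level domain, say), but that's beyond what the paper's description covers.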

Using this extra bit of authentication allows Gmail to authenticate almost twice as much mail as a standard SPF check.  That's actually pretty good.  As to whether or not this is a good idea, OpenSPF has this to say about it:

Best-guess processing is a crude, non-standard attempt at guessing the IP address range of a domain's outgoing mailservers.  "Non-standard" means it is not standardized and specific to the implementation.

...

Some find this remarkably good at detecting unforged messages from domains that have not yet published SPF records. Others consider it a security hole because it gives attackers a lot of additional potential targets (authorized hosts) to hack in order to abuse the domain.

From an anti-spam perspective, I think sender authentication is a good idea but it all depends on how it is used.  In my opinion, successful authentication is best used in conjunction with safe senders.  I first voiced my opinion on safelists a few weeks ago, when I thought they were a security risk.  However, I don't think I was thinking far enough ahead at the time; I now think the way to implement a safelist is to allow a sender to be on a safelist (i.e., bypass spam filtering) only if the sender can be authenticated. That way, spoofing the sender doesn't work and if they do start spamming, there's a much more reliable paper trail.
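
To make that last point concrete, the safelist rule I have in mind can be sketched in a couple of lines.  The names here are purely illustrative, and `authenticated` stands in for the result of whatever sender authentication check (SPF, Best-Guess SPF, or otherwise) the filter runs.

```python
def bypass_spam_filter(envelope_sender, safelist, authenticated):
    """Skip spam filtering only for safelisted senders that authenticated."""
    return authenticated and envelope_sender.lower() in safelist

safelist = {"me@lost.com"}
# A safelisted sender that passes authentication bypasses filtering.
print(bypass_spam_filter("me@lost.com", safelist, authenticated=True))
# A spoofed safelisted sender fails authentication and is filtered as usual.
print(bypass_spam_filter("me@lost.com", safelist, authenticated=False))
```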