Here at Microsoft, we regularly see people rotate in and out of various divisions: a new person joins and another leaves. Recently we had another antispam manager join the group. Since I’ve been on the antispam product longer than anyone else in our division who is still serving in the spam forces, he came and had a chat with me. I didn’t feel too special; he’s going around talking to everyone, not just me.
I gave him an overview of what I do and how each of the filters in our pipeline works. He asked me why I moved from a meager spam analyst to a Program Manager. I thought it over: when the role was offered to me, I was free to reject it, but I did not. I took it because it would give me the chance to implement a bunch of projects that our service lacked and that I thought would really improve it.
I think that one of the things I bring to the table is my emphasis on the Undo action. In Windows systems (at least), you can undo your last action by pressing Ctrl+Z. In spam filtering, we need the same thing. Well, not quite the same thing: we need the ability to easily fix mistakes.
For each spam filter component you use, you need a way to fix false positives. Because I processed false positives for so long, I have become extremely sensitive to them and to FP evasion. We have implemented a bunch of new spam filtering components without initial regard for how to fix a false positive caused by that component. Every FP processing component has been developed as an afterthought. We have always had the ability to adjust the aggressiveness of a spam component, but the ability to fix its mistakes has always trailed behind.
An example would be a fingerprinting mechanism. A few years ago we implemented a way of computing MD5 hashes of messages; messages that matched these hashes would get marked as spam. This works as long as every hash in the central database of spam hashes really does come from spam. As is inevitable in every new system, false positives occurred. I can’t tell you how many times I have heard "This new system won’t cause false positives" only to see that it does. MD5 fingerprint false positives are not the end of the world so long as the mechanism for dealing with them is there.
In the case of a fingerprint mechanism, let’s think this through: assume that the fingerprints are created automatically by running a spam feed through a fingerprinting mechanism, and this mechanism then transports them to a central database. The spam filter hashes individual MIME parts and checks them against the central database. To fix a false positive, you need the following:
- A way of getting the FP from the user to the spam team for analysis
- A way of verifying that the FP the user sent is valid and not a spam submission
- A way of determining which MIME part caused the false positive if a message contains multiple MIME parts
- A way of determining which spam feed is responsible for the false positive
- A way of creating a whiteprint fingerprint on the local processing machine
- A way of transmitting that whiteprint fingerprint to the central database from the local processing machine
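To make that list a little more concrete, here is a minimal Python sketch of the lookup and the undo path. It is only an illustration, not how our filter is built: the function names, the `spam_prints` dictionary (hash to feed name), and the `whiteprints` set are all hypothetical, and the "central database" is just in-memory data standing in for the real thing.

```python
import hashlib
from email import message_from_string


def part_fingerprints(raw_message):
    """Return a (content_type, md5_hex) pair for each leaf MIME part."""
    msg = message_from_string(raw_message)
    prints = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers have no body of their own to hash
        payload = part.get_payload(decode=True) or b""
        prints.append((part.get_content_type(), hashlib.md5(payload).hexdigest()))
    return prints


def classify(raw_message, spam_prints, whiteprints):
    """Check every MIME part against the (hypothetical) fingerprint stores.

    spam_prints: dict mapping md5 hex digest -> name of the feed that produced it
    whiteprints: set of md5 hex digests that must never be treated as spam
    Returns (verdict, hits); each hit records which part matched and which
    feed was responsible, which is what an FP investigation needs.
    """
    hits = []
    for content_type, digest in part_fingerprints(raw_message):
        if digest in whiteprints:
            continue  # the undo path wins over the spam fingerprint
        feed = spam_prints.get(digest)
        if feed is not None:
            hits.append((content_type, digest, feed))
    return ("spam" if hits else "clean"), hits


def undo_false_positive(hits, whiteprints):
    """Whiteprint every fingerprint that caused a verified false positive."""
    for _content_type, digest, _feed in hits:
        whiteprints.add(digest)
```

The point of the design is that `classify` reports the matching part and the responsible feed at the moment it makes its decision, so the information needed to fix a mistake is available from day one rather than reconstructed later. Shipping the updated `whiteprints` set back to the central database is left out of the sketch.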
I’m not saying that any of these things are inherently difficult to do, but they typically come up as afterthoughts. We in the industry are often excited to bring new technology to the surface and we naturally assume that it will always block spam and never legitimate mail. We don’t consider the possibility that the spam feeds are polluted (with clean mail).
In my experience, whenever I design a new spam filtering component, I always consider both cases in the initial design: how to block spam and the flip side, how to undo mistakes.