As I was saying in my previous post, I thought I'd take a look at the differences and similarities between spam researchers and virus researchers.
The two classes share some similarities. Both are involved in stopping Internet abuse, and both are constantly creating signatures to update the latest antispam and antivirus engines. When processing spam, we rely on a lot of heuristics. Given a particular spam message, we often say things like:
- Oh, that's phishing. I can tell by looking at the link in the message: while it appears to go to Bank of America, hovering the mouse over it shows that it goes to some site in Poland.
- That's an obfuscation of the word v.i^ag-r@
- This is clearly a Nigerian 419 advance-fee scam. We simply look at the fact that phrases like "Ivory Coast", "please send a sum of money", etc., appear in the body of the message.
In other words, we have seen so many of these before that we can quickly and easily ascertain whether a message is legitimate or not. Most spam analysts have been doing it for so long that they can even tell the difference between legitimate and spam messages when the message is in a language they cannot speak. Sometimes, they can even write rules for these messages.
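To make the three heuristics in the list above a bit more concrete, here is a minimal Python sketch of each one. It is an illustration only, not how any particular antispam engine works: the function names, the `LOOKALIKES` table, and the `SCAM_PHRASES` list are all made up for this example, and a real filter would need far more patterns and a lot of tuning.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical pattern: pulls a hostname-looking token out of visible link text.
HOST_RE = re.compile(r"(?:https?://)?((?:[a-z0-9-]+\.)+[a-z]{2,})", re.I)

# Hypothetical lookalike table and phrase list, for illustration only.
LOOKALIKES = str.maketrans({"@": "a", "0": "o", "1": "i", "$": "s", "3": "e"})
SCAM_PHRASES = ("ivory coast", "please send a sum of money")


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html):
    """Return (visible text, href) pairs where the text shows one host
    but the link actually points at another."""
    collector = LinkCollector()
    collector.feed(html)
    suspicious = []
    for href, text in collector.links:
        shown = HOST_RE.search(text)
        actual = urlparse(href).hostname
        if shown and actual and shown.group(1).lower() != actual:
            suspicious.append((text, href))
    return suspicious


def deobfuscate(token):
    """Map lookalike characters to letters, then drop everything that is not a-z."""
    return re.sub(r"[^a-z]", "", token.lower().translate(LOOKALIKES))


def scam_phrase_hits(body):
    """Return the known scam phrases found in the body, ignoring case and extra spacing."""
    normalized = " ".join(body.lower().split())
    return [phrase for phrase in SCAM_PHRASES if phrase in normalized]
```

For instance, `deobfuscate("v.i^ag-r@")` collapses the token down to `viagra`, and `mismatched_links` flags a link whose visible text is a bank's hostname while its `href` points at a different host. An analyst does all of this by eye; the sketch only captures the most mechanical part of that judgment.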
Virus researchers are similar. Given a particular virus and seeing it execute, years of experience let them tell that such-and-such virus is executing such-and-such process in the operating system kernel, that it is exploiting such-and-such flaw in this particular version of Microsoft Office, or that it is using a security hole in the Internet Explorer 6 browser that was patched in October 2007. In other words, spam analysts can recognize spam based upon the language of the contents of the message, while virus analysts can recognize exploits based upon observing the processes that the viruses execute on the native operating system.
It takes a great deal of intuition to be able to recognize this stuff. It also takes a great deal of care to know what is legitimate and what is not. An experienced virus researcher has to make sure not to create a signature based upon an identification block that occurs in every single Microsoft Excel file - that would cause thousands upon thousands of false positives. Just as spam analysts have to steer clear of patterns that show up in legitimate mail, virus researchers have to steer clear of the many legitimate patterns that occur in commonly exploited file types (like zips, exe's, xls's, and so forth). With enough experience, a virus researcher knows to avoid those constructs; however, it takes a long time to ramp somebody up and instill in them that body of knowledge.
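The false-positive problem described above can be sketched in a few lines of Python. This is a toy illustration under some loud assumptions: signatures are treated as plain byte strings, the "clean corpus" is a handful of made-up in-memory samples, and `CANDIDATE` is an invented byte pattern. The one real detail is the eight-byte header that opens legacy binary Office files such as .xls (the OLE2 compound file magic), which is exactly the kind of identification block a signature must not be built on.

```python
# Header bytes that open every legacy binary Office file (.xls, .doc, ...).
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

# Made-up stand-ins for a corpus of known-clean files (name -> contents).
CLEAN_SAMPLES = {
    "budget.xls": OLE2_MAGIC + b"\x00" * 16 + b"quarterly budget",
    "report.xls": OLE2_MAGIC + b"\x00" * 16 + b"sales report",
    "notes.txt": b"plain text, no Office header here",
}

# Invented byte pattern standing in for something unique to a malicious sample.
CANDIDATE = b"\xde\xad\xbe\xefHYPOTHETICAL-PAYLOAD"


def false_positive_hits(signature, clean_samples):
    """Return the names of clean files that the signature would wrongly flag."""
    return [name for name, data in clean_samples.items() if signature in data]


def is_safe_signature(signature, clean_samples):
    """A signature is only usable if it matches nothing in the clean corpus."""
    return not false_positive_hits(signature, clean_samples)
```

Running `false_positive_hits(OLE2_MAGIC, CLEAN_SAMPLES)` flags both clean spreadsheets, so a signature built on that header would be rejected, while `CANDIDATE` passes because it appears in none of the clean samples. A check like this only catches the mistakes the corpus happens to cover, which is part of why the experience described above still matters.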