It’s been a long time since I took the unit on encryption in my fourth-year Telecommunications class at university, but I did quite well in it (I believe I got 5/5 on the assignment). I bring it up because the concept of encryption is relevant to our next section on email authentication: DomainKeys, a method of authentication created by Yahoo.
Before we get into DomainKeys, we first need to understand the basics of encryption (please note that this is not intended to be an extensive tutorial on encryption; I only plan to skim over it). In the olden days, people needed ways of communicating messages securely between one another. At first, they could send their trusted companions on horseback. For example, a king would send a message with his assistant to the general out on the front lines. As time passed, people started sending messages electronically because this was much quicker and you could push through more data in a shorter period of time. But the problem was security: if a sensitive message was intercepted in transit, the secret information was no longer secret.
The idea behind encryption is to encode the contents of the message such that even if the message is intercepted in transit, the person who intercepted it would be unable to read its contents. Consider the following message:
Ifmmp, J bn bo fodszqufe nfttbhf.
This text appears to be a bunch of gobbledygook (hmm, not unlike the hashbusters spammers put at the bottom of some of their messages), but it is actually an example of a substitution cipher. The key is that each letter has been replaced by the letter that follows it in the alphabet; in other words, B is substituted for A, C is substituted for B, and so forth (with Z wrapping around to A). Decrypted, the above message is the following:
Hello, I am an encrypted message.
Different types of substitutions can be used. Above, I shifted each letter by 1 character, but other shifts can be used, like a 3-character substitution or an 11-character substitution. A 3-character substitution of the same message would be the following:
Khoor, L dp dq hqfubswhg phvvdjh.
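Here is a minimal sketch of the two substitutions above in Python. The function name shift_cipher is my own invention for illustration, and I am assuming that only the letters A to Z are shifted while punctuation and spaces pass through untouched, which matches the examples.

```python
def shift_cipher(text: str, shift: int) -> str:
    """Shift every ASCII letter `shift` places along the alphabet, wrapping Z to A.

    Characters that are not letters (spaces, punctuation) are left unchanged.
    A negative shift reverses the substitution, i.e. decrypts.
    """
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)


plaintext = "Hello, I am an encrypted message."
one = shift_cipher(plaintext, 1)    # the 1-character substitution
three = shift_cipher(plaintext, 3)  # the 3-character substitution
back = shift_cipher(one, -1)        # shifting back by 1 recovers the original

print(one)
print(three)
print(back)
```

Notice that decrypting is just encrypting with the opposite shift, so anybody who knows (or guesses) the shift can read the message.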
The thing about substitution ciphers is that they are very easy to break. The more text you have, the more you can use statistical analysis to break the cipher. For example, in the English language, the most common letter is the letter ‘e’. So, you would look for the letter that occurs most often in the encrypted text and guess that it stands for the letter ‘e’. You could then look for a frequently occurring 3-letter word that ends in that letter and guess that its first letter is ‘t’ and its second letter is ‘h’. In this way, you’ve guessed the letters for the word ‘the’. Other commonly occurring consonants are r, s, l and n. Small, two-letter words are likely to be words such as in, of, on, at, it, and so forth. Once you start getting the smaller words you can use a process of elimination to work out the rest of the letters. Sometimes it is a process of trial and error to find the words that fit, but with enough iterations you can do it.
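To make the "look for the most common letter" trick concrete, here is a rough sketch in Python. The names shift_cipher and guess_shift are hypothetical, and the sketch assumes the message was encrypted with a simple fixed shift and that ‘e’ really is the most common letter in the original text. On a short message that assumption can easily fail, so treat the result as a first guess, not a guarantee.

```python
from collections import Counter


def shift_cipher(text: str, shift: int) -> str:
    """Shift every ASCII letter `shift` places along the alphabet, wrapping Z to A."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)


def guess_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most frequent letter stands for 'e'."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    if not letters:
        raise ValueError("ciphertext contains no letters to analyse")
    most_common_letter = Counter(letters).most_common(1)[0][0]
    # Distance from 'e' to the most common ciphertext letter, wrapped to 0..25.
    return (ord(most_common_letter) - ord("e")) % 26


# A sample sentence deliberately stuffed with the letter 'e'.
original = "meet me here where the evergreen trees end"
secret = shift_cipher(original, 3)

guessed = guess_shift(secret)
recovered = shift_cipher(secret, -guessed)
print(guessed)
print(recovered)
```

If the first guess produces nonsense, you would try the next most common letter, which is exactly the trial and error described above.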
Computers are very good at iterating through possibilities to find patterns like this. You can use a more complicated algorithm in the substitution cipher, such as shifting the first letter by one, the second letter by two, the third letter by three, and then repeating the sequence. However, given enough time, a computer could break this algorithm as well. It probably wouldn’t even take very long, because substitution ciphers that work by switching out one letter for another are not complicated to reverse engineer. We need to find a better algorithm.
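As a rough illustration of how little the repeating 1-2-3 scheme helps, here is a sketch in Python that encrypts with a rotating shift and then tries every possible three-shift key. The names rotating_shift and brute_force are made up for this example. I am assuming that only letters advance the sequence, and that the attacker can guess a phrase that appears in the message (here, "encrypted message"); a real attack would more likely lean on letter frequencies instead.

```python
from itertools import cycle, product


def rotating_shift(text: str, shifts, decrypt: bool = False) -> str:
    """Shift successive letters by successive values in `shifts`, repeating the sequence.

    Assumption: only letters consume a value from the sequence; spaces and
    punctuation are copied through unchanged.
    """
    key = cycle(shifts)
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            step = next(key)
            if decrypt:
                step = -step
            out.append(chr((ord(ch) - base + step) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext: str, guessed_phrase: str, key_length: int = 3):
    """Try all 26**key_length keys; keep those whose output contains the guessed phrase."""
    hits = []
    for key in product(range(26), repeat=key_length):
        attempt = rotating_shift(ciphertext, key, decrypt=True)
        if guessed_phrase in attempt.lower():
            hits.append(key)
    return hits


plaintext = "Hello, I am an encrypted message."
secret = rotating_shift(plaintext, (1, 2, 3))
candidates = brute_force(secret, "encrypted message")

print(secret)
print(candidates)
```

There are only 26 × 26 × 26 = 17,576 keys to try, which a computer gets through almost instantly.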