Digital Media Library of Babel


Not too long ago I read an intriguing short story by the Argentinean author Jorge Luis Borges called The Library of Babel (at least that’s the title of the English translation). The story tells of a honeycomb-shaped library filled with books that each contain 410 pages. In fact the library has a copy of every single book of that specific length: since each book is a combination of a fixed set of symbols repeating in different patterns over a fixed number of pages, the number of possible books is finite. Of course, most of the books make absolutely no sense to anyone, but hidden in the library are some books that do make sense (not that anyone in the story ever managed to find one of these). In fact every possible book that makes sense (within the page limits) must be in that library somewhere.


The Library of Babel was first published in 1941, and while the writing style takes a bit of getting used to, it’s still a great read. We like to think of human imagination as being without bounds, but the fact that the results of that imagination can be encoded in a way that is finite and enumerable presents quite a paradox. While Borges’s example was about books, the exact same logic can be applied to a number of other creative pursuits with the help of digital media. Think of all of the text, music, photos and movies you’ve seen on the internet or on your own computer: the photo of your kids, the Beatles singing A Day in the Life, the Mona Lisa or this month’s centerfold. All of these are encoded as a finite length of ones and zeros.


Given some arbitrary size limit, say any file up to 5MB in size, it would be possible to enumerate every possible combination of ones and zeros. Now let’s throw away the overwhelming majority of the files that we can’t make any sense of, and concentrate only on the ones that comply with well-known file formats such as JPEG or WMA. This little collection of ours now contains not only every image and song that has ever been produced, it contains every image and song that could ever be produced, subject to the quality restrictions governed by our chosen file size. So while John Lennon probably thought A Day in the Life was quite an achievement, it turns out the song was there all along. It wasn’t so much written as chosen, in much the same way (and with the same amount of creativity) as I can choose the number 45353459083459034553458!
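To make the enumeration idea concrete, here is a minimal Python sketch that lists every string of ones and zeros up to a tiny size limit. The function name `all_bit_strings` and the 3-bit limit are my own illustrative choices; a real 5MB limit would follow the same logic but could never finish running.

```python
from itertools import product

def all_bit_strings(max_bits):
    """Yield every string of ones and zeros from length 0 up to max_bits."""
    for length in range(max_bits + 1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

# A toy "library" with a 3-bit size limit instead of 5MB.
library = list(all_bit_strings(3))
print(library)
print(len(library))  # 1 + 2 + 4 + 8 = 15 strings in total
```

Each extra bit doubles the number of strings of that length, so a limit of n bits gives 2^(n+1) - 1 strings altogether, counting the empty one.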


While this may be an interesting philosophical exercise, I don’t think the writers, artists and musicians of the world have a lot to be worried about. While it is true that there are a finite number of permutations of ones and zeros up to any given size, the numbers are so big that for all intents and purposes we can consider them to be infinite. Let’s take a much smaller domain to investigate this. The handful of people who may have read or watched The Da Vinci Code will remember the “cryptex” that contains 5 wheels each adorned with the 26 letters. This device has 26^5 or 11,881,376 possibilities. That’s quite a lot of combinations, but it’s hard to say anything very profound in just 5 letters. So let’s try something a bit more interesting: messages that can be constructed using the characters A-Z and a space, up to 30 characters in length. Such a domain includes options such as:



  • I READ THE NEWS TODAY OH BOY
  • AUSTRALIA WINS WORLD CUP
  • MEIN HUT DER HAT DREI ECKEN

Please don’t read any meaning into these messages; they are but random choices from the finite set of options in this domain. But how many choices are there? That would be 27^30 (counting shorter messages as padded out with spaces), or about 8.7280 x 10^42. So how about our 5MB computer files? That’s about 5 million bytes, each containing 8 bits, for a total of 40 million bits, which gives around 2^(40,000,000) possible options. I don’t have the time, expertise or a big enough calculator to convert that to a base 10 power, but it’s got to be one hell of a lot more than the number of atoms in the universe, which is apparently estimated to be around 10^81. Even if you stripped out all of the “invalid” files (which is going to be almost all of them), the number that remains is still going to be mind-boggling.
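For anyone who wants to check the arithmetic, the conversion to a power of ten needs only a logarithm rather than a big calculator. The sketch below is a rough check in Python; the variable names are mine, and the file size is taken as exactly 5,000,000 bytes, which is an assumption.

```python
import math

cryptex = 26 ** 5            # five wheels of 26 letters
messages = 27 ** 30          # 30 positions, each A-Z or a space

file_bits = 5_000_000 * 8    # assumes exactly 5,000,000 bytes per file
# 2**file_bits == 10**(file_bits * log10(2)), so this is the base 10 exponent.
file_exponent = file_bits * math.log10(2)
file_digits = math.floor(file_exponent) + 1

print(f"{cryptex:,} cryptex combinations")
print(f"{messages:.4e} possible messages")
print(f"2^{file_bits:,} is a number with {file_digits:,} decimal digits")
```

In other words, the count of possible files is a number with roughly 12 million digits, against a mere 82 digits for 10^81.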


But even so, it makes you think. If you’re hoping to come up with next year’s #1 hit song, maybe learning the guitar and practising is the wrong approach. Instead, maybe a random number generator, Windows Media Player and a bit of luck is all you really need.
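Just how much luck would be needed can be hinted at with a small experiment. The sketch below generates tiny random "files" and counts how many even begin with the two bytes that open every JPEG file (FF D8). The function names, the file size and the number of trials are all my own assumptions, and passing this check is nowhere near enough to make a valid image, let alone a hit song.

```python
import random

JPEG_START = b"\xff\xd8"  # the two marker bytes that open every JPEG file

def random_file(rng, size):
    """Build a 'file' of `size` random bytes."""
    return bytes(rng.getrandbits(8) for _ in range(size))

def count_lucky_files(trials, size=4, seed=None):
    """Count random files whose first two bytes match the JPEG start marker."""
    rng = random.Random(seed)
    return sum(random_file(rng, size).startswith(JPEG_START) for _ in range(trials))

trials = 200_000
hits = count_lucky_files(trials)
print(f"{hits} of {trials:,} random files even start like a JPEG")
```

On average only about one random file in 65,536 clears even this first hurdle, and every further byte that has to be right makes the odds worse again.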

Comments (2)
  1. Gerd Hollander says:

    I may be your father, but I truly believe this is a fine piece of philosophical, intelligent writing, as was the one on restaurants. It’s good to know that technical people can have coherent and interesting thoughts beyond their field of employment.

  2. Keith Farmer says:

    The problem, though, is in finding such a meaningful sequence in a reasonable time.

    The fact that we can do so despite the random odds against it is a mark of intelligence.  Those that are skilled in finding particularly good sequences, frequently, we can call geniuses.
