I’ve just finished reading Jeff Hawkins’ book, On Intelligence, in which Hawkins develops his theory of how the brain works. In it, he describes what he calls the memory-prediction framework, and outlines how artificial intelligence researchers might leverage ideas from the study of the neocortex to enable the creation of more intelligent machines.
Hawkins speaks from a position of both practical and theoretical authority. He founded Palm Computing and Handspring, and personally developed the handwriting recognizer that ultimately became the basis of the Graffiti text entry system used by Palm handhelds.
His approach is very similar to the one I follow in my thesis (“It’s About Time: Temporal Representations for Synthetic Creatures”), in that it emphasizes the importance of feedback, time, and prediction. Like our work at Synthetic Characters in general, Hawkins is inspired by nature, but has more of a neural and particularly a neocortical emphasis. (Indeed, he “admits” to being a “neocortical chauvinist” early on, and the chapter in which he outlines his core thesis is titled How the Cortex Works. The majority of the Testable Predictions in his Appendix, rather than detailing advances in artificial intelligence, describe communication patterns between cells in particular cortical layers.)
But my interest in his work — and I am very interested in it — comes not from his predictions about the cortex itself, but instead from his proposed approach to artificial intelligence, and how, from his more neurological perspective, he arrives at conclusions very similar to ours from Synthetic Characters (a group which existed under the inspired guidance of Dr Bruce Blumberg at MIT’s Media Lab). Our overarching focus in that group was the design, development and exploration of an extensible architecture for the brains of synthetic creatures.
So, coming from this perspective, here are the big ideas I extracted from On Intelligence.
1. Intelligence isn’t behavior, it’s representation. Prediction is the proof of intelligence.
Hawkins invokes Searle’s classic Chinese Room thought experiment (among other diverse lines of reasoning) to demonstrate that behavioral intelligence is not sufficient. (“You can be intelligent just lying in the dark, thinking and understanding.”) (p29)
He offers an intriguing analysis of Turing’s famous test for computer intelligence, noting how significantly it differs from what we generally consider a valid test metric for human competence. IQ tests are very frequently based on our ability to make predictions: given a sequence of numbers, what should the next be? Given 3 views of a complex object, which of these is another view? “Cat” is to “kitten” as “dog” is to what? (p96)
He goes so far as to say that this is where Turing went wrong: prediction, not behavior, is the proof of intelligence.
2. Prediction is the primary function of the neocortex. All behavior is a byproduct of prediction.
3. The neocortex is uniform.
From this, Hawkins proceeds to conclude there is a “single, powerful algorithm implemented by every region of cortex.” (p55) That algorithm is capable of processing all kinds of sensory input. That is, perceptual processing is uniform across the senses.
4. The neocortex stores sequences of patterns.
He cites many examples (see around p70) from human experience that suggest that memories are stored as sequences. (You know the alphabet; try reciting it backwards. Or, visualize a room in your house — and then see how much more vividly you can recall it if you imagine walking through the front door, looking around, and walking to that room. Think of a tune that you know, any tune, and note how you can’t imagine the entire song at once; only in sequence.)
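The alphabet example can be sketched in code. This is my own toy illustration, not a model from the book: if a memory stores only the links between successive items, recall is effortless forward but requires a search at every step backward.

```python
# Toy illustration (not Hawkins' model): a memory that stores only
# "what comes next" links between successive items, so recall is
# cheap forward but slow backward -- like reciting the alphabet.

def store_sequence(seq):
    """Store forward transitions only, as a dict of next-item links."""
    return {a: b for a, b in zip(seq, seq[1:])}

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
nxt = store_sequence(alphabet)

def recite_forward(start, memory):
    out = [start]
    while out[-1] in memory:          # each step is a single lookup
        out.append(memory[out[-1]])
    return "".join(out)

def recite_backward(end, memory):
    out = [end]
    while True:
        # No stored backward link: we must scan the whole memory for
        # the predecessor at every step -- slow and effortful.
        prev = [a for a, b in memory.items() if b == out[-1]]
        if not prev:
            break
        out.append(prev[0])
    return "".join(out)
```

Forward recital is one dictionary lookup per letter; backward recital scans all 25 links per letter, which is roughly why Z-Y-X feels so much harder than A-B-C.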
5. The patterns stored in the neocortex can be recalled in an auto-associative way.
Given only partial (or distorted) spatial and temporal patterns, our brains fill in the rest.
6. The patterns stored in the neocortex must be stored using an invariant representation.
A friend is identifiable regardless of the angle from which you’re observing her. Songs can be recalled independent of their key (and unless you have perfect pitch, you can’t possibly determine what key you first heard it in). In my motor cortex, I have an invariant representation of my autograph. Hawkins suggests that this abstraction is a general property of the neocortex — memories are stored in a form that “captures the essence of relationships, not the details.” (p82)
7. Perceptual processing is hierarchical.
Since the cortex is uniform, the processing that occurs at each level of that hierarchy must be the same. Hawkins expounds upon the importance of invariant representations that can be formed at multiple levels of the perceptual hierarchy.
(As an aside:) In wetware, everything has to be accomplished in remarkably few serial steps.
I include this point here because I’d never really grasped its significance before reading On Intelligence. Hawkins refers to this constraint as the “one hundred-step rule.” (p66)
A human can perform significant tasks in much less time than a second. But our neurons are slow, so that in half a second, information entering the brain can only traverse a chain approximate 100 neurons long. Therefore, regardless of how many parallel operations are being performed, a maximum of roughly 100 serial calculations can be involved in the brain’s computation! Isn’t that profound?! As Hawkins points out, 100 serial computer operations are barely enough to move a single character on the display, let alone do something interesting!
On Intelligence is a popular, rather than scientific, account, and as is typical in that genre, Hawkins is not particularly thorough when discussing prior work. He is very dismissive of the artificial intelligence community’s consideration of temporal data and feedback (“I don’t want to imply that everyone has ignored time and feedback,” he says on page 32, after devoting the better part of a chapter to doing just that.) He does cite work on auto-associative memories as one counterexample that “hinted at the potential importance of feedback.” But his survey of past work has a heavy emphasis on neural networks, and it doesn’t mention, for example, the planning community, where feedback, time and prediction are all fundamental.
Some of his conclusions, and his philosophical consideration of questions like “what is creativity?” read remarkably close to elements of my thesis. “Creativity is an inherent property of every cortical region. It is a necessary component of prediction,” he writes. Yes! He defines Creativity as “making predictions by analogy” and notes that this is (a) something that occurs everywhere in cortex and (b) something you do continually while awake.
This was precisely one of the most interesting things to fall out of my attempt to add a temporal predictive learning mechanism to our layered brain architecture. The learning mechanism for my creatures couldn’t exist without what I called at the time a “Creativity Drive,” which would propel the creatures to test the validity of salient “hypotheses” they had in the world, which were represented by Predictors, their fundamental representation of causality. Hawkins would be pleased to know that my Predictors stored sequences of patterns from all types of sensory input in an invariant way. They are remarkably akin to his cortical columns (p154), but I loved his concept of re-membering by flowing up and flowing down the hierarchy (p159) and have spent a good deal of time since reading his book considering how I would have implemented the same phenomenon. (Here is my best attempt to describe my Time-Rate learning mechanism outside of my thesis. Like On Intelligence, it is also a popular account. If you’re interested in the Synthetic Characters work, you should also check out the spatial prediction research that my colleague Damian Isla was developing concurrently.)
Hawkins occasionally stretches to conclusions that don’t necessarily follow in my mind. His subtitle predicts the creation of “truly intelligent machines,” and I suppose that in a popular account grandiose claims are to be expected (and perhaps encouraged). For example, he eventually asserts that “perception and behavior are almost one and the same” (p157), which, even in context on that page, I found to be a gross oversimplification, or at least, a clear indication of his self-professed neocortical chauvinism!
Hawkins freely concedes that many of the ideas in the book aren’t new. But the value he adds through this work is immense, and twofold:
First, in this lively popular account, he offers a range of vivid examples that engaged my imagination in a variety of ways. I found myself thinking new thoughts not just about AI, but also about the inner workings of my memory and my own creative process.
More importantly, the memory-prediction model is a remarkable framework for capturing the essence of a promising representation. Hawkins had me hooked at “the importance of feedback, time, and prediction.” His emphasis on the neocortex leads him to many interesting insights, provides focus to his work, and offered me personally a considerably new perspective.
If you’re not new to the field, come prepared with a thick skin, more than a few grains of salt, and a willingness to ride through a popular account. But this is highly recommended stuff that has single-handedly re-awakened my passion for AI.
As for his ideas, I predict you haven’t heard the last of the memory-prediction model. And at a time when I thought I was loosing the shackles of the “biological plausibility” that influenced (and permeated) our thinking in the Synthetic Characters group, I find myself once again looking inwards towards the cortex for inspiration.