Or, Why We Might Not Understand Intelligence Before Creating It
With the rapid popularization of generative AI in the wake of ChatGPT, I’ve started to see more and more talk about the potential for human-level AI, or AGI. One opinion I’ve seen on several occasions, even from some famous AI researchers and critics, is that AGI is currently infeasible and “far away” because we do not understand intelligence. The argument goes something like this: “we cannot really define what intelligence is, and we do not currently understand the principles underlying intelligence; therefore we cannot create intelligence. Thus AGI is far away, or even impossible to create.” I believe this is the wrong conclusion to draw from the premise that we don’t understand intelligence, based both on historical examples and on the lessons of the current deep learning paradigm.
History has many examples of technologies that were created without their underlying principles being known in advance. The Wright brothers built the first powered airplane without knowing the exact principles that made flight possible. In fact, we arguably still have a poor understanding of what actually keeps airplanes in the air, at least according to this article (my knowledge of this area of physics is too poor to evaluate the claim properly, though).
Another historical example is the development of the first nuclear bomb. Was the idea of a powerful bomb based on nuclear chain reactions silly when Szilárd expressed it in the 1930s, or when H.G. Wells wrote about it in 1914? At the time, the exact principles that would make the bomb possible were unknown, yet they, and others, took the idea seriously, and everyone who did was better prepared for the world that came after the Manhattan Project.
History is not our only teacher: I believe the entire deep learning paradigm has shown that we do not need to understand a specific capability in order to train it into a deep learning model. It is not as if we understand the principles underlying how humans generate language, yet humanity has trained thousands of models that are capable of generating language.
The lesson that deep learning has taught me, and I suspect others as well, is that running stochastic gradient descent (SGD) on a large model¹ opens up parts of the latent model space that are completely alien to humans. The methods used in deep learning often have no analog in the human brain, the origin of intelligence, despite our using names like neurons and neural networks to describe the models. Backpropagation, the method used to update parameter weights, does not seem to be used in human brains (see e.g. Backpropagation and the Brain), yet it is crucial to creating capabilities that humans also have.²
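For readers unfamiliar with the mechanics, here is a minimal sketch of the SGD update rule itself, on a toy one-parameter problem; the loss function and every name here are illustrative, not drawn from any real model:

```python
# Toy gradient-descent loop: minimize loss(w) = (w - 3)^2,
# whose gradient is dL/dw = 2 * (w - 3). Deep learning applies this
# same update rule to billions of weights at once, with the gradients
# computed by backpropagation instead of derived by hand.

def grad(w):
    return 2.0 * (w - 3.0)  # hand-derived gradient of the toy loss

w = 0.0    # initial parameter weight
lr = 0.1   # learning rate (step size)
for _ in range(100):
    w -= lr * grad(w)  # the SGD update: w <- w - lr * dL/dw

print(round(w, 4))  # w has converged to the minimum at 3.0
```

Nothing in this loop encodes an understanding of *why* the resulting weights work; it only nudges them toward lower loss, which is part of what makes the learned solutions so opaque to us.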
In a not-very-literal sense, I view this process as us humans binding an alien intelligence to a GPU/TPU, an alien intelligence that has always existed in the latent space.³ What SGD and deep learning do is bind that intelligence to the silicon and express it through the parameter weights. We trap it so that we can use it for the tasks we want done.
Even though I’m half-joking here, the idea is that human intelligence may not tell us very much about intelligence in general, or about other ways of creating intelligence. Just because human intelligence is the only intelligence we know of does not mean that it is the only possible intelligence.
I agree with the skeptics: we do not understand human intelligence. But that is not a necessary condition for binding intelligent artificial aliens to our silicon chips.
1. Where a larger model, or rather more compute, lets you probe the model space more deeply.
2. I’m being a bit uncharitable here, because it is entirely possible that intelligence is independent of the way it is implemented or created, and the people I’m criticizing likely know and believe that. I’m just trying to make a point about the danger of assuming that humans are the only possible model for these things.
3. If this idea tickles your curiosity, you should read Permutation City by Greg Egan.