RIP Neural Networks

Neural networks seem to be everywhere these days, driving our cars and showing us how to play Go. They are used for most of the tasks which people regard as AI, and are at the forefront of AI research, but would you trust one to make critical business decisions for you?

Perhaps we should start with why neural nets are so popular as an AI mechanism. We all know that computers are much better than people at things like dividing 114 by 3, but when it comes to simple stuff such as identifying your friend in a crowd or understanding a story, people still beat computers hands down.

The problem is, we don’t really understand exactly how we do this stuff; it all just seems to happen. So how do we train a computer to do it? The obvious approach is to mimic what happens in nature. The human brain is basically lots of neurones, each connecting to many other neurones across small gaps called synapses. Although no-one knows the exact model, knowledge seems to be encoded by modifying the strength and number of connections each neurone has with others. We can reproduce something similar in a computer, and this is where neural networks came from. We have greatly improved our understanding of how to build neural networks over the last 20 years; however, one big problem remains: trust.
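To make that analogy a little more concrete, here is a minimal sketch of a single artificial neurone in plain Python. The numbers are purely illustrative; the point is that each weight plays the role of a synapse strength, and training amounts to adjusting those weights.

```python
import math

def artificial_neurone(inputs, weights, bias):
    # Each weight acts like a synapse strength, scaling the signal
    # arriving from another neurone.
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Squash the result into the range 0..1, loosely analogous to how
    # strongly the neurone 'fires'.
    return 1.0 / (1.0 + math.exp(-activation))

# Illustrative values only: three incoming signals and their connection strengths.
signals = [0.5, 0.9, 0.1]
strengths = [0.8, -0.4, 1.2]
print(artificial_neurone(signals, strengths, bias=0.1))
```

A real network wires millions of these together, and it is the overall pattern of weights, not any single value, that encodes what the network ‘knows’ – which is precisely why reading meaning back out of one is so hard.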

Looking at how nature does things is often an excellent strategy, from creating new drugs for treating disease to designing ultra-strong, lightweight materials based on spider silk or bone. The problem with applying this strategy to Artificial Intelligence is that if we don’t know how a body of knowledge is encoded, how can we explain decisions made on the basis of it? There is a growing emphasis on creating neural networks which can try to explain why they made a particular decision, but it seems that the more explainable we make this sort of network, the lower the accuracy of its output becomes. Currently these ‘traceable’ networks are only large enough to model very simple problem spaces – such as identifying the relationships between objects. The more we make the network capable of saying why it thinks object A is in front of object B, the less likely it is to give a correct answer in the first place. If we modify the architecture of a neural net to try to make its decisions explainable, we are compromising its ability to make those decisions.

It all comes down to the initial problem that we don’t really understand the underlying mechanism. If this is true in small models at the cutting edge of current research, how are we going to scale it to a more general-purpose AI, say one capable of making real-world business decisions? If we can’t trust AI, we are going to struggle to give it responsibility for important stuff. If we can’t apply it to important stuff, then maybe we are at a dead end.

In building larger and more complex neural networks, we are getting better and better at building a framework to support emergent behaviour, without taking the time to understand the form that emergence takes. At present we run networks on the order of 16 million nodes on our supercomputers. That is roughly equivalent to a frog brain. To approach a human brain we would need to scale that not by ten times, or a hundred times, but by five thousand times. If we struggle to balance accuracy with traceability in networks much smaller than our largest, how big is the problem going to get as we scale? Neural networks are far from dead; they are, and will remain, an important tool, but I think it is time to consider their limitations, and to think about what other approaches we might take.
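For anyone who wants to check that five-thousand-times figure, here is the back-of-the-envelope arithmetic. The roughly 86 billion neurone count for a human brain is a commonly cited estimate rather than a number from the discussion above.

```python
# Rough sanity check on the scale factor quoted above.
frog_scale_nodes = 16_000_000        # nodes in today's largest simulated networks
human_neurones = 86_000_000_000      # commonly cited estimate for a human brain

scale_factor = human_neurones / frog_scale_nodes
print(f"Scale-up needed: roughly {scale_factor:,.0f}x")  # prints roughly 5,375x
```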