Transformers Can Mimic Part of the Human Brain

In recent years, neuroscientists have tried many types of neural networks to model the firing of neurons in the human brain. In a recent project, researchers Whittington and Behrens found that the hippocampus, a brain structure critical to memory, behaves like a particular kind of artificial neural network called a transformer.

The fact that we know these models of the brain are equivalent to the transformer means that our models perform much better and are easier to train.

The statement comes from Whittington, who added that transformers can greatly improve a neural network's ability to mimic the brain's behavior and computations. David Ha of Google Brain said that the goal is not to recreate the brain, but to create a mechanism that can do what the brain does.

Transformers rely on a self-attention mechanism in which every input is always connected to every other input. An input can be a word, a pixel, or a number in a sequence. In other kinds of neural networks, by contrast, only certain inputs are connected to other inputs. Transformers first appeared five years ago, in 2017, and went on to power language models such as BERT and GPT-3, a revolutionary new way for AI to process language.
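To make the "every input is connected to every other input" idea concrete, here is a minimal sketch of self-attention in plain Python. It is an illustration under simplifying assumptions, not the model used by Whittington and Behrens: the queries, keys and values are the raw inputs themselves, whereas real transformers apply learned projections and use multiple attention heads. The names `softmax`, `self_attention` and `tokens` are hypothetical and chosen only for this example.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(inputs):
    """Toy self-attention over a list of equal-length vectors.

    Assumption: queries, keys and values are the inputs themselves
    (no learned projection matrices, single head).
    """
    d = len(inputs[0])
    outputs, weights = [], []
    for q in inputs:
        # Score this input against every input, itself included.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in inputs
        ]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted mix of all input vectors.
        outputs.append([
            sum(w[j] * inputs[j][c] for j in range(len(inputs)))
            for c in range(d)
        ])
    return outputs, weights

# Three toy "tokens", each a 2-dimensional vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of `weights` is a probability distribution over all inputs, and every entry is strictly positive, so each output draws on every input to some degree.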

Whittington and Behrens tweaked the Hopfield network approach, modifying the transformer so that it encodes memories as coordinates in higher-dimensional spaces rather than as linear sequences, as Hopfield and Dmitry Krotov did at the MIT-IBM Watson AI Lab. The two researchers also showed that the model is mathematically equivalent to models of the grid cell firing patterns that neuroscientists see in fMRI scans. Behrens described transformers as another step toward better understanding the brain and building an accurate model of it, rather than the end of the quest.

I have got to be a skeptic neuroscientist here, I don’t think transformers will end up being how we think about language in the brain, for example, even though they have the best current model of sentences.

Schrimpf, a computational neuroscientist at MIT, analyzed 43 different neural net models to see how well they predicted human neural activity as measured by fMRI and electrocorticography. He noted that even the best-performing transformers worked well with words or short sentences but not with larger-scale language tasks. As he put it:

My sense is that this architecture, this transformer, puts you in the right space to understand the structure of the brain, and can be improved with training. This is a good direction, but the field is super complex.

About the Author

Claudio Masolo
