Abstract
Deep feedforward neural network models of vision dominate in both computational neuroscience and engineering. However, the primate visual system contains abundant recurrent connections. Recurrent signal flow enables the recycling of limited computational resources over time, and so might boost the performance of a physically finite brain. In particular, recurrence could improve performance in vision tasks. Here we find that recurrent convolutional networks outperform parameter-matched feedforward convolutional networks on large-scale visual recognition tasks. Moreover, recurrent networks can trade off accuracy for speed, balancing the cost of error against the cost of a delayed response (and the cost of greater energy consumption). We terminate recurrent computation once the output probability distribution has concentrated sufficiently, as indicated by its entropy falling below a predefined threshold. Trained by backpropagation through time, recurrent convolutional networks resemble the primate visual system in terms of their speed-accuracy trade-off behaviour. Moreover, their learned lateral connectivity patterns are consistent with those observed in primate early visual cortex. These results suggest that recurrent models are preferable to feedforward models of vision, both in terms of their performance at vision tasks and their ability to explain biological vision.
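The entropy-based termination rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `step` function stands in for one time step of a recurrent convolutional network, and the names, threshold value, and toy model are all assumptions introduced here for clarity.

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def run_until_confident(step, x, threshold, max_steps=10):
    """Run a recurrent model one time step at a time, stopping early
    once the output distribution's entropy falls below `threshold`.

    `step` is a hypothetical callable mapping (input, state) to
    (class probabilities, new state); in the paper this role is played
    by a recurrent convolutional network trained with backpropagation
    through time. Returns the final probabilities and the number of
    time steps used, realising a speed-accuracy trade-off: easy inputs
    terminate early, hard inputs receive more computation.
    """
    state = None
    for t in range(1, max_steps + 1):
        probs, state = step(x, state)
        if entropy(probs) < threshold:
            break  # distribution is concentrated enough: respond now
    return probs, t

# Toy recurrent "model" whose confidence grows with each time step.
def toy_step(x, state):
    t = (state or 0) + 1
    p = 1 - 0.5 ** t  # probability mass concentrates on class 0 over time
    return [p, 1 - p], t
```

With a threshold of 0.6 bits, `run_until_confident(toy_step, None, 0.6)` iterates until the two-class distribution sharpens past the entropy criterion, illustrating how a lower threshold buys accuracy at the cost of more time steps.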
Author summary Deep neural networks (DNNs) provide the best current models of biological vision and achieve the highest performance in computer vision. Although originally inspired by the primate brain, these models still lack important functional elements of their biological counterparts. One biological feature typically absent from models of visual object recognition is the ability to recycle limited neural resources by processing information recurrently. We report that including connections that let information flow in cycles can improve performance, even when the total number of connections is held constant. Recurrent processing also enables DNNs to behave more flexibly and trade off speed for accuracy. Like the primate brain, the networks can compute for longer to boost accuracy on objects that are more difficult to recognise. This work shows how a known feature of the primate brain contributes to its computational function and suggests that taking inspiration from biology can help us further improve artificial vision systems.
Footnotes
Fixes a typographical error in figure 4.