Abstract
An open question in systems neuroscience is which objective function (or computational “goal”) best describes the computations performed by the ventral stream (VS) of primate visual cortex. Substantial past research has suggested that object categorization could be such a goal. Recent experiments, however, showed that information about object positions, sizes, etc. is encoded with increasing explicitness along this pathway. Because that information is not necessarily needed for object categorization, this motivated us to ask whether primate VS may do more than “just” object recognition. To address that question, we trained deep neural networks, all with the same architecture, with three different objectives: a supervised object categorization objective; an unsupervised autoencoder objective; and a semi-supervised objective that combined autoencoding with categorization. We then compared the image representations learned by these models to those observed in areas V4 and IT of macaque monkeys using canonical correlation analysis (CCA). We found that the semi-supervised model provided the best match to the monkey data, followed closely by the unsupervised model, and more distantly by the supervised one. These results suggest that multiple objectives – including, critically, unsupervised ones – might be essential for explaining the computations performed by primate VS.
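To make the comparison step concrete, the following is a minimal sketch of how a CCA-based similarity score between model features and neural responses could be computed. It is not the authors' pipeline: the abstract does not specify the CCA variant, preprocessing, or scoring used, so this sketch assumes plain linear CCA (computed via QR decomposition and SVD) and takes the mean canonical correlation as the score. The function name `canonical_correlations`, the array names, and the synthetic data are all hypothetical.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between X (n x p) and Y (n x q), in descending order.

    Assumes n > max(p, q) and that both matrices have full column rank
    after centering (an assumption of this sketch, not of the paper).
    """
    # Center each feature across samples (images).
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of X and Y.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic stand-ins (hypothetical): rows are images, columns are units/neurons.
rng = np.random.default_rng(0)
n_images = 200
latent = rng.normal(size=(n_images, 5))
model_features = latent @ rng.normal(size=(5, 10)) + 0.1 * rng.normal(size=(n_images, 10))
neural_responses = latent @ rng.normal(size=(5, 8)) + 0.1 * rng.normal(size=(n_images, 8))
unrelated_features = rng.normal(size=(n_images, 10))

# Mean canonical correlation as a simple similarity score (an assumed choice).
score_shared = canonical_correlations(model_features, neural_responses).mean()
score_unrelated = canonical_correlations(unrelated_features, neural_responses).mean()
print(f"shared latent structure: {score_shared:.3f}")
print(f"unrelated features:      {score_unrelated:.3f}")
```

In this toy setting, features that share latent structure with the simulated responses score higher than unrelated features. Comparing real network layers to V4 or IT recordings would typically also require dimensionality reduction and held-out evaluation, which this sketch omits.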
Competing Interest Statement
The authors have declared no competing interest.