Abstract
Deep neural networks (DNNs) are promising models of the cortical computations supporting human object recognition. However, although they explain a significant portion of variance in neural data, the agreement between model and brain representational dynamics is far from perfect. Here, we address this issue by asking which representational features remain unaccounted for in neural time-series data, estimated for multiple areas of the human ventral stream from source-reconstructed magnetoencephalography (MEG) recordings. In particular, we focus on the ability of visuo-semantic models, consisting of human-generated labels of higher-level object features and categories, to explain variance beyond that explained by DNNs alone. We report a gradual transition in the importance of visuo-semantic features from early to higher-level areas along the ventral stream: while early visual areas are better explained by DNN features, higher-level cortical dynamics are best accounted for by visuo-semantic models. These results indicate that current DNNs fail to fully capture the visuo-semantic features represented in higher-level human visual cortex and suggest a path towards more accurate models of ventral stream computations.
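The abstract's central comparison, whether visuo-semantic labels explain neural variance beyond DNN features alone, can be illustrated with a nested-regression variance partition. The sketch below is not the authors' analysis pipeline; it is a minimal, self-contained illustration using simulated data, in which all array names, dimensions, and the simulated "neural" response are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: rows = stimuli, columns = predictors (assumed shapes).
n_stimuli = 200
dnn_feats = rng.normal(size=(n_stimuli, 10))   # DNN-derived feature predictors
sem_feats = rng.normal(size=(n_stimuli, 5))    # human-generated visuo-semantic labels
# Simulated neural response depending on both predictor sets plus noise.
neural = (dnn_feats @ rng.normal(size=10)
          + sem_feats @ rng.normal(size=5)
          + rng.normal(size=n_stimuli))

def r_squared(X, y):
    """In-sample R^2 of y regressed on X via ordinary least squares (with intercept)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

r2_dnn = r_squared(dnn_feats, neural)                                # DNN features alone
r2_full = r_squared(np.column_stack([dnn_feats, sem_feats]), neural)  # DNN + semantic
unique_sem = r2_full - r2_dnn  # variance uniquely attributable to semantic labels
print(f"DNN-only R^2: {r2_dnn:.3f}, full R^2: {r2_full:.3f}, unique semantic: {unique_sem:.3f}")
```

A positive `unique_sem` for a given region and time point corresponds to the abstract's claim that visuo-semantic features carry explanatory power beyond DNNs; in practice such analyses would use cross-validation rather than in-sample R^2 to avoid overfitting-driven gains.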
Competing Interest Statement
The authors have declared no competing interest.