Abstract
The human visual cortex processes visual stimuli hierarchically. Early visual areas (V1, V2) of the ventral visual stream feed crude visual features (such as orientation and edges) into later visual areas (V4, lateral occipital (LatOcc), inferior temporal (IT)), which then encode complex visual features (such as object form). Previous studies have reported a difference in fMRI responses between natural and urban landscapes in certain parts of the brain. Here we asked whether this distinction in the representation of complex natural and man-made visual stimuli extends to the ventral visual stream, and whether state-of-the-art convolutional neural networks (CNNs) provide a map to this categorical distinction. To assess this, we used an open-source fMRI data set of V1-V4 and LatOcc BOLD responses to 1750 passively viewed natural and man-made grayscale images. The same images were fed into the pre-trained CORnet-S, a CNN designed to model hierarchical human visual processing layer-wise. To identify differences in representations within and between the human visual cortex and the CNN, we computed representational dissimilarity matrices. BOLD responses to the man-made and natural images show correlation differences between the two categories, and these differences increase across V4 and LatOcc. Such differences were also observed in the V4 and IT layers of CORnet-S. Our results suggest that the human visual cortex processes natural and man-made images differently starting from V4, and that this representational difference is modeled in CORnet-S. In both, categorical representation is progressively established in the later processing stages. This may indicate that the brain has developed two distinct systems for representing these categories, possibly shaped by evolution. Further analysis may elucidate the role of evolution in this distinction.
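The representational dissimilarity analysis described above can be illustrated with a minimal sketch. The abstract does not state which dissimilarity measure was used, so the code below assumes 1 minus the Pearson correlation between response patterns, a common choice. The function names `compute_rdm` and `category_contrast` are hypothetical, and the data are synthetic stand-ins, not the study's fMRI responses or CORnet-S activations.

```python
import numpy as np


def compute_rdm(responses):
    """Return an (n_stimuli, n_stimuli) dissimilarity matrix.

    responses: array of shape (n_stimuli, n_features), e.g. voxels of one
    brain area or units of one network layer.
    Dissimilarity is ASSUMED to be 1 - Pearson r between stimulus patterns.
    """
    return 1.0 - np.corrcoef(responses)


def category_contrast(rdm, labels):
    """Mean between-category minus mean within-category dissimilarity.

    Only the upper triangle (excluding the diagonal) is used so that each
    stimulus pair is counted once. Larger values mean the two categories
    are more separated in this representation.
    """
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    upper = np.triu(np.ones(rdm.shape, dtype=bool), k=1)
    within = rdm[same & upper].mean()
    between = rdm[~same & upper].mean()
    return between - within


# Synthetic example: two categories, each built around its own prototype.
rng = np.random.default_rng(0)
n_per_category, n_features = 20, 50
proto_a, proto_b = rng.normal(size=(2, n_features))
responses = np.vstack([
    proto_a + 0.5 * rng.normal(size=(n_per_category, n_features)),
    proto_b + 0.5 * rng.normal(size=(n_per_category, n_features)),
])
labels = [0] * n_per_category + [1] * n_per_category  # 0/1 are placeholder category codes

rdm = compute_rdm(responses)
contrast = category_contrast(rdm, labels)
print(rdm.shape, contrast)
```

Applying `category_contrast` to one matrix per brain area or network layer would give one way to check whether category separation grows along the hierarchy, under the assumptions stated above.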
Competing Interest Statement
The authors have declared no competing interest.