Abstract
Understanding the neural representation of spatial frequency (SF) in the primate cortex is vital for unraveling the visual processing mechanisms that underlie object recognition. While numerous studies concentrate on the representation of SF in the primary visual cortex, the characteristics of SF representation and its interaction with category representation remain poorly understood. To explore SF representation in the inferior temporal (IT) cortex of macaque monkeys, we conducted extracellular recordings with complex stimuli systematically filtered by SF. Our findings reveal explicit SF coding at the single-neuron and population levels in the IT cortex. Moreover, the coding of SF content exhibits a coarse-to-fine pattern, declining as SF increases. Temporal dynamics analysis of SF representation reveals that low SF (LSF) is decoded faster than high SF (HSF), and that SF preference shifts dynamically from LSF to HSF over time. Additionally, the SF representation of each neuron forms a profile that predicts category selectivity at the population level. IT neurons can be clustered into four groups based on SF preference, each exhibiting different category coding behavior. In particular, HSF-preferring neurons demonstrate the highest category decoding performance for face stimuli. Despite this connection between SF and category coding, we identified uncorrelated representations of SF and category. In contrast to category coding, SF coding is sparser and relies more heavily on the representations of individual neurons. Comparing SF representation in the IT cortex with that in deep neural networks, we observed no relationship between SF representation and category coding. However, SF coding, as a category-orthogonal property, is evident across various ventral stream models. These results dissociate the representations of SF and object category, underscoring the pivotal role of SF in object recognition.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
The revised manuscript includes several key improvements: an analysis with a longer exposure time (200 ms) to address contrast sensitivity concerns, efforts to improve category selectivity, matching of stimulus power in terms of spatial frequency (SF) and category, and a discussion of how convolutional neural networks (CNNs) do not capture the complexities of SF processing in the IT cortex. These revisions strengthen the findings on SF bias in IT neurons and provide a more comprehensive analysis of category coding and stimulus processing.





