TY  - JOUR
T1  - Many but not all deep neural network audio models capture brain responses and exhibit hierarchical region correspondence
JF  - bioRxiv
DO  - 10.1101/2022.09.06.506680
SP  - 2022.09.06.506680
AU  - Greta Tuckute
AU  - Jenelle Feather
AU  - Dana Boebinger
AU  - Josh H. McDermott
Y1  - 2022/01/01
UR  - http://biorxiv.org/content/early/2022/11/05/2022.09.06.506680.abstract
N2  - Models that predict brain responses to stimuli provide one measure of understanding of a sensory system, and have many potential applications in science and engineering. Stimulus-computable sensory models are thus a longstanding goal of neuroscience. Deep neural networks have emerged as the leading such predictive models of the visual system, but are less explored in audition. Prior work provided examples of audio-trained neural networks that produced good predictions of auditory cortical fMRI responses and exhibited correspondence between model stages and brain regions, but left it unclear whether these results generalize to other neural network models, and thus how to further improve models in this domain. We evaluated brain-model correspondence for publicly available audio neural network models along with in-house models trained on four different tasks. Most tested models out-predicted previous filter-bank models of auditory cortex, and exhibited systematic model-brain correspondence: middle stages best predicted primary auditory cortex while deep stages best predicted non-primary cortex. However, some state-of-the-art models produced substantially worse brain predictions. The training task influenced the prediction quality for specific cortical tuning properties, with best overall predictions resulting from models trained on multiple tasks. The results suggest the importance of task optimization for explaining brain representations and generally support the promise of deep neural networks as models of audition. Competing Interest Statement: The authors have declared no competing interest.
ER  -