ABSTRACT
The perception of natural auditory textures, such as fire, wind, and rain, has been proposed to arise through the integration of time-averaged summary sound statistics. Where and how the auditory system might encode these statistics to create internal representations of these stationary sounds, however, is unknown. Here, using natural textures and synthetic variants with reduced statistics, we show that summary statistics modulate the correlations between frequency-organized neuron ensembles in the awake rabbit inferior colliculus. These neural correlations capture high-order sound structure and allow for accurate decoding in a single-trial recognition task, with evidence accumulation times approaching 1 s. In contrast, the average activity across the population (the neural spectrum) provides a fast (tens of ms) and salient signal that contributes primarily to texture discrimination. In studies with human listeners we find analogous trends: the sound spectrum serves as a fast and salient discrimination cue, while high-order statistics have long integration times and contribute substantially more to recognition. These findings suggest that spectrum- and correlation-based sound cues are represented by distinct auditory midbrain response statistics, and that these statistics may have dissociable roles and time scales for the recognition and discrimination of sounds.
Footnotes
Funding: This work was supported by the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health under award R01DC015138. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or NSF. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. HLR has ownership interest in Elemind Technologies, Inc., and this private company did not sponsor this research.