Trends in Neurosciences
Review
Efficient Neural Coding in Auditory and Speech Perception
Section snippets
The Relevance of Efficient Neural Coding for Speech Perception
Speech has long been recognized as ‘special’ [1, 2, 3, 4, 5, 6]. We prefer it over other sounds from birth onwards [6], and we are able to make fine-grained discriminations that allow us to convey an infinite number of messages. The special status of speech has been studied from a variety of perspectives. Researchers of social cognition approach it as our species-specific communicative signal, and as the basis of learning and cultural transmission [7, 8]. Others have claimed that speech is special
The Statistical Structure of Sounds
To test whether the mammalian auditory system codes sound in a mathematically optimal way, it is first necessary to describe the statistical structure of sounds. The space (in the mathematical sense) of all potential sounds is vast (Box 1). Within this space, natural sounds, including speech, comprise a compact yet multi-dimensional subspace. Analyses of statistical regularities in natural sounds have identified several prominent features. The temporal structure of many natural environmental
Nonredundant, Optimal Mathematical Models of Sounds
According to the efficient coding hypothesis, the brain has evolved to efficiently process and respond to stimuli that occur in nature, reducing redundancy in their neural representations [9]. This principle posits that the statistical properties of neuronal responses should match the statistical structure of natural stimuli, and should maximize the efficiency of representation [10, 33]. This is best achieved if neuronal responses constitute a sparse, nonredundant code, meaning that the code
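As a rough illustration of what reducing redundancy means, the sketch below takes two simulated channels that share a common signal and rotates them onto their principal axes so that they become uncorrelated. This is a minimal sketch and not the method of any study cited here: the simulated data, the noise level, and the function names `correlation` and `decorrelate` are all assumptions made for illustration, and decorrelation removes only second-order (linear) redundancy, which is weaker than the sparse, nonredundant code described above.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    cxx = sum((a - mx) ** 2 for a in x)
    cyy = sum((b - my) ** 2 for b in y)
    return cxy / math.sqrt(cxx * cyy)


def decorrelate(x, y):
    """Rotate two channels onto their principal axes (2-D PCA)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xc = [a - mx for a in x]
    yc = [b - my for b in y]
    cxx = sum(a * a for a in xc) / n
    cyy = sum(b * b for b in yc) / n
    cxy = sum(a * b for a, b in zip(xc, yc)) / n
    # Rotation angle that zeroes the covariance between the rotated channels.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    u = [c * a + s * b for a, b in zip(xc, yc)]
    v = [-s * a + c * b for a, b in zip(xc, yc)]
    return u, v


# Hypothetical data: two "neurons" driven by the same stimulus plus private noise.
rng = random.Random(0)
shared = [rng.gauss(0, 1) for _ in range(2000)]
ch1 = [s + 0.3 * rng.gauss(0, 1) for s in shared]
ch2 = [s + 0.3 * rng.gauss(0, 1) for s in shared]

u, v = decorrelate(ch1, ch2)
r_before = correlation(ch1, ch2)  # high: both channels repeat the shared signal
r_after = correlation(u, v)       # near zero: the rotated channels do not overlap linearly
print(r_before, r_after)
```

In this toy case the two original channels largely duplicate each other, whereas after rotation one channel carries the shared signal and the other carries only the residual difference.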
Sounds with Naturalistic Statistics are Special for the Mammalian Auditory System
According to the efficient coding hypothesis, identifying the statistical dependencies in the structure of sounds yields insight into the structure of the neuronal code. This was tested by constructing artificial codes that were optimized according to some set of constraints to best represent natural sounds, and then comparing them to experimental measurements of responses of neurons in the auditory pathway. Such advanced mathematical models were, for instance, used to better understand the structure
Can Efficient Coding Explain Perception?
Few studies to date have directly addressed whether efficient coding principles can account for auditory percepts. Among these, one series of studies [25, 26, 61] tested how human adults, infants, and newborns perceive water sounds generated by a mathematical model (Figure 4) that consisted of a population of randomly spaced gammatone chirps from a wide range of frequencies [25]. This model generated scale-invariant sounds when the temporal structure of the chirps scaled relative to their center
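The kind of generative model described here can be sketched in a few lines. The code below is a minimal sketch under stated assumptions, not the published model from [25]: it assumes that each chirp's envelope lasts a fixed number of cycles of its center frequency (so duration shrinks as frequency rises), that center frequencies are drawn log-uniformly, that onsets are uniformly random, and that the gammatone order is 4. The function names and all parameter values are hypothetical.

```python
import math
import random


def gammatone_chirp(freq_hz, cycles, fs, order=4):
    """One gammatone-shaped chirp whose envelope spans a fixed number of cycles."""
    tau = cycles / freq_hz          # assumed: envelope time constant scales as 1/f
    n = round(12 * tau * fs)        # long enough for the envelope to mostly decay
    out = []
    for i in range(n):
        t = i / fs
        env = (t / tau) ** (order - 1) * math.exp(-t / tau)
        out.append(env * math.cos(2 * math.pi * freq_hz * t))
    peak = max(abs(v) for v in out)
    return [v / peak for v in out]


def chirp_texture(duration_s, fs, n_chirps, f_lo, f_hi, cycles, seed=0):
    """Sum randomly timed chirps with log-uniform center frequencies."""
    rng = random.Random(seed)
    total = int(duration_s * fs)
    signal = [0.0] * total
    for _ in range(n_chirps):
        freq = f_lo * (f_hi / f_lo) ** rng.random()   # log-uniform in [f_lo, f_hi]
        onset = rng.randrange(total)                  # uniform random onset
        for k, v in enumerate(gammatone_chirp(freq, cycles, fs)):
            if onset + k >= total:
                break                                 # clip chirps at the end
            signal[onset + k] += v
    peak = max(abs(v) for v in signal)
    return [v / peak for v in signal] if peak > 0 else signal


# Hypothetical settings: 0.5 s at 16 kHz, 200 chirps between 200 Hz and 4 kHz.
sound = chirp_texture(0.5, 16000, 200, 200.0, 4000.0, cycles=2.0)
print(len(sound))
```

Because the envelope length is expressed in cycles rather than seconds, a chirp at 800 Hz is a time-compressed copy of a chirp at 400 Hz, which is the sense in which the synthetic sound looks similar across scales in this sketch.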
Concluding Remarks and Future Perspectives
The research findings discussed in this review suggest that auditory perception may obey the principles of efficient neural coding, relying on the information-theoretic notion of optimality. The existing studies demonstrate that the approaches for understanding the mathematical structure of sounds can yield predictions about neuronal encoding throughout the auditory pathway. The correspondence between neuronal responses and model predictions, conversely, is consistent with the notion that
Acknowledgements
This work was supported by a Human Frontier in Science Foundation Young Investigator Award to M.N.G. and J.G.; National Institutes of Health (Grant numbers NIH R01DC014700, NIH R01DC015527), and the Pennsylvania Lions Club Hearing Research Fellowship to M.N.G.; an ERC Consolidator Grant 773202 ERC-2017-COG ‘BabyRhythm’, the LABEX EFL (ANR-10-LABX-0083) and the ANR grant ANR-15-CE37-0009-01 awarded to J.G. M.N.G. is the recipient of the Burroughs Wellcome Award at the Scientific Interface. We
References (72)
- et al.
The faculty of language: what’s special about it?
Cognition
(2005)
- et al.
Natural pedagogy
Trends Cogn. Sci.
(2009)
- et al.
Adult-like processing of time-compressed speech by newborns: a NIRS study
Dev. Cogn. Neurosci.
(2017)
- et al.
Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis
Neuron
(2011)
Temporal modulations in speech and music
Neurosci. Biobehav. Rev.
(2017)
- et al.
Sparse coding of sensory inputs
Curr. Opin. Neurobiol.
(2004)
Tuning to natural stimulus dynamics in primary auditory cortex
Curr. Biol.
(2006)
Neuronal responses in cat primary auditory cortex to natural and altered species-specific calls
Hear. Res.
(2000)
Inhibitory plasticity in a lateral band improves cortical detection of natural vocalizations
Neuron
(2009)
The neural correlates of processing scale-invariant environmental sounds at birth
Neuroimage
(2016)
The analysis of speech in different temporal integration windows: cerebral lateralization as asymmetric sampling in time
Speech Commun.
Perception of the speech code
Psychol. Rev.
On finding that speech is special
Birdsong and speech: evidence for special processing
Facilitation of multisensory integration by the “unity effect” reveals that speech is special
J. Vis.
Tuned to the signal: the privileged status of speech for young infants
Dev. Sci.
Joint attention and early language
Child Dev.
Some informational aspects of visual perception
Psychol. Rev.
Possible principles underlying the transformation of sensory messages
A mathematical theory of communication
Bell Syst. Tech. J.
Natural image statistics and neural representation
Annu. Rev. Neurosci.
Efficient coding of natural sounds
Nat. Neurosci.
Efficient coding in human auditory perception
J. Acoust. Soc. Am.
Efficient auditory coding
Nature
Temporal low-order statistics of natural sounds
Adv. Neural Inf. Process. Syst.
Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory afferents
Proc. Biol. Sci.
Speech perception in simulated electric hearing exploits information-bearing acoustic change
J. Acoust. Soc. Am.
The efficient coding of speech: cross-linguistic differences
PLoS One
‘1/f noise’ in music and speech
Nature
Modulation spectra of natural sounds and ethological theories of auditory processing
J. Acoust. Soc. Am.
Multiresolution spectrotemporal analysis of complex sounds
J. Acoust. Soc. Am.
Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex
Nat. Neurosci.
Auditory perception of self-similarity in water sounds
Front. Integr. Neurosci.
Category-specific processing of scale-invariant sounds in infancy
PLoS One
The ear as a frequency analyzer
J. Acoust. Soc. Am.
A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria
J. Acoust. Soc. Am.