Opinion
Temporal coherence and attention in auditory scene analysis

https://doi.org/10.1016/j.tins.2010.11.002

Humans and other animals can attend to one of multiple sounds and follow it selectively over time. The neural underpinnings of this perceptual feat remain mysterious. Some studies have concluded that sounds are heard as separate streams when they activate well-separated populations of central auditory neurons, and that this process is largely pre-attentive. Here, we argue instead that stream formation depends primarily on temporal coherence between responses that encode various features of a sound source. Furthermore, we postulate that only when attention is directed towards a particular feature (e.g. pitch) do all other temporally coherent features of that source (e.g. timbre and location) become bound together as a stream that is segregated from the incoherent features of other sources.
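As a toy illustration of the temporal-coherence idea (a sketch under stated assumptions, not the authors' model), the following Python snippet compares the zero-lag correlation between two hypothetical frequency channels when their tone sequences are synchronous versus alternating. The binary on/off envelopes, the function names `tone_envelope` and `pearson`, and all parameter values are illustrative assumptions; real cortical responses are far richer than these idealized envelopes.

```python
import math


def tone_envelope(n_steps, period, onset, duration):
    """Binary on/off envelope of a repeating tone burst (1.0 = sounding)."""
    return [1.0 if (t - onset) % period < duration else 0.0
            for t in range(n_steps)]


def pearson(x, y):
    """Zero-lag Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical stimulus: tone sequences A and B drive two separate channels.
n_steps, period, duration = 400, 20, 5
chan_a = tone_envelope(n_steps, period, onset=0, duration=duration)
chan_b_sync = tone_envelope(n_steps, period, onset=0, duration=duration)  # B sounds with A
chan_b_alt = tone_envelope(n_steps, period, onset=10, duration=duration)  # B sounds between A bursts

coherence_sync = pearson(chan_a, chan_b_sync)
coherence_alt = pearson(chan_a, chan_b_alt)

print(round(coherence_sync, 3), round(coherence_alt, 3))  # 1.0 -0.333
```

In this sketch the two channels are equally well separated in both conditions; only their relative timing differs. On the account argued here, the coherent (synchronous) pair would be a candidate for binding into one stream, whereas the anti-correlated (alternating) pair would not.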

Section snippets

The auditory scene analysis problem

Humans and other animals routinely detect, identify and track sounds coming from a particular source (e.g. someone's voice, a conspecific call) among sounds emanating from other sources (e.g. other voices, heterospecific calls, ambient music and street traffic) (Figure 1). The apparent ease with which they determine which components and attributes in a sound mixture arise from the same source belies the complexity of the underlying biological processes. By analogy with the scene segmentation

Temporal coherence in auditory scene analysis

Problems inherent to auditory scene analysis are similar to those found in visual scene analysis. However, there are a few notable unique aspects. In particular, whereas natural and artificial visual scenes often contain a large proportion of static or slow-moving elements, auditory scenes are essentially dynamic, containing many fast-changing, relatively brief acoustic events (referred to as tokens in Box 1) [30, 31]. Therefore, an essential aspect of auditory scene analysis is the linking over

Is streaming a pre-attentive process?

A widely held view, which has emerged from electrophysiological studies in humans [28, 29, 67–70], is that auditory streams are formed pre-attentively in the auditory system, much like the extraction of low-level features in early pre-cortical stages. Depending on the listener's intentions and guided by representations of previously encountered auditory objects (or streams) that are now stored in memory, attention would simply serve to enhance the perception of a particular stream in

Summary

Here, we proposed two ideas within an overall framework to explain the perception of auditory scenes. The first is that auditory stream formation is critically dependent on the temporal coherence between neural responses to sounds in the auditory cortex. Specifically, when stimulus-induced cortical responses are temporally coherent, the features they represent can potentially become perceptually unified (or bound) as one stream, distinct from other temporally incoherent responses. This

Acknowledgments

This work was supported by the following grants to the authors: NIH R0107657, MURI N000141010278, AFOSR FA9550-09-1-0234 and NSF CAREER award IIS-0846112.

Glossary

Auditory scene analysis
processes by which sequential and concurrent acoustic events are analyzed and organized into auditory streams.
Auditory stream
series of sounds that is perceived by the listener as a coherent entity and that, as such, can be selectively attended to among other sounds. The word ‘stream’ emphasizes the fact that sounds usually unfold over time. Although sounds coming from different physical sound sources typically form separate streams, this is not always the case. For example, a choir

References (104)

  • D. Pressnitzer. Perceptual organization of sound begins in the auditory periphery. Curr. Biol. (2008)
  • M.L. Sutter. Spectral processing in the auditory cortex. Int. Rev. Neurobiol. (2005)
  • J.C. Middlebrooks. Binaural response-specific bands in primary auditory cortex (AI) of the cat: topographical organization orthogonal to isofrequency contours. Brain Res. (1980)
  • M. Elhilali. Temporal coherence in the perceptual organization and cortical representation of auditory scenes. Neuron (2009)
  • A. Luczak. Multivariate receptive field mapping in marmoset auditory cortex. J. Neurosci. Methods (2004)
  • T. Rahne. A multilevel and cross-modal approach towards neuronal mechanisms of auditory streaming. Brain Res. (2008)
  • B.G. Shinn-Cunningham. Object-based auditory and visual attention. Trends Cogn. Sci. (2008)
  • J.B. Fritz. Auditory attention – focusing the searchlight on sound. Curr. Opin. Neurobiol. (2007)
  • E. Niebur. Synchrony: a neuronal mechanism for attentional selection? Curr. Opin. Neurobiol. (2002)
  • R. VanRullen. Spike times make sense. Trends Neurosci. (2005)
  • J.J. DiCarlo et al. Untangling invariant object recognition. Trends Cogn. Sci. (2007)
  • A.S. Bregman. Auditory Scene Analysis: The Perceptual Organization of Sound (1990)
  • E.C. Cherry. Some experiments on the recognition of speech, with one and two ears. J. Acoust. Soc. Am. (1953)
  • M.A. Bee et al. The cocktail party problem: what is it? How can it be solved? And why should animal behaviorists study it? J. Comp. Psychol. (2008)
  • J.H. McDermott. The cocktail party problem. Curr. Biol. (2009)
  • N. Marrone. Evaluating the benefit of hearing aids in solving the cocktail party problem. Trends Amplif. (2008)
  • D. Wang et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications (2006)
  • R.R. Fay. Sound source perception and stream segregation in nonhuman vertebrate animals
  • A. Bidet-Caulet et al. Neurophysiological mechanisms involved in auditory perceptual organization. Front Neurosci. (2009)
  • J.S. Snyder et al. Toward a neurophysiological theory of auditory stream segregation. Psychol. Bull. (2009)
  • W. Hartmann et al. Stream segregation and peripheral channeling. Mus. Percep. (1991)
  • M.W. Beauvois et al. Computer simulation of auditory stream segregation in alternating-tone sequences. J. Acoust. Soc. Am. (1996)
  • S. McCabe et al. A model of auditory streaming. J. Acoust. Soc. Am. (1997)
  • J.S. Kanwal. Neurodynamics for auditory stream segregation: tracking sounds in the mustached bat's natural environment. Network (2003)
  • M.A. Bee et al. Primitive auditory stream segregation: a neurophysiological study in the songbird forebrain. J. Neurophysiol. (2004)
  • Y.I. Fishman. Auditory stream segregation in monkey auditory cortex: effects of frequency separation, presentation rate, and tone duration. J. Acoust. Soc. Am. (2004)
  • M.A. Bee et al. Auditory stream segregation in the songbird forebrain: effects of time intervals on responses to interleaved tone sequences. Brain Behav. Evol. (2005)
  • E. Sussman. An investigation of the auditory streaming effect using event-related brain potentials. Psychophysiol. (1999)
  • E.S. Sussman. The role of attention in the formation of auditory streams. Percept. Psychophys. (2007)
  • H. Attias et al. Temporal low-order statistics of natural sounds. Adv. Neural Inf. Process. Syst. (1997)
  • N.C. Singh et al. Modulation spectra of natural sounds and ethological theories of auditory processing. J. Acoust. Soc. Am. (2003)
  • J. Vliegen. The role of spectral and periodicity cues in auditory stream segregation, measured using a temporal discrimination task. J. Acoust. Soc. Am. (1999)
  • N. Grimault. Auditory stream segregation on the basis of amplitude-modulation rate. J. Acoust. Soc. Am. (2002)
  • B.C.J. Moore et al. Factors influencing sequential stream segregation. Acta Acustica (2002)
  • S. Boehnke et al. The relation between auditory temporal interval processing and sequential stream segregation examined with stimulus laterality differences. J. Percept. Psychophys. (2005)
  • C.E. Schreiner et al. Topography of excitatory bandwidth in cat primary auditory cortex: single-neuron versus multiple-neuron recordings. J. Neurophysiol. (1992)
  • C.E. Schreiner. Order and disorder in auditory cortical maps. Curr. Opin. Neurobiol. (1994)
  • H. Versnel. Ripple analysis in the ferret primary auditory cortex. III. Topographic and columnar distribution of ripple response parameters. Aud. Neurosci. (1995)
  • N. Kowalski. Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. J. Neurophysiol. (1996)
  • N. Kowalski. Analysis of dynamic spectra in ferret primary auditory cortex. II. Prediction of unit responses to arbitrary dynamic spectra. J. Neurophysiol. (1996)