Abstract
How and where in the brain audio-visual signals are bound to create multimodal objects remains unknown. One hypothesis is that temporal coherence between dynamic multisensory signals provides a mechanism for binding stimulus features across sensory modalities in early sensory cortex. Here we report that temporal coherence between auditory and visual streams enhances spiking representations in auditory cortex. We demonstrate that when a visual stimulus is temporally coherent with one sound in a mixture, the neural representation of that sound is enhanced. Supporting the hypothesis that these changes are a neural correlate of multisensory binding, the enhancement extends to stimulus features other than those that bind the auditory and visual streams. These data provide evidence that cross-sensory binding begins early in sensory cortex, offering a bottom-up mechanism for the formation of multisensory objects, and that one role of multisensory binding in auditory cortex is to support auditory scene analysis.