Abstract
The brain combines sounds from the two ears, but what is the algorithm used to achieve this fusion of signals? Here we take a model-driven approach to interpret both psychophysical increment detection thresholds and steady-state electrophysiology (EEG) data to reveal the architecture of binaural combination for amplitude-modulated tones. Increment thresholds followed a ‘dipper’-shaped function of pedestal modulation depth, and were consistently lower for binaural than for monaural presentation. The EEG responses were greater for binaural than for monaural presentation, and when a modulated masker was presented to one ear, it produced only weak suppression of the signal presented to the other ear. Both data sets were well fit by a computational model originally derived for visual signal combination, but with suppression between the two channels (ears) being much weaker than in binocular vision. We suggest that the distinct ecological constraints on vision and hearing can explain this difference, if it is assumed that the brain avoids over-representing sensory signals originating from a single object. These findings place our understanding of binaural summation within the broader context of work on sensory signal combination in the brain, and delineate the similarities and differences between vision and hearing.
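As a purely illustrative sketch of the general form of such two-channel gain-control combination models (the symbols, exponents, and constants below are assumptions chosen for exposition, not the fitted parameters of this study), each channel's modulation signal can be divisively normalised by a pool that includes a weighted copy of the other channel's signal, with the two outputs then summed:

\[
R_{\mathrm{bin}} = \frac{C_{L}^{\,p}}{Z + C_{L}^{\,q} + \omega\, C_{R}^{\,q}} + \frac{C_{R}^{\,p}}{Z + C_{R}^{\,q} + \omega\, C_{L}^{\,q}},
\]

where \(C_{L}\) and \(C_{R}\) denote the modulation depths at the left and right ears, \(p\), \(q\), and \(Z\) are transducer exponents and a saturation constant, and \(\omega\) weights the cross-channel suppression. In this framing, the result summarised above corresponds to \(\omega\) being much smaller for the two ears than for the two eyes.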