Abstract
A fundamental challenge for audition is parsing the voice of a single speaker amid a cacophony of other voices, known as the Cocktail Party Problem (CPP). Despite its prevalence, relatively little is known about how our simian cousins solve the CPP during active, natural communication. Here we employed an innovative multi-speaker paradigm comprising five computer-generated Virtual Monkeys (VMs), whose respective vocal behavior could be systematically varied to construct marmoset cocktail parties, and tested the impact of specific acoustic-scene manipulations on vocal behavior. Results indicate that marmosets not only employ auditory mechanisms – including attention – for speaker stream segregation, but also selectively change their own vocal behavior in response to the dynamics of the acoustic scene to overcome the challenges of the CPP. These findings suggest notable parallels between human and nonhuman primate audition and highlight the active role that speakers play in optimizing communicative efficacy in complex, real-world acoustic scenes.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
The Results section has been updated with a new experiment on baseline behaviors, new analyses supporting the Communication Index (previously the Conversation Index), and revised hypothesis tests reporting additional information such as F-statistics. New sections have been added to the Methods explaining how the statistical tests were performed, including model assumptions and diagnostics. Figure 2 has been split into Figures 2 and 3: the former now examines baseline versus fixed behavior, and Figure 3 examines the Communication Index. Data have been updated in the last three figures. Finally, the data deposited in Dryad have been updated for ease of use and verification of our analyses.