RT Journal Article
SR Electronic
T1 Faces and voices in the brain: a modality-general person-identity representation in superior temporal sulcus
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 338475
DO 10.1101/338475
A1 Maria Tsantani
A1 Nikolaus Kriegeskorte
A1 Carolyn McGettigan
A1 LĂșcia Garrido
YR 2018
UL http://biorxiv.org/content/early/2018/07/13/338475.abstract
AB Face-selective and voice-selective brain regions have been shown to represent face-identity and voice-identity, respectively. Here we investigated whether there are modality-general person-identity representations in the brain that can be driven by either a face or a voice, and that invariantly represent naturalistically varying face and voice tokens of the same identity. According to two distinct models, such representations could exist either in multimodal brain regions (Campanella and Belin, 2007) or in face-selective brain regions via direct coupling between face- and voice-selective regions (von Kriegstein et al., 2005). To test the predictions of these two models, we used fMRI to measure brain activity patterns elicited by the faces and voices of familiar people in multimodal, face-selective, and voice-selective brain regions. We used representational similarity analysis (RSA) to compare the representational geometries of face- and voice-elicited person-identities, and to investigate the degree to which pattern discriminants for pairs of identities generalise from one modality to the other. We found no matching geometries for faces and voices in any brain region. However, we showed crossmodal generalisation of the pattern discriminants in the multimodal right posterior superior temporal sulcus (rpSTS), suggesting a modality-general person-identity representation in this region. Importantly, the rpSTS showed invariant representations of face- and voice-identities, in that discriminants were trained and tested on independent face videos (different viewpoint, lighting, background) and voice recordings (different vocalizations). Our findings support the Multimodal Processing Model, which proposes that face and voice information is integrated in multimodal brain regions.

Significance statement: It is possible to identify a familiar person either by looking at their face or by listening to their voice. Using fMRI and representational similarity analysis (RSA), we show that the right posterior superior temporal sulcus (rpSTS), a multimodal brain region that responds to both faces and voices, contains representations that can distinguish between familiar people independently of whether we are looking at their face or listening to their voice. Crucially, these representations generalised across different face videos and voice recordings. Our findings suggest that identity information from visual and auditory processing systems is combined and integrated in the multimodal rpSTS.

This work was supported by a research grant from the Leverhulme Trust (RPG-2014-392). We thank Matthew Longo for comments on a previous version of the manuscript, and Tiana Rakotonombana, Roxanne Zamyadi, and Rasanat Nawaz for help with preparing and piloting the stimuli. The authors declare no competing financial interests.