A linguistic representation in the visual system underlies successful lipreading

Aaron R Nidiffer; Cody Zhewei Cao; Aisling O’Sullivan; Edmund C Lalor

doi:10.1101/2021.02.09.430299

Abstract

There is considerable debate over how visual speech is processed in the absence of sound and whether neural activity supporting lipreading occurs in visual brain areas. Surprisingly, much of this ambiguity stems from a lack of behaviorally grounded neurophysiological findings. To address this, we conducted an experiment in which human observers rehearsed audiovisual speech for the purpose of lipreading silent versions during testing. Using a combination of computational modeling, electroencephalography, and simultaneously recorded behavior, we show that the visual system produces its own specialized representation of speech that is 1) well-described by categorical linguistic units (“visemes”) 2) dissociable from lip movements, and 3) predictive of lipreading ability. These findings contradict a long-held view that visual speech processing co-opts auditory cortex after early visual processing stages. Consistent with hierarchical accounts of visual and audiovisual speech perception, our findings show that visual cortex performs at least a basic level of linguistic processing.

Competing Interest Statement

The authors have declared no competing interest.

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.