Elsevier

Brain and Language

Volume 147, August 2015, Pages 66-75

The influence of lexical statistics on temporal lobe cortical dynamics during spoken word listening

https://doi.org/10.1016/j.bandl.2015.05.005

Highlights

  • We studied lexical influences on neural representations of spoken word recognition.

  • Direct cortical electrode arrays recorded responses to auditory speech stimuli.

  • Neural responses in the temporal lobe varied by lexical status (word/pseudoword).

  • Both word and pseudoword cortical responses were modulated by lexical statistics.

  • Multiple lexical statistics affect spoken word processing from the earliest stages.

Abstract

Neural representations of words are thought to have a complex spatio-temporal cortical basis. It has been suggested that spoken word recognition is not a process of feed-forward computations from phonetic to lexical forms, but rather involves the online integration of bottom-up input with stored lexical knowledge. Using direct neural recordings from the temporal lobe, we examined cortical responses to words and pseudowords. We found that neural populations were sensitive not only to lexical status (real word vs. pseudoword), but also to cohort size (number of words matching the phonetic input at each time point) and cohort frequency (lexical frequency of those words). These lexical variables modulated neural activity both spatially, from the posterior to the anterior temporal lobe, and dynamically, as the stimuli unfolded on a millisecond time scale. Our findings indicate that word recognition is not purely modular, but relies on rapid and online integration of multiple sources of lexical knowledge.

Introduction

Many theoretical accounts of spoken word recognition posit a process of matching acoustic input with stored representations that have rich lexical and semantic structure. These models assume the existence of acoustic–phonetic, phonemic, and lexical targets in the brain that are activated when specific input is received (Marslen-Wilson, 1987, Marslen-Wilson, 1989, McClelland and Elman, 1986, Norris, 1994). For example, hearing the word “cat” evokes activity in neural populations that are selective to phonetic features like plosives and low front vowels, which in turn activate stored representations of the phonemes /k/, /æ/, and /t/. The activations of these phonemic representations are integrated over time, and serve as inputs to neurons at the lexical level that represent the word “cat”. The dynamic nature of this process has led many researchers to suggest that the representations in this hierarchy interact with and influence each other, meaning that word recognition is an iterative process where multiple targets at each level are active until the input is no longer consistent with those targets (e.g., /k-æ-p/) (Heald and Nusbaum, 2014, Marslen-Wilson and Welsh, 1978).

Several influential models of the neural basis of speech comprehension (Hickok and Poeppel, 2007, Scott and Wise, 2004) propose a set of cortical regions that perform the transformation from spectrotemporal representations of speech signals to abstracted lexical representations of words. These proposals are based on data from patients with lesions to various cortical areas (Dronkers & Wilkins, 2004), and on recent neuroimaging studies that support a distributed and interconnected network of cortical regions thought to be responsible for the representation of words and language (see, e.g., Davis and Gaskell, 2009, Turken and Dronkers, 2011). Many of these studies observe functional specialization of different regions in the temporal lobe, with acoustic–phonetic and phonemic representations in the posterior superior temporal cortex, and higher-order lexical representations in the middle, anterior, and ventral temporal cortex. This pathway is often referred to as the auditory ventral stream and is argued to link acoustic, phonemic, and lexical processing (see also DeWitt and Rauschecker, 2012, Okada et al., 2010, Lerner et al., 2011).

However, it remains unclear how this transformation occurs, and specifically how the ventral stream integrates high-level knowledge about the language with bottom-up acoustic input. In particular, the mental lexicon can be characterized by a number of features and statistics that relate the stored representations of individual words with one another, and also with lower-level features like phonemes and phonetic features. As a speech token unfolds, a cohort of forms stored in the lexicon that match the acoustic input is activated (Marslen-Wilson, 1987, Marslen-Wilson, 1989). This matching set of lexical forms (the cohort) will change over time as more of the target word is heard, thereby changing the lexical competition space on a moment-by-moment basis. It is therefore necessary to capture the temporal dynamics of these changing lexical statistics when describing the processes involved in word comprehension. A primary goal of the present study is to describe the spatiotemporal dynamics of spoken word recognition across the duration of a word, and across the auditory ventral stream.

In the present study, we compare neural responses to real words (e.g. ceremony, repetition) and novel forms, or pseudowords (e.g. moanaserry, piteretion), to examine how this lexical structure is encoded in the brain. Several studies have found differences in the hemodynamic response to real words and pseudowords (Davis and Gaskell, 2009, Mainy et al., 2008, Mechelli et al., 2003, Raettig and Kotz, 2008, Tanji et al., 2005). However, it is likely that the word/pseudoword difference is not purely binary; in particular, there is behavioral evidence that word-like forms may be processed as potential real words (De Vaan et al., 2007, Lindsay et al., 2012, Meunier and Longtin, 2007). Taken together, these findings suggest that while neural responses to pseudowords can be reliably distinguished from familiar word forms, the processing of novel forms may also rely on information stored in the lexicon, including high-level features like cohort statistics. Therefore, a second goal of this study is to explore how this type of stored lexical information can affect the processing of both pseudowords and real words.

Lexical statistics that capture aspects of lexical competition may be of particular importance for both real word and pseudoword processing. Cohort size (Magnuson et al., 2007, Marslen-Wilson, 1987, Marslen-Wilson, 1989) is defined as the number of words in the lexicon that match the phonemes the listener has heard up to any given point in a word. This provides an incremental metric of potential lexical forms that changes as acoustic input is received. Average cohort frequency, by contrast, is defined as the average lexical frequency of the words in a cohort. Finally, summed cohort frequency is the sum of the lexical frequencies of all words in a cohort, thus quantifying the number of words and their relative usage statistics in a single metric. The extent to which neural activity evoked by real words and pseudowords is modulated by these features allows us to explore the specific linguistic processes involved in the acoustic-to-lexical transformation.
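The three cohort statistics defined above can be illustrated with a minimal sketch. The toy lexicon, its phoneme symbols, and its frequency values are invented for illustration only and are not the lexical database or frequency norms used in the study; the function name `cohort_statistics` is likewise hypothetical.

```python
# Hypothetical toy lexicon: phoneme sequence -> lexical frequency.
# All entries and counts are made up for illustration.
TOY_LEXICON = {
    ("k", "ae", "t"): 60.0,  # "cat"
    ("k", "ae", "p"): 20.0,  # "cap"
    ("k", "ae", "b"): 10.0,  # "cab"
    ("k", "ih", "t"): 5.0,   # "kit"
    ("d", "ao", "g"): 80.0,  # "dog"
}


def cohort_statistics(phonemes, lexicon):
    """Return cohort size, average and summed cohort frequency
    after each successive phoneme of the input."""
    stats = []
    for n in range(1, len(phonemes) + 1):
        prefix = tuple(phonemes[:n])
        # The cohort is every word whose first n phonemes match the input so far.
        freqs = [freq for word, freq in lexicon.items() if word[:n] == prefix]
        size = len(freqs)
        summed = sum(freqs)
        # An empty cohort (possible for pseudowords) gets an average of 0.0 here;
        # this is a convention of the sketch, not of the study.
        average = summed / size if size else 0.0
        stats.append({
            "position": n,
            "cohort_size": size,
            "average_cohort_frequency": average,
            "summed_cohort_frequency": summed,
        })
    return stats


if __name__ == "__main__":
    for row in cohort_statistics(["k", "ae", "t"], TOY_LEXICON):
        print(row)
```

In this toy example the cohort shrinks as each phoneme arrives, and a pseudoword such as /k-æ-g/ shares the cohort of "cat" until its final phoneme, at which point the cohort becomes empty. This is one way to see why cohort statistics remain well defined for pseudowords over much of their duration.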

To study the online processing of lexical forms and cohort statistics, we examined cortical responses to real words and pseudowords using data recorded from high-density electrocorticographic (ECoG) electrodes placed directly on the cortical surface. ECoG provides high spatial and temporal resolution with a relatively high signal-to-noise ratio at the individual electrode level. These properties are critical to our study goals of examining how the lexical status and cohort statistics of specific speech tokens affect neural activity as the input is being processed in real time. Because the neural representations of words are complex, distributed, and likely represented in a high-dimensional space, these methodological advantages may be necessary to uncover the nature of lexical processing.

Subjects

Four human subjects underwent surgical placement of a 256-channel subdural electrocorticography (ECoG) array as part of clinical treatment for epilepsy. All electrode arrays were placed over the perisylvian region of the language-dominant hemisphere (left hemisphere for all but subject 3; no observable patterns differentiated the right hemisphere data from the left hemisphere data). All subjects gave informed written consent prior to surgery and experimental testing.

Stimulus design

The stimulus set consisted

Timing and location of responses to real words and pseudowords

Real words and pseudowords evoked different high-gamma neural responses across many temporal lobe electrodes (see Fig. 2a for electrode placement in one subject) in all participants, with typically stronger activity for pseudowords (Fig. 2b). This lexicality effect was significant between 320 and 1500 ms (bootstrap p < 0.05). In addition to these magnitude differences, there was a clear progression in the timing of the peak of the neural response from posterior to anterior temporal lobe electrodes
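The excerpt reports a bootstrap test of the word/pseudoword difference but does not describe the procedure. As a rough illustration only, the following sketch shows one common variant, a pooled-resampling bootstrap of the difference in mean response at a single time point. The function name, the resampling scheme, and the example amplitudes are assumptions, not the authors' analysis.

```python
import random
from statistics import mean


def bootstrap_p_value(word_resp, pseudo_resp, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for a difference in mean response.

    Trials from both conditions are pooled to approximate the null
    hypothesis of no lexicality effect, then resampled with replacement.
    """
    rng = random.Random(seed)
    observed = abs(mean(pseudo_resp) - mean(word_resp))
    pooled = list(word_resp) + list(pseudo_resp)
    extreme = 0
    for _ in range(n_boot):
        # Draw surrogate groups of the original sizes from the pooled trials.
        w = [rng.choice(pooled) for _ in word_resp]
        p = [rng.choice(pooled) for _ in pseudo_resp]
        if abs(mean(p) - mean(w)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_boot + 1)


if __name__ == "__main__":
    # Hypothetical high-gamma amplitudes at one time point (arbitrary units).
    words = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9]
    pseudowords = [2.0, 2.1, 1.9, 2.05, 1.95, 2.0, 2.1, 1.9]
    print(bootstrap_p_value(words, pseudowords))
```

Applying such a test independently at each time point would in practice also call for a correction for multiple comparisons, which this sketch omits.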

Discussion

We used high-resolution direct intracranial recordings to examine how the ventral stream for speech processes lexical information in both familiar and novel word forms. We found that neural activity recorded from the temporal lobe while participants listened to words and pseudowords reflects differences between these broad categories with relatively complex temporal dynamics. Further, responses to spoken stimuli were modulated by language-level features – cohort size, average cohort frequency,

Acknowledgments

The authors would like to thank Connie Cheung, Angela Ren, and Susanne Gahl for technical assistance and valuable comments on drafts of this work, and Stephen Wilson for stimulus design. E.S.C. was funded by a National Science Foundation Research Fellowship. M.K.L. was funded by a National Institutes of Health National Research Service Award F32-DC013486 and by an Innovative Research Grant from the Kavli Institute for Brain and Mind. E.F.C. was funded by National Institutes of Health Grants

References (46)

  • W. Marslen-Wilson et al. Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology (1978)

  • J.L. McClelland et al. The TRACE model of speech perception. Cognitive Psychology (1986)

  • C.R. McDonald et al. Multimodal imaging of repetition priming: Using fMRI, MEG, and intracranial EEG to reveal spatiotemporal profiles of word processing. Neuroimage (2010)

  • F. Meunier et al. Morphological decomposition and semantic integration in word processing. Journal of Memory and Language (2007)

  • D. Mirman et al. Statistical and computational models of the visual world paradigm: Growth curves and individual differences. Journal of Memory and Language (2008)

  • D. Norris. Shortlist: A connectionist model of continuous speech recognition. Cognition (1994)

  • R. Prabhakaran et al. An event-related fMRI investigation of phonological–lexical competition. Neuropsychologia (2006)

  • L. Pylkkänen et al. Neuromagnetic evidence for the timing of lexical activation: An MEG component sensitive to phonotactic probability but not to neighborhood density. Brain and Language (2002)

  • T. Raettig et al. Auditory processing of different types of pseudo-words: An event-related fMRI study. Neuroimage (2008)

  • S.K. Scott et al. The functional neuroanatomy of prelexical processing in speech perception. Cognition (2004)

  • M.S. Vitevitch et al. Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language (1999)

  • K.E. Bouchard et al. Functional organization of human sensorimotor cortex for speech articulation. Nature (2013)

  • M. Brysbaert et al. Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods (2009)