Cortex

Volume 77, April 2016, Pages 1-12

Research report
Mental imagery of speech implicates two mechanisms of perceptual reactivation

https://doi.org/10.1016/j.cortex.2016.01.002

Abstract

Sensory cortices can be activated without any external stimuli. Yet it remains unclear how this perceptual reactivation occurs and which neural structures mediate the reconstruction process. In this study, we employed fMRI with mental imagery paradigms to investigate the neural networks involved in perceptual reactivation. Subjects performed two speech imagery tasks: articulation imagery (AI) and hearing imagery (HI). We found that AI induced greater activity in frontal-parietal sensorimotor systems, including sensorimotor cortex, subcentral gyrus (BA 43), middle frontal cortex (BA 46) and parietal operculum (PO), whereas HI showed stronger activation in regions that have been implicated in memory retrieval: middle frontal cortex (BA 8), inferior parietal cortex and intraparietal sulcus. Moreover, the posterior superior temporal sulcus (pSTS) and anterior superior temporal gyrus (aSTG) were more strongly activated in AI than in HI, suggesting that covert motor processes induced stronger perceptual reactivation in the auditory cortices. These results suggest that motor-to-perceptual transformation and memory retrieval act as two complementary mechanisms for internally reconstructing the corresponding perceptual outcomes. These two mechanisms can serve as a neurocomputational foundation for predicting perceptual changes, either via a previously learned relationship between actions and their perceptual consequences, or via stored perceptual experiences of the stimulus and episodic or contextual regularities.

Introduction

Sensory cortices can be activated without any external stimulation (e.g., Ji and Wilson, 2006, Wheeler et al., 2000). That is, perceptual neural representations can be reconstructed without perceptual processing (referred to as perceptual reactivation). Mental imagery, defined as an internally generated quasi-perceptual experience, is one such example (e.g., Kosslyn et al., 1999, Kraemer et al., 2005). The ability to form mental images has been hypothesized as a vehicle for generating and representing thoughts. This argument can be found as early as Plato's Theaetetus [427–347 BC] (1987) and Aristotle's De Anima [384–322 BC] (1986). In the age of enlightenment, mental imagery was considered analogous to perception by philosophers such as Descartes (1642/1984), Hobbes (1651/1968), Berkeley (1734/1965a, 1734/1965b) and Hume (1969). Early experimental psychologists such as Wundt (1913) and James (1890) proposed that ideas were represented as mental images in both visual and auditory domains. Modern research in mental imagery has yielded insight into how thought is represented in cognitive systems (Kosslyn, 1994, Kosslyn et al., 2001, Paivio, 1971, Paivio, 1986, Pylyshyn, 1981, Pylyshyn, 2003).

Recently, an additional computational role of mental imagery has been proposed: a mechanism to plan possible future contingencies. That is, mental imagery has been modeled as a process in which perceptual consequences can be predicted to gain advantages in various aspects of perception, memory, decision making and motor control (Albright, 2012, Moulton and Kosslyn, 2009, Schacter et al., 2012, Tian and Poeppel, 2012). The reactivation of perceptual neural representations without any external stimulation is the key mechanism mediating this predictive ability (Moulton & Kosslyn, 2009). Internally induced neural representations, which are highly similar to the ones established in corresponding perceptual processing, have been observed in modality-specific areas, such as in visual (e.g., Kosslyn et al., 1999, O'Craven and Kanwisher, 2000), auditory (e.g., Kraemer et al., 2005, Shergill et al., 2001, Zatorre et al., 1996), somatosensory (e.g., Yoo et al., 2003, Zhang et al., 2004) and olfactory (e.g., Bensafi et al., 2003, Djordjevic et al., 2005) domains.

It is not clear how these neural representations are reconstructed. Preliminary evidence from an MEG study (Tian & Poeppel, 2013) suggests that imagining speaking (articulation imagery, AI) and imagining hearing (hearing imagery, HI) differentially modulated neural responses to subsequent auditory stimuli. These distinct modulation effects by different types of imagery suggest that similar auditory neural representations may be internally formed via different neural pathways. A dual stream prediction model (DSPM, Fig. 1) was proposed in which two distinct processes in parallel neural pathways can internally induce the corresponding perceptual neural representation (Tian and Poeppel, 2012, Tian and Poeppel, 2013).

In the simulation-estimation prediction stream (Fig. 1), the perceptual consequences of actions are predicted by simulating the movement trajectory and then estimating the perceptual changes that would be associated with this movement. AI has been hypothesized to implement the motor-to-sensory transformation for the simulation-estimation mechanism (Tian & Poeppel, 2013). Specifically, during AI, a motor simulation process similar to speech motor preparation is carried out, but without execution and output (Palmer et al., 2001, Tian and Poeppel, 2010, Tian and Poeppel, 2012). Therefore, the neural networks that mediate motor simulation should be similar to those implicated in motor preparation, including the supplementary motor area (SMA), inferior frontal gyrus (IFG), premotor cortex and insula (Bohland and Guenther, 2006, Palmer et al., 2001, Shuster and Lemieux, 2005). After motor simulation, a copy of the planned motor commands – known as the efference copy (Von Holst & Mittelstaedt, 1950/1973; for a review see Wolpert & Ghahramani, 2000) – is sent to the somatosensory areas and is used in a forward model to estimate the somatosensory consequences (Blakemore & Decety, 2001). This somatosensory estimation is hypothesized to be governed by the networks underlying somatosensory perception (Blakemore et al., 1998, Tian and Poeppel, 2010, Tian and Poeppel, 2012), including primary and secondary somatosensory regions, the parietal operculum (PO) and the supramarginal gyrus (SMG). Moreover, in the context of speech, we hypothesize that auditory consequences are predicted on the basis of somatosensory estimation, and that this auditory estimation will recruit neural structures in temporal auditory cortices (Tian and Poeppel, 2010, Tian and Poeppel, 2012, Tian and Poeppel, 2013, Tian and Poeppel, 2015).

In the memory-retrieval prediction stream (Fig. 1), the internally induced neural representations are the result of memory retrieval processes – reconstructing stored perceptual information in modality-specific cortices (Kosslyn, 1994, Kosslyn, 2005, Wheeler et al., 2000). In particular, the retrieved object properties from long-term memory reactivate the sensory cortices that originally processed the object features (Kosslyn, 1994). In this experiment, we employed HI to probe this memory-retrieval stream. Auditory representations can be retrieved from various memory sources such as episodic memory, which presumably relies on hippocampal structures (Carr et al., 2011, Eichenbaum et al., 2012) with a possible buffer site in parietal cortex (Vilberg and Rugg, 2008, Wagner et al., 2005). Auditory representations can also be transformed from lexical and semantic information stored in semantic networks, including frontal (e.g., dorsomedial prefrontal cortex, IFG, ventromedial prefrontal cortex), parietal (e.g., posterior inferior parietal lobe) and temporal (e.g., middle temporal gyrus) regions (Binder et al., 2009, Lau et al., 2008, Price, 2012). Regardless of the divergent functional roles (episodic or semantic networks), frontal and parietal regions are reliably activated during memory retrieval processes. Therefore, neural activation in a frontal-parietal distributed network – the proposed memory-retrieval prediction stream – should be observed during HI.

This study uses fMRI to investigate three neuroanatomical/functional hypotheses generated from the DSPM. First, if pure simulation-estimation and memory-retrieval tasks could be carried out, the two distinct processing streams would be revealed separately. However, because speech imagery can involve both production and perception, we predict that both types of imagery will activate the simulation-estimation stream for simulating speech motor action (Tian & Poeppel, 2013). More importantly, we hypothesize that each type of imagery will recruit the two prediction streams to different extents. Specifically, we predict that AI will induce stronger activation in the simulation-estimation prediction stream, including SMA, IFG, premotor cortex and insula for motor simulation, as well as primary and/or secondary somatosensory areas PO and SMG for the subsequent estimation of somatosensory consequences. On the other hand, we predict that HI will induce more activation in the memory-retrieval prediction stream, comprising frontal, superior and inferior parietal cortices that are associated with memory retrieval (Binder et al., 2009, Lau et al., 2008, Price, 2012, Vilberg and Rugg, 2008, Wagner et al., 2005).

Second, we suggest that a more precise, detailed auditory prediction can be induced through the simulation-estimation mechanism, compared with that obtained via the memory-retrieval route (Hickok, 2012, Oppenheim and Dell, 2010, Tian and Poeppel, 2012, Tian and Poeppel, 2013). We propose that there is a one-to-one mapping between motor simulation and perceptual estimation via a bridge of somatosensory estimation in the simulation-estimation stream. Such a deterministic prediction mechanism, contrasted with the probabilistic mechanism of the memory-retrieval prediction stream (narrowing down the target features within distributions of stored memory), presumably suffers less interference and lateral inhibition from similar features, and thus yields a stronger and more robust representation (Tian and Poeppel, 2012, Tian and Poeppel, 2013). Based on this hypothesis of enriched auditory representations via simulation and estimation processes, we predict that auditory cortices will be more strongly activated in AI than in HI.

Finally, we hypothesize that the neural networks governing simulation within the simulation-estimation stream overlap with the cortical regions underlying motor preparation during speech production (Tian & Poeppel, 2012). That is, the initial motor processes are identical during articulation (A) and AI up to the point at which they diverge, namely when motor commands are withheld from execution during imagery. Therefore, we predict that enhanced activity in SMA, IFG, premotor areas and insula, which has been observed during preparation of overt speech production (Brendel et al., 2010, Riecker et al., 2005), will be observed in both AI and A. The observation of overlapping neural networks would provide evidence for shared neural mechanisms between overt and covert speech production, and would further suggest that mental imagery of speech is a valid paradigm for investigating these shared motor processes.

Section snippets

Participants

Eighteen volunteers gave informed consent and participated in the experiment (10 males, mean age 28.2 years, range 20–44 years). All participants were right-handed, with no history of neurological disorders. The experimental protocol was approved by the New York University Institutional Review Board (IRB).

Materials

Two 600-msec duration consonant-vowel syllables (/ba/,/ki/) were used as auditory stimuli (female voice; sampling rate of 48 kHz). All sounds were normalized to 70 dB SPL and delivered through

Main effects of tasks A, AI and HI

Speech production networks were observed during A, including bilateral anterior cingulate cortex (ACC), the pre-SMA/SMA complex, sensorimotor cortex, middle frontal cortex (BA 46) and right posterior cingulate cortex. The cerebellum and subcortical regions, including the thalamus and basal ganglia, were also activated (see Table S1 for a complete activation list). Significant activations, here and in all following analyses, surpassed a threshold of t > 3.65 (p < .001 uncorrected) with an extent of at

Discussion

We investigated the neural networks that mediate perceptual reactivation using fMRI with speech imagery paradigms. Whereas the neural networks that mediate AI and HI largely overlapped in frontal-parietal motor-sensory areas, different subsets of frontal and parietal regions were involved in each task. This differential involvement of neural networks suggests two possible mechanisms for reactivating perceptual neural representation.

Frontal-parietal neural networks were observed during both AI

Acknowledgments

We thank Keith Sanzenbach for his technical assistance with fMRI recording, Tobias Overath and Thomas Schofield for their discussion and guidance with fMRI analyses, and Jess Rowland for her comments on this manuscript. This study was supported by MURI ARO #54228-LS-MUR, NIH 2R01DC 05660, a grant from the GRAMMY Foundation®, Major Projects Program of the Shanghai Municipal Science and Technology Commission (STCSM) 15JC1400104 and National Natural Science Foundation of China 31500914.

References (78)

  • N. Picard et al.

    Imaging the premotor areas

    Current Opinion in Neurobiology

    (2001)
  • C.J. Price

    A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading

    NeuroImage

    (2012)
  • Z. Pylyshyn

    Return of the mental image: are there really pictures in the brain?

    Trends in Cognitive Sciences

    (2003)
  • D.L. Schacter et al.

    The future of memory: remembering, imagining, and the brain

    Neuron

    (2012)
  • J.D. Schmahmann et al.

    Three-dimensional MRI atlas of the human cerebellum in proportional stereotaxic space

    NeuroImage

    (1999)
  • L.I. Shuster et al.

    An fMRI investigation of covertly and overtly produced mono- and multisyllabic words

    Brain and Language

    (2005)
  • J.A. Tourville et al.

    Neural mechanisms underlying auditory feedback control of speech

    NeuroImage

    (2008)
  • N. Tzourio-Mazoyer et al.

    Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain

    NeuroImage

    (2002)
  • K.L. Vilberg et al.

    Memory retrieval and the parietal cortex: a review of evidence from a dual-process perspective

    Neuropsychologia

    (2008)
  • A.D. Wagner et al.

    Parietal lobe contributions to episodic memory retrieval

    Trends in Cognitive Sciences

    (2005)
  • J.M. Zarate et al.

    Neural networks involved in voluntary and involuntary vocal pitch regulation in experienced singers

    Neuropsychologia

    (2010)
  • J.M. Zarate et al.

    Experience-dependent neural substrates involved in vocal pitch regulation during singing

    NeuroImage

    (2008)
  • R.J. Zatorre et al.

    Mental concerts: musical imagery and auditory cortex

    Neuron

    (2005)
  • Aristotle

    De Anima (on the soul) (H. Lawson-Tancred, Trans.)

    (1986)
  • P. Belin et al.

    Lateralization of speech and auditory temporal processing

    Journal of Cognitive Neuroscience

    (1998)
  • M. Bensafi et al.

    Olfactomotor activity during imagery mimics that during perception

    Nature Neuroscience

    (2003)
  • G. Berkeley

    Three dialogues between Hylas and Philonous

  • G. Berkeley

    A treatise concerning the principles of human knowledge

  • J.R. Binder et al.

    Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies

    Cerebral Cortex

    (2009)
  • S.-J. Blakemore et al.

    From the perception of action to the understanding of intention

    Nature Reviews Neuroscience

    (2001)
  • S.-J. Blakemore et al.

    Central cancellation of self-produced tickle sensation

    Nature Neuroscience

    (1998)
  • M.F. Carr et al.

    Hippocampal replay in the awake state: a potential substrate for memory consolidation and retrieval

    Nature Neuroscience

    (2011)
  • J.L. Chen et al.

    Listening to musical rhythms recruits motor regions of the brain

    Cerebral Cortex

    (2008)
  • R. Descartes

    Meditations on first philosophy (J. Cottingham, R. Stoothoff & D. Murdoch, Trans.)

    (1642/1984)
  • H. Duvernoy

    The human brain: Structure, three-dimensional sectional anatomy and MRI

    (1991)
  • A.R. Halpern et al.

    When that tune runs through your head: a PET investigation of auditory imagery for familiar melodies

    Cerebral Cortex

    (1999)
  • S.C. Herholz et al.

    Neuronal correlates of perception, imagery, and memory for familiar tunes

    Journal of Cognitive Neuroscience

    (2012)
  • G. Hickok

    Computational neuroanatomy of speech production

    Nature Reviews Neuroscience

    (2012)
  • T. Hobbes