Abstract
The striking homogeneity of cerebellar microanatomy is strongly suggestive of a corresponding uniformity of function. Consequently, theoretical models of the cerebellum's role in motor control should offer important clues regarding cerebellar contributions to cognition. One such influential theory holds that the cerebellum encodes internal models, neural representations of the context-specific dynamic properties of an object, to facilitate predictive control when manipulating the object. The present study examined whether this theoretical construct can shed light on the contribution of the cerebellum to language processing. We reasoned that the cerebellum might perform a similar coordinative function when the context provided by the initial part of a sentence can be highly predictive of the end of the sentence. Using functional MRI in humans we tested two predictions derived from this hypothesis, building on previous neuroimaging studies of internal models in motor control. First, focal cerebellar activation–reflecting the operation of acquired internal models–should be enhanced when the linguistic context leads terminal words to be predictable. Second, more widespread activation should be observed when such predictions are violated, reflecting the processing of error signals that can be used to update internal models. Both predictions were confirmed, with predictability and prediction violations associated with increased blood oxygenation level-dependent signal in the posterior cerebellum (Crus I/II). Our results provide further evidence for cerebellar involvement in predictive language processing and suggest that the notion of cerebellar internal models may be extended to the language domain.
Introduction
While there is an emerging consensus on cerebellar involvement in a wide range of cognitive tasks, including language processing (Strick et al., 2009, but see Glickstein et al., 2011), a computational account of how this subcortical structure contributes to cognition remains elusive. Importantly, the striking microanatomical homogeneity of the cerebellum suggests a corresponding unity of function across motor and non-motor domains (Ramnani, 2006), and has inspired the idea that theories developed for motor control should be informative for understanding cerebellar contributions to cognition. One influential theory posits that the cerebellum encodes internal models, neural representations of the essential dynamic properties of an object (e.g., body part or tool) that can be used to predict and control actions involving that object within a particular context (Wolpert et al., 1995; Ito, 2006), or contribute to cognition by similarly encoding the context-specific dynamics of more abstract representations (Ramnani, 2006; Ito, 2008).
Internal models are acquired and formed by supervised, or error-based learning (Doya, 1999): the model is continuously modified if its output, the predicted state of the system, does not match the actual or observed state. Functional MRI (fMRI) studies of motor control have provided evidence of prediction error signals in the human cerebellum (Imamizu et al., 2000; but see Diedrichsen et al., 2005; Schlerf et al., 2012). Moreover, when controlling for error magnitude, changes in more focal cerebellar activation were consistent with the development of an acquired internal model (Imamizu et al., 2000). These findings suggest that activation patterns within the cerebellum reflect the generation of predictions and processing of prediction error signals, two key characteristics of internal models.
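The error-based learning principle described above can be illustrated with a generic delta-rule update. This is a minimal sketch under stated assumptions, not a model of cerebellar circuitry: the scalar forward model, the learning rate, and the training samples are all hypothetical and chosen only to show how a mismatch between predicted and observed state drives the update.

```python
def train_forward_model(samples, learning_rate=0.1, weight=0.0):
    """Fit a scalar forward model (prediction = weight * x) by error-based learning.

    samples: iterable of (input, observed_state) pairs (hypothetical data).
    Returns the final weight and the absolute prediction error on each trial.
    """
    errors = []
    for x, observed in samples:
        predicted = weight * x                  # output of the internal model
        error = observed - predicted            # prediction error signal
        weight += learning_rate * error * x     # modify the model only in proportion to the error
        errors.append(abs(error))
    return weight, errors


# Hypothetical context in which the true mapping is observed = 2 * x
samples = [(1.0, 2.0)] * 50
weight, errors = train_forward_model(samples)
```

In this toy setting the prediction error shrinks across trials as the weight approaches the true mapping, which mirrors the expectation that error-related signals decline as an internal model is acquired.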
Neuroimaging studies of linguistic processing have consistently reported activation within the cerebellum (for review, see Murdoch, 2010). The functional correlates of these activation patterns, however, remain unclear, especially as motor-based accounts such as covert rehearsal have failed to hold up to experimental investigation (Chein and Fiez, 2001; Ravizza et al., 2006). Interestingly, the neuroscience of language has in recent years increasingly focused on predictive mechanisms (Poeppel et al., 2012), either treating language within the broader theoretical context of predictive coding (Gagnepain et al., 2012) or explicitly using the concept of internal models (Pickering and Garrod, 2013).
Thus, our aim in the current study was to test whether the notion of internal models can help shed light on the cerebellar role in language processing. Specifically, we reasoned that cerebellar internal models might aid sentence comprehension, by using the context of a partially presented sentence to predict the next word. A recent transcranial magnetic stimulation (TMS) study has provided support for this generalization of the internal model hypothesis to language (Lesage et al., 2012). Here, we use fMRI to obtain converging evidence and test additional predictions of this hypothesis.
Building on neuroimaging studies examining the role of the cerebellum in internal models for motor control, we tested two key predictions: (1) focal cerebellar activation, reflecting the engagement of a learned internal model, should be observed when the sentence context makes the final word predictable and (2) more widespread activation should be observed when this prediction is violated, reflecting the processing of error signals that can be used to update the internal model or create a novel internal model.
Based on recent meta-analyses, we expected these activations in the posterior cerebellum, predominantly in the right cerebellar hemisphere (Stoodley and Schmahmann, 2009; E et al., 2014).
Materials and Methods
Participants.
All participants reported being right-handed, had normal or corrected-to-normal vision, had no known neurological deficits, and were fluent in Norwegian. Of the 39 participants recruited for the study, two were excluded due to abnormalities discovered on MRI, and five others due to excessive head movement in the scanner, leaving a final sample of 32 (21 female; mean age 26.2 years, SD = 9.08). All participants provided oral and written informed consent. The study was approved by the Regional Ethics Committee and was conducted in accordance with ethical standards specified in the 1964 Declaration of Helsinki.
Experimental procedure.
An illustration of the task structure is given in Figure 1. Briefly, on each trial, the participant viewed a fixation cross, followed by a visual prompt (asterisk) and a sequence of five centrally presented words (in lower case). Each of these stimuli was presented for 750 ms, and there was no pause between successive stimuli (0 ms interstimulus interval). We used a fixed rate of stimulus presentation to minimize the disruptive effects of serial reading, while placing minimal demands on working memory.
The crucial experimental variable, the predictability of the terminal, target word, was manipulated by varying the context established by the initial four words. In the Congruent condition, sentences were constructed so that the target word was highly predictable (e.g., “two plus two is four.”). In the Incongruent condition, the sentences were also designed such that the target word was highly predictable, but the prediction was violated by presenting a terminal word that was inappropriate given the context (e.g., “[the water] had frozen to cars”). In the Scrambled condition, the initial four words did not establish a context for a grammatical sentence (e.g., “fast in clock plane”), and thus the target word was not predictable (e.g., “through”). We also included a Letter String condition to control for the visual and motor aspects of the task, replacing the words with meaningless letter strings of identical consonants (e.g., “rrr gggg nnnn pp kkkk”).
Immediately after the presentation of the target word (or consonant string), the question, “Was the sentence meaningful?” was presented on the screen, indicating that the participant should judge whether or not the sequence constituted a meaningful sentence (Congruent condition vs Incongruent, Scrambled, and Letter String conditions). This question was displayed for 3000 ms and the participant was required to respond within this time window by pressing one of two buttons with his/her right hand, using the index finger (“yes”) or thumb (“no”). Participants were instructed to wait for the question before answering, and were told that there was no need to respond quickly. The onset of the next trial followed directly after the offset of the question.
The entire experiment consisted of 30 trials per condition, plus 15 null trials in which an asterisk replaced the words/letters for the full trial duration. The order of the 135 trials was randomized. Stimuli were presented using E-Prime 2.0 software (Psychology Software Tools) and MR-compatible goggles with two LCD-displays (VisualSystems; NordicNeuroLab), while responses were collected using an MR-compatible response grip with two response buttons (ResponseGrip; NordicNeuroLab). The total duration of the single functional scanning run was ∼19 min.
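As a consistency check, the trial counts and timing reported above can be recomputed directly. The sketch below assumes that null trials have the same total duration as task trials (including the question period) and that the fixation cross and prompt each last 750 ms, as stated for all stimuli; the variable names are illustrative.

```python
STIMULUS_MS = 750            # fixation cross, prompt, and each word
N_WORDS = 5
QUESTION_MS = 3000           # response window
TR_S = 2.0                   # repetition time of the functional sequence
N_VOLUMES = 563

N_CONDITIONS = 4             # Congruent, Incongruent, Scrambled, Letter String
TRIALS_PER_CONDITION = 30
NULL_TRIALS = 15             # assumed to last as long as a task trial

n_trials = N_CONDITIONS * TRIALS_PER_CONDITION + NULL_TRIALS
trial_ms = (2 + N_WORDS) * STIMULUS_MS + QUESTION_MS   # fixation + prompt + words + question
task_min = n_trials * trial_ms / 1000 / 60             # duration of the task itself
scan_min = N_VOLUMES * TR_S / 60                       # duration of the acquisition
```

Under these assumptions the task lasts about 18.6 min and the acquisition about 18.8 min, both in line with the reported run length of roughly 19 min.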
The sentence stimuli used in the present experiment were constructed with the aim of maximizing the predictability of the final word in the congruent and incongruent conditions, with the additional constraint that the presented target word in the incongruent condition constitute a violation of these predictions. We confirmed this by presenting 100 participants with the context phrase (four initial words) for the 30 congruent and 30 incongruent sentences and asking them to generate a terminal, target word. Cloze probability, the ratio of participants who used the actual target word to complete the sentences, was 0.85 (SD: 0.19) for congruent sentences and 0 for incongruent sentences. Word frequency, defined as the number of occurrences per million words, was extracted from a large database of Norwegian words (The Text Laboratory, ILN, University of Oslo; http://www.tekstlab.uio.no/frekvensordlister/).
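Cloze probability as defined above is simply the proportion of respondents who complete the context with the actual target word. A minimal sketch follows; the completions are hypothetical (the individual norming responses are not reported), and the case and whitespace normalization is an assumption about how responses might be scored.

```python
def cloze_probability(completions, target):
    """Proportion of completions that match the target word."""
    normalized = [c.strip().lower() for c in completions]
    return normalized.count(target.strip().lower()) / len(normalized)


# Hypothetical responses from 100 raters to the context "two plus two is ..."
completions = ["four"] * 85 + ["five"] * 10 + ["two"] * 5
congruent_cloze = cloze_probability(completions, "four")

# For an incongruent item, no rater produces the presented target word
incongruent_cloze = cloze_probability(completions, "cars")
```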
A repeated-measures ANOVA was conducted to compare the conditions in terms of word frequency given that this variable is known to influence the blood oxygenation level-dependent (BOLD) response (Chee et al., 2002; Carreiras et al., 2006; Grande et al., 2011). When all five words were included, there was no significant difference between conditions in word frequency (F(2,447) = 1.309, p = 0.271). However, when the analysis was restricted to the target words, the effect of condition was reliable (F(2,87) = 3.944, p < 0.05), with the difference occurring because a number of high-frequency function words (e.g., “and” and “in”) appeared in the target position for some of the “sentences” in the scrambled condition. Given the frequency differences, a post hoc procedure was applied to the stimulus sets to equate the conditions for target word frequency. We excluded six to eight sentences for each condition, eliminating those sentences in which the target word was at either extreme in terms of frequency (high or low frequency). This approach effectively equalized mean word frequency across conditions (F(2,67) = 0.241, p = 0.787; Means: Congruent: 16.56 [13.54]; Incongruent: 15.04 [15.83]; Scrambled: 13.27 [18.46]), with the penalty of slightly reducing power by reducing the number of trials per condition. All fMRI analyses were performed both with the full set of sentences and with the pruned sentence sets in which word frequency was better controlled.
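The degrees of freedom reported above are what a one-way layout over items would give: 3 conditions × 30 sentences × 5 words yields 450 words and df = (2, 447), and 90 target words yield df = (2, 87). The sketch below computes an F statistic for such a layout. It is an illustration with toy values, not a reanalysis; treating the comparison as a plain one-way ANOVA over items is an assumption made here for simplicity.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way layout over items."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variability of the items around their own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy frequencies for three conditions (hypothetical values)
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_between, df_within = one_way_anova(toy_groups)
```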
Scan acquisition.
Scanning was conducted on a 3 T Philips Achieva whole-body scanner with an 8-channel Philips SENSE head coil (Philips Medical Systems). Functional images were obtained with a single-shot T2*-weighted echo planar imaging sequence (repetition time (TR): 2000 ms; echo time (TE): 30 ms; field of view (FOV): 240 × 240 × 108 mm; imaging matrix: 80 × 80; flip angle: 80°; 36 axial slices, interleaved, 3 mm thickness, no gap; voxel size: 3 × 3 × 3 mm). The scanning session consisted of 563 volumes, synchronized to the onset of the experiment. To obtain complete coverage of the cerebellum, the slice orientation was adjusted to be ∼45° relative to the line running from the anterior to posterior commissure. This orientation resulted in parts of the posterior frontal lobe and superior parietal lobe falling outside the FOV. A T1-weighted anatomical image with a voxel size of 1 × 1 × 1 mm was recorded for registration of the functional images (180 sagittal slices; TR: 8.5 ms; TE: 2.3 ms; FOV: 256 × 256 × 180 mm; flip angle: 7°).
Imaging analysis.
Functional images were converted to 4D NIfTI files (http://lcni.uoregon.edu/∼jolinda/MRIConvert/) and analyzed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8). Images were corrected for slice timing, realigned to correct for residual head movement, and coregistered to the anatomical image. Following these preprocessing steps, the analysis stream was split into a cerebellum-specific analysis to address the main hypotheses of the study, and a whole-brain analysis to examine cerebral activation patterns.
For the cerebellum-specific analysis, unsmoothed images were first analyzed in native space using a general linear model (GLM). Event-related regressors, modeled as delta functions time locked to the onset of the target word, were created for the four trial types (congruent, incongruent, scrambled, letter strings). These functions were convolved with the canonical hemodynamic response function. Low-frequency drifts were removed using a high-pass filter (cutoff 128 s) and six head motion parameters from the realignment step were included as additional regressors. Serial correlations in fMRI time series were accounted for by the autoregressive AR(1) model.
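The construction of an event-related regressor described above (delta functions convolved with a hemodynamic response function) can be sketched as follows. This is an illustration, not the SPM8 implementation: the double-gamma shape parameters (6 and 16, with an undershoot ratio of 1/6) are assumed values in the style of a canonical HRF, the onsets are hypothetical, and sampling is done once per TR rather than on a finer time grid.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1.0 / 6.0):
    """Double-gamma response at time t (seconds); both gammas use unit rate."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    undershoot = (t ** (undershoot_shape - 1) * math.exp(-t)
                  / math.gamma(undershoot_shape))
    return peak - ratio * undershoot


def build_regressor(onsets_s, n_scans, tr_s=2.0, hrf_length_s=32.0):
    """Convolve delta functions at the given onsets with the HRF, one sample per TR."""
    # Delta (stick) functions: onsets are assumed to fall on the TR grid
    sticks = [0.0] * n_scans
    for onset in onsets_s:
        sticks[int(round(onset / tr_s))] += 1.0
    kernel = [double_gamma_hrf(k * tr_s) for k in range(int(hrf_length_s / tr_s) + 1)]
    # Discrete convolution, truncated at the end of the run
    regressor = [0.0] * n_scans
    for i, amplitude in enumerate(sticks):
        if amplitude == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k < n_scans:
                regressor[i + k] += amplitude * h
    return regressor


# Two hypothetical target-word onsets (in seconds) within a 40-volume run
regressor = build_regressor(onsets_s=[10.0, 40.0], n_scans=40)
peak_scan = max(range(len(regressor)), key=lambda i: regressor[i])
```

One such regressor per trial type, together with the motion parameters, would form the columns of the first-level design matrix.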
The anatomical images were normalized to a high-resolution cerebellar template, the Spatially Unbiased Infratentorial Template (SUIT; Diedrichsen, 2006), allowing us to bring the functional contrast images (the weighted sums of single β-images) into a common template space, resliced to 2 × 2 × 2 mm voxels. The normalized contrast images were smoothed with a 3D Gaussian kernel (4 mm full-width at half-maximum; FWHM). Statistical analyses were performed using random-effects analyses on these images.
For the whole-brain analysis, we normalized the anatomical images to the MNI template using the unified segmentation and normalization algorithm implemented in SPM8 (Ashburner and Friston, 2005). The resulting transformation parameters were then applied to the functional images. Images were smoothed with a Gaussian kernel of 8 mm FWHM and analyzed using the same GLM as was used in the cerebellum-specific analysis.
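The two smoothing kernels are specified by their FWHM, which relates to the standard deviation of the Gaussian by σ = FWHM / (2√(2 ln 2)). The short sketch below converts the 4 mm and 8 mm kernels used here; the function name is illustrative.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation (mm) of a Gaussian with the given full-width at half-maximum."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_cerebellum_mm = fwhm_to_sigma(4.0)            # cerebellum-specific analysis
sigma_whole_brain_mm = fwhm_to_sigma(8.0)           # whole-brain analysis
sigma_cerebellum_vox = sigma_cerebellum_mm / 2.0    # in units of 2 mm template voxels
```

The cerebellar kernel therefore has a standard deviation of about 1.7 mm, i.e., less than one 2 mm voxel, which preserves spatial detail relative to the 8 mm kernel used for the cerebral analysis.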
We recognize that the presentation of single words can be sufficient to generate predictions of, or prime, related words (Meyer and Schvaneveldt, 1971); indeed, the presentation of pairs of unrelated words elicits an electrophysiological response (N400) associated with semantic prediction errors (Ortu et al., 2013). Consequently, the Scrambled condition is likely to involve predictions and prediction errors to some degree. Note, however, that the crucial comparisons in the present study involve the final target word and that extensive research on the N400 response (for review, see Kutas and Federmeier, 2011) has shown that predictability in higher level language structure (such as sentences or discourse) can amplify and even override lower level priming effects (e.g., word-pair associations or repetition priming). Based on these findings, we assume that predictions generated in response to contextually isolated words in the Scrambled condition are considerably weaker (i.e., less precise) than the strong predictions generated after the presentation of the initial four words in the Congruent and Incongruent conditions (see also the behavioral validation of prediction strength, i.e., cloze probability, below). Thus, our main focus for assessing predictions and violations of predictions will be on the comparison of the Congruent and Incongruent conditions.
Even though response speed was de-emphasized in the current experiment, we expected we would observe reaction time (RT) differences between the experimental conditions (Debruille and Renoult, 2009). RT differences between conditions can influence the resulting activation patterns in at least two ways (Grinband et al., 2008). First, neural activity related to the actual response (motor preparation, sensory feedback) can be shifted in time between short and long RTs, potentially affecting the hemodynamic response to the preceding stimuli in differential ways. Second, RT differences might reflect differences in time spent on the task, leading to a larger summation of the BOLD response for conditions associated with long RTs (Grinband et al., 2008). We addressed this issue using a two-step approach. First, we included the single trial RTs as additional regressors in the first-level analysis, using these values to modulate each of the four condition regressors. That is, each condition was modeled by a constant regressor representing the task condition, and an orthogonalized regressor that accounted for effects of trial-by-trial variability in RT. Second, since mean RT would still be reflected in the constant condition regressor, we also calculated, on an individual basis, the mean RT difference between any pair of contrasted conditions. The difference scores were then included as a covariate in the group analyses of the contrast images. Importantly, this approach makes no assumptions regarding the nature or consequences of RT differences (e.g., temporal shift or differential time on task). Rather, the procedure assumes that, if such differences do influence the BOLD response, these effects will be captured in the regressors modeling trial-by-trial variability in RT, as well as in the second-level covariates modeling mean RT-differences between conditions.
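The two-step RT control can be sketched in simplified form. For a single modulator, orthogonalizing it with respect to the constant condition regressor amounts to mean-centering the trial RTs before convolution; the second step reduces to one mean RT difference per participant and contrast. The RT values and function names below are hypothetical, and convolution with the hemodynamic response function is omitted.

```python
def mean(values):
    return sum(values) / len(values)


def rt_modulator(trial_rts_ms):
    """Mean-centered RTs: trial-by-trial amplitudes orthogonal to the constant regressor."""
    m = mean(trial_rts_ms)
    return [rt - m for rt in trial_rts_ms]


def mean_rt_difference(rts_a_ms, rts_b_ms):
    """Second-level covariate: mean RT difference between two contrasted conditions."""
    return mean(rts_a_ms) - mean(rts_b_ms)


# Hypothetical single-participant RTs (ms)
congruent_rts = [650.0, 700.0, 720.0, 730.0]
incongruent_rts = [850.0, 900.0, 910.0, 940.0]

modulator = rt_modulator(congruent_rts)
constant = [1.0] * len(congruent_rts)            # amplitudes of the condition regressor
dot = sum(c * m for c, m in zip(constant, modulator))
covariate = mean_rt_difference(incongruent_rts, congruent_rts)
```

Because the modulator sums to zero, it carries only trial-by-trial variability, while the mean RT difference (here 200 ms) is left to the group-level covariate.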
A significance level of 5% (FDR corrected for multiple comparisons) was adopted for all analyses. To this end, we set a voxelwise cluster-forming threshold of p < 0.005 (uncorrected), with statistical significance assessed by evaluating the volume of the active clusters (Chumbley and Friston, 2009). Unthresholded statistical maps were uploaded to the NeuroVault.org database and are available at http://neurovault.org/collections/25/.
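Cluster-level FDR control of this kind applies a step-up procedure to the uncorrected p values of the clusters that survive the cluster-forming threshold. The sketch below shows a standard Benjamini-Hochberg step-up procedure on a list of hypothetical cluster p values. It does not reproduce the random-field computation of those p values from cluster volume, and it should be read as an illustration of the principle rather than as the SPM8 implementation.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which p values are significant at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p value satisfies p_(k) <= k * q / m
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / m:
            largest_rank = rank
    # Everything ranked at or below that point is declared significant
    significant = [False] * m
    for index in order[:largest_rank]:
        significant[index] = True
    return significant


# Hypothetical uncorrected p values for five clusters
cluster_p = [0.30, 0.001, 0.039, 0.008, 0.0395]
significant = benjamini_hochberg(cluster_p, q=0.05)
```

Note the step-up behavior: the cluster with p = 0.039 exceeds its own rank threshold (0.03) but is still retained, because a cluster ranked after it (p = 0.0395) meets its threshold (0.04).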
Results
Behavior
Overall accuracy across conditions was 96%, indicating that participants judged congruent sentences as meaningful, and judged incongruent sentences, scrambled words, and letter strings as meaningless. As expected, RTs differed considerably across conditions (F(3,93) = 17.779, p < 0.0001). RTs were much faster in the Congruent condition (mean: 704.3 ms; SD: 260.3 ms) compared with the Incongruent condition (mean: 900.9 ms; SD: 309.0 ms). The means for the Scrambled (mean: 783.7 ms; SD: 358.4 ms) and Letter String conditions (mean: 807.3 ms; SD: 227.7 ms) fell between the means for the Congruent and Incongruent conditions. Note that participants in the Scrambled and Letter String conditions could anticipate their response well in advance of the target word.
RT effects on the BOLD signal
Figure 2a depicts regions in which the BOLD response was sensitive to trial-to-trial variability in response time, averaged over the four RT modulator regressors. This analysis revealed significant positive relationships in a number of brain areas, indicating a pattern of increased BOLD response with increased RT. Consistent with previous studies, this effect was most evident in medial frontal cortex (Yarkoni et al., 2009; Grinband et al., 2011), as well as anterior insula, prefrontal and parietal cortices, and several subcortical areas (Yarkoni et al., 2009). The cerebellum-specific analysis confirmed RT-related effects in several cerebellar clusters. No regions showed a negative relationship between the BOLD response and RT.
BOLD activations related to visuomotor aspects of the task and processing of single words
Figure 2 further shows cerebellar and cerebral activations in (b) the Letter String (perceptual control) condition relative to the implicit experimental baseline and (c) the contrast of Scrambled words versus Letter Strings. As expected, the Letter String condition primarily activated areas involved in visual perception and motor control–including strong activations in cerebellar lobules V and VIII. Areas showing greater activation in the Scrambled condition compared with the Letter String condition were predominantly left-lateralized cerebral areas associated with linguistic processing (Price, 2010), as well as extensive–primarily right-lateralized–cerebellar activations.
Contextual priming effects on the cerebellar BOLD response
Figure 3 and Table 1 show the results related to the core hypotheses of this study. We first examined the effects of predictability on the cerebellar BOLD signal by contrasting the Congruent and Scrambled conditions (here and elsewhere, regressing out the effect of RT). This analysis revealed a significant cluster in the right posterior cerebellum, Crus I and II (Fig. 3a), with the BOLD response larger when the context made the target word predictable. While additional clusters were also observed in right lobule IX and bilateral brainstem, these did not remain significant when the analysis was restricted to sentences in which word frequency was equated between conditions.
To examine the effects of violating predictions in sentence comprehension, a contrast was performed between the Incongruent and Congruent conditions. This analysis revealed greater activation in the Incongruent condition across an extensive region of the posterior cerebellum, bilaterally (Fig. 3b). We also looked at hemodynamic correlates of prediction violations by comparing the Incongruent and Scrambled conditions. Notably, in both of these conditions the actual target word is unpredictable. However, only in the Incongruent condition does the presentation of the target word constitute a violation of a (strong) prediction, established by the preceding context. Consistent with the contrast of the Incongruent and Congruent conditions, cerebellar activation was stronger in the Incongruent condition compared with the Scrambled condition, and these effects were also bilateral (Fig. 3c).
In addition to providing an assay of prediction violations, the comparison of the Incongruent and Scrambled conditions provides a second probe of regions correlated with the generation of predictions, given the assumption that the first four words of the Incongruent condition allow for the generation of a strong (but violated) prediction, whereas the Scrambled condition does not. As shown in Figure 3d, there was substantial spatial overlap between the Congruent > Scrambled and Incongruent > Scrambled contrasts, with the extent of activation greater in the latter contrast.
Contextual priming effects on cerebral BOLD signal
Figure 4 and Table 2 present the results from the whole-brain analysis for activations in the cerebral cortex. For the contrast examining prediction generation, activation was greater in the Congruent compared with Scrambled condition in bilateral superior medial cortex, bilateral inferior frontal gyrus, bilateral supramarginal/postcentral cortex, and left middle temporal gyrus, as well as bilateral caudate, thalamus, and brainstem. A broadly similar pattern was observed in the contrast of the Incongruent versus the Congruent condition, although the effects were stronger and also extended into dorsolateral prefrontal cortex. Contrasting the Incongruent over the Scrambled condition confirmed that violations of semantic predictions produce a broad activation pattern, with the BOLD signal significantly greater in superior medial frontal cortex, bilateral inferior frontal gyri, and lateral prefrontal cortex, as well as in the left posterior middle temporal gyrus, left fusiform gyrus, and left angular gyrus.
Discussion
Various lines of evidence have implicated the cerebellum in language processing, with the most compelling evidence coming from an extensive body of neuroimaging studies showing cerebellar activation during a range of linguistic tasks (for review, see Murdoch, 2010). Several functional hypotheses have been proposed, ranging from ideas that highlight language-specific functions, such as the activation of potential phonological codes (Nicolson et al., 2001), covert articulation (Chen and Desmond, 2005), and the coordination of lexical search (Desmond et al., 1998), to more generic ideas, for instance, concerning a role for the cerebellum in supporting attentional shifts (Allen et al., 1997). Here we build on the hypothesis that the cerebellum plays a central role in predictive behavior, learning internal models that can be used to generate context-specific expectancies (Ramnani, 2006), exploring a potential functional parallelism across the domains of motor control and language. Our extension of the internal model hypothesis to language is further grounded in the observation that, despite its generative capacity, language is highly redundant, with communication facilitated through the predictive interactions between speaker and listener (Pickering and Garrod, 2007).
We tested this hypothesis using a contextual semantic priming task. As hypothesized, the BOLD signal in the right posterior cerebellum increased when the target word was predictable. We recognize that, while cerebellar activation was modulated by predictability, the actual predictions could be generated elsewhere in the brain; indeed, the whole-brain analysis revealed that a predictable linguistic context engaged a broad network of cortical and subcortical areas (Fig. 4, Table 2). Importantly, Lesage et al. (2012) recently provided compelling TMS evidence for a role of the cerebellum in the generation of linguistic predictions. Participants listened to spoken sentences and were required to look, as quickly as possible, at one of four pictures that corresponded to the last word. The sentences either provided a context that strongly predicted the final word or created a context in which all of the pictures were equally plausible. Crucially, repetitive TMS applied over the right cerebellar hemisphere selectively slowed saccade RTs in the predictive condition. A domain-general role of the cerebellum in predictive functions is further supported by studies demonstrating that patients with cerebellar lesions show impairments on tests of predictive motor control (Bastian, 2006) and abnormal effects of predictability on electrophysiological measures of basic auditory processing (Knolle et al., 2012, 2013).
Our second main finding was the increased cerebellar activation when contextual expectancies were violated. Error-based learning has been a central tenet of cerebellar computational theories (Marr, 1969; Albus and Branch, 1971; Doya, 1999; Ito, 2006). Consistent with this idea, cerebellar patients show deficits on sensorimotor adaptation tasks (Tseng et al., 2007), where errors experienced during one trial lead to corrective adjustments in the motor output on the next trial (Wolpert et al., 2011). Furthermore, error-related increases in BOLD have been observed in the cerebellum in studies of sensorimotor control (Schlerf et al., 2012) and learning (Imamizu et al., 2000; Imamizu and Kawato, 2012), as well as during the perception of visual sequences (Bubic et al., 2009). The present findings extend this line of research by providing evidence for cerebellar error-related activity during language processing. While not the main focus of the current study, the cerebral activation patterns were in general agreement with previous studies investigating the processing of semantic violations (Lau et al., 2008).
The cerebellar results in our language task showed intriguing similarities with those previously reported in an fMRI study of sensorimotor learning (Fig. 3e; Imamizu et al., 2000). In the motor study, a spatially widespread activation pattern was observed at the start of training, correlated with the magnitude of performance errors. This activation decreased, both in intensity and extent, with learning. However, even after prolonged learning–and a stabilization of error magnitude–a focal area of activation remained, which the authors attributed to the recruitment of an acquired internal model (Fig. 3b). These results cannot be directly compared with the present findings, given that language regularities are highly overlearned, unlike the novel tool used in the Imamizu study. Nonetheless, together, the results suggest similar cerebellar activation dynamics across task domains.
The current findings are in general agreement with previous studies of cerebellar involvement in language processing. As noted above, Lesage et al. (2012) were able to disrupt anticipatory responses due to sentential priming by applying rTMS over the right cerebellar hemisphere. Using magnetoencephalography, Kujala et al. (2007) found increased coherence between the cerebellum and the left temporal pole when participants read meaningful sentences compared with reading the same words in scrambled order. Similarly, an fMRI study reported increased activation in the right posterior cerebellum when subjects read sentences and narratives relative to the same words presented in random sequences (Xu et al., 2005). In all of these studies, the critical variable was the predictability of the presented words–with increased cerebellar involvement for predictable relative to unpredictable conditions.
More generally, neuroimaging (Stoodley and Schmahmann, 2009; Buckner et al., 2011) and neuroanatomical (Strick et al., 2009) studies have linked the cerebellar areas activated in the current study to higher cognitive function. For example, Balsters et al. (2013) observed activation changes in these areas when people were required to use abstract higher order rules. Rule-based reasoning likely entails some degree of linguistic encoding. Conversely, language, by its nature, entails complex, hierarchically nested representations, similar to the kinds of representations required for understanding abstract, higher order rules. An interesting question for future research will be to design tasks that seek to compare the internal model and rule representation hypotheses. Perhaps cerebellar activation observed when people are asked to reason with complex rules reflects the operation of an anticipatory thought process required to simulate possible outcomes generated from a set of nested rules. Further, prediction is a ubiquitous feature of nervous systems, a point emphasized in the general theories of predictive coding (Rao and Ballard, 1999) and the Bayesian brain (Friston, 2010). Thus, another important set of remaining questions concerns the unique characteristics of cerebellar predictive mechanisms. Current theoretical accounts–based on microcircuit anatomy, neuroimaging, and patient studies–suggest context specificity (Ramnani, 2006), automaticity (Ito, 2008), and precise timing (Ivry and Spencer, 2004) as potential constraints for characterizing the predictive capabilities of the cerebellum.
Some limitations of the present study need to be addressed. First, word frequency differed between conditions, with higher frequency words more likely to occur in the final position in the scrambled condition. However, we find it unlikely that the results can be attributed to word frequency effects since the main reported findings remained significant in the control analyses. Second, since we used fixed stimulus intervals, the BOLD response to the preceding sentence might have influenced the BOLD response to the target word. However, this cannot account for the difference between the incongruent and congruent sentences as these conditions only differed with respect to the final target word.
Third, the RT differences across conditions raise the possibility that the cerebellar activations were more related to differences in motor preparation than to differences in language processing. We addressed this concern by including trial-by-trial RTs and condition RT differences in the analyses, a procedure that proved sufficiently sensitive to detect effects of RT variability, including within some cerebellar foci. Nonetheless, the cerebellar activations related to both linguistic predictability and violations of linguistic predictions remained reliable even when we included the RT data as regressors in our model. Moreover, the cerebellar activations associated with word predictability were located in areas more closely associated with cognitive functions than motor control (Stoodley and Schmahmann, 2009).
Fourth, we recognize that word predictability is confounded, to some extent, with other variables; for example, predictable sentences are also likely to be experienced as more meaningful and incongruent target words are likely to not only generate an error signal, but also engage brain areas involved in sentence comprehension. Future studies will be required to examine the relationships between such variables as prediction, meaningfulness, error processing, and comprehension. We note, however, that our interpretation is consistent with results from numerous studies of cerebellar contributions to language, perception, and motor control. Thus, while alternative explanations cannot be conclusively ruled out, we believe that the generation of predictions in the cerebellum constitutes the most parsimonious account of the present findings.
Finally, and perhaps most challenging, the current results, along with those obtained in numerous other imaging studies demonstrating cerebellar activations during language tasks, stand in apparent contradiction to the clinical literature. Pronounced language comprehension deficits are not observed in patients with acquired cerebellar pathology (Alexander et al., 2012). However, it is important to note that the motor deficits observed in ataxia tend to become most evident during complex movements. By analogy, cerebellar predictions might aid, but in a strict sense not be essential for, language comprehension. As a case in point, while words can certainly be understood when presented alone, presenting them within a predictive context significantly aids comprehension; for instance, prediction can increase reading speed by allowing the reader to skip predictable words, or facilitate comprehension in noisy environments (Pickering and Garrod, 2007). Future neuropsychological studies should investigate the effects of cerebellar lesions on language comprehension in more complex contexts (such as noisy environments) or by comparing conditions in which the degree of predictability is manipulated.
In conclusion, we found that the predictability of visually presented words modulated the BOLD response in the posterior cerebellum. The observed activation patterns matched our expectations based on previous imaging studies of internal models in sensorimotor control. Thus our results are in line with the idea of a generalized role for the cerebellum in encoding internal models, a hypothesis that offers a unified perspective on cerebellar function in motor and non-motor tasks.
Footnotes
This work is supported by the National Institutes of Health Grant HD060306 (R.B.I.) and the US-Norway Fulbright Grant Program (T.M.). We thank Markus Handal Sneve and Haakon Engen for valuable feedback on early drafts of this paper.
The authors declare no competing financial interests.
Correspondence should be addressed to Torgeir Moberget, Department of Psychology, University of Oslo, 0317 Oslo, Norway. torgeir.moberget{at}gmail.com