Abstract
To understand what you are reading now, your mind retrieves the meanings of words from a linguistic knowledge store (lexico-semantic processing) and identifies the relationships among them to construct a complex meaning (syntactic or combinatorial processing). Do these two sets of processes rely on distinct, specialized mechanisms or, rather, share a common pool of resources? Linguistic theorizing and empirical evidence from language acquisition and processing have yielded a picture whereby lexico-semantic and syntactic processing are deeply inter-connected. In contrast, most current proposals of the neural architecture of language continue to endorse a view whereby certain brain regions selectively support lexico-semantic storage/processing whereas others selectively support syntactic/combinatorial storage/processing, despite inconsistent evidence for this division of linguistic labor across brain regions. Here, we searched for a dissociation between lexico-semantic and syntactic processing using a powerful individual-subjects fMRI approach across three sentence comprehension paradigms (n=49 participants total): responses to lexico-semantic vs. syntactic violations (Experiment 1); recovery from neural suppression across pairs of sentences differing in lexical items vs. syntactic structure (Experiment 2); and same/different meaning judgments on such sentence pairs (Experiment 3). Across experiments, both lexico-semantic and syntactic conditions elicited robust responses throughout the language network. Critically, no regions were more strongly engaged by syntactic than lexico-semantic processing, although some regions showed the opposite pattern. 
Thus, contra many current proposals of the neural architecture of language, lexico-semantic and syntactic/combinatorial processing are not separable at the level of brain regions – or even voxel subsets – within the language network, in line with strong integration between these two processes that has been consistently observed in behavioral language research. The results further suggest that the language network may be generally more strongly concerned with meaning than structure, in line with the primary function of language – to share meanings across minds.
Introduction
What is the functional architecture of human language? A core component is a set of knowledge representations, which include knowledge of words and their meanings, and the probabilistic constraints on how words can combine to create compound words, phrases, and sentences. During comprehension (decoding of linguistic utterances), we look for matches between the incoming linguistic signal and these stored knowledge representations in an attempt to re-construct the intended meaning, and during production (encoding of linguistic utterances), we search our knowledge store for the right words/constructions and combine and arrange them in a particular way to express a target idea.
How is this rich set of representations and computations structured? Which aspects of language are functionally dissociable from one another? Traditionally, two principal distinctions have been drawn: one is between words (the lexicon) and rules (the grammar) (e.g., Chomsky, 1965, 1995; Fodor, 1983; Pinker & Prince, 1988; Pinker, 1991, 1999); and another is between linguistic representations themselves (i.e., our knowledge of the language) and their online processing (i.e., accessing them from memory and combining them to create new complex meanings and structures) (e.g., Chomsky, 1965; Fodor et al., 1974; Newmeyer, 2003). Because these two dimensions are, in principle, orthogonal, we could have distinct mental capacities associated with i) knowledge of word meanings, ii) knowledge of grammar (syntactic rules), iii) access of lexical representations (in comprehension or production), and iv) parsing (in comprehension) or construction (in production) of syntactic structures (Fig. 1a).
A (non-exhaustive) set of theoretically possible architectures of language. Distinct boxes correspond to distinct brain regions (or sets of brain regions; e.g., in 1a-d, “syntactic/combinatorial processing” may recruit a single region or multiple regions, but critically, this region or these regions do not support other aspects of language processing, like understanding word meanings). The architectures differ in whether they draw a (region-level) distinction between the lexicon and grammar (1a vs. 1b-f), between storage and access of linguistic representations (1a-b vs. 1c-f), and critically, in whether syntactic/combinatorial processing is a separable component (1a-d vs. 1e-f).
However, both of these distinctions have been long debated. For example, as linguistic theorizing evolved and experimental evidence accumulated through the 1970s-90s, the distinction between the lexicon and grammar began to blur, for both linguistic knowledge representations and processing (e.g., Fig. 1b; see Snider & Arnon, 2012, for a summary and discussion). Many have observed that much of our grammatical knowledge does not operate over highly general categories like nouns and verbs, but instead requires reference to particular words or word classes (e.g., Lakoff, 1970; Bybee, 1985, 1998, 2010; Levin, 1993; Goldberg, 1995, 2002; Jackendoff, 2002, 2007; Sag et al., 2003; Culicover & Jackendoff, 2005; Levin & Rappaport Hovav, 2005). As a result, current linguistic frameworks incorporate lexical knowledge as part of the knowledge of the grammar, although they differ as to the degree of abstraction that exists above and beyond knowledge of how particular words combine with other words (see e.g., Hudson, 2007, for discussion), and in whether abstract syntactic representations (like the double object, passive, or question constructions) are always associated with meanings or functions (e.g., Pinker, 1989; Goldberg, 1995; cf. Chomsky, 1957; Branigan & Pickering, 2017a; see Jackendoff, 2002, for discussion).
In line with these changes in linguistic theorizing, experimental and corpus work in psycholinguistics has established that humans i) are exquisitely sensitive to contingencies between particular words and the constructions they occur in (e.g., Clifton et al., 1984; MacDonald et al., 1994; Trueswell et al., 1994; Garnsey et al., 1997; Traxler et al., 2002; Reali & Christiansen, 2007; Roland et al., 2007; Jaeger, 2010), and ii) store not just atomic elements (like morphemes and non-compositional lexical items), but also compositional phrases (e.g., “I don’t know” or “give me a break”; e.g., Wray, 2005; Evert, 2008; Arnon & Snider, 2010; Christiansen & Arnon, 2017) and constructions (e.g., “the X-er the Y-er”; Goldberg, 1995; Culicover & Jackendoff, 1999). The latter suggested that the linguistic units people store are determined not by their nature (i.e., atomic vs. not) but instead, by their patterns of usage (e.g., Bybee, 1998, 2006; Goldberg, 2006; Barlow & Kemmer, 2000; Langacker, 1986, 1987; Tomasello, 2003). Further, people’s lexical abilities have been shown to strongly correlate with their grammatical abilities – above and beyond shared variance due to general fluid intelligence – both developmentally (e.g., Bates et al., 1995; Bates & Goodman, 1997; Dixon & Marchman, 2007; Hoff et al., 2018) and in adulthood (e.g., Dabrowska, 2018). Thus, linguistic mechanisms that have been previously proposed to be distinct are instead tightly integrated with one another or, perhaps, are so cognitively inseparable as to be considered a single apparatus.
The distinction between stored knowledge representations and online computations has also been questioned (see Hasson et al., 2015, for a recent discussion of this issue in language and other domains). For example, by using the same artificial network to represent all linguistic experience, connectionist models dispense not only with the lexicon-grammar distinction but also the storage-computation one, and assume that the very same units that represent our linguistic knowledge support its online access and processing (e.g., Rumelhart and McClelland, 1986; Seidenberg, 1994; see also Goldinger, 1996; Bod, 1998, 2006, for exemplar models, which also abandon the storage-computation divide).
Alongside psycholinguistic studies, which inform debates about linguistic architecture by examining the behaviors generated by language mechanisms, and computational work, which aims to approximate human linguistic behavior using formal models, a different, complementary approach is offered by cognitive neuroscience studies. These studies examine how the relevant cognitive mechanisms are neurally implemented. Here, the assumption that links neuroimaging (and neuropsychological patient) data to cognitive hypotheses is as follows: to the extent that two mental capacities are functionally distinct, they may be implemented in distinct brain regions or sets of regions. Such brain regions would be expected to show distinct patterns of response, and their damage should lead to distinct patterns of deficits.
A (non-exhaustive) set of theoretically possible neural architectures is schematically illustrated in Figure 1, with distinct boxes corresponding to distinct brain regions (or sets of regions). These architectures differ in whether they draw a (region-level) distinction between the lexicon and grammar (1a vs. 1b-f), between storage and access of linguistic representations (1a-b vs. 1c-f), and in whether syntactic/combinatorial processing is a separable component (1a-d vs. 1e-f). (Here, and in subsequent discussions of possible linguistic architectures, we talk about not just “syntactic”, but “syntactic/combinatorial” processing given that combining words into complex – phrase- and sentence-level – representations requires both syntactic structure building and semantic composition, and different researchers construe this combinatorial process with different foci / at different grain levels. However, in the discussion of the paradigms used in the current study, we talk about “syntactic” processing, following the terminology of prior studies that have relied on these paradigms.) Over the years, a number of paradigms have been developed in an effort to constrain these architectures. Some paradigms have varied the presence or absence of lexico-semantic and syntactic information in the signal; others have tried to more strongly tax the processing of word meanings vs. syntactic/combinatorial processing; yet others have made the meaning of a particular word vs. the structure of the sentence more salient / task-relevant. In many cases, researchers have not been clear as to whether their manipulation specifically targeted linguistic knowledge (i.e., knowledge of word meanings vs. syntactic structures), online processing (i.e., access of word meanings vs. syntactic rules vs. combining linguistic elements to create new complex representations), or both, perhaps because the storage/computation distinction is difficult to evaluate empirically (cf. Mirman and Britt, 2013).
Nevertheless, if a brain region (or set of regions) engages selectively, or at least preferentially (more strongly), in response to lexico-semantic information or processing demands, and another brain region (or set of regions) selectively/preferentially responds to syntactic/combinatorial information or processing demands, this would give weight to architectures that draw a distinction between the two (i.e., 1a-d) over those that do not (i.e., 1e-f).
Indeed, in the 1990s and 2000s – when brain imaging methods became available – many have searched for and claimed to have observed a dissociation between brain regions that support lexico-semantic storage/processing and those that support syntactic, or more general combinatorial (e.g., compositional semantic), processing (e.g., Dapretto & Bookheimer, 1999; Embick et al., 2000; Friederici et al., 2000; Noppeney & Price, 2004; Cooke et al., 2006, among others). The alleged syntax-selective regions have sparked particular excitement due to claims that syntax is what makes human language unique (e.g., Hauser et al., 2002; Berwick et al., 2013; Friederici, 2018). However, taking the available evidence from cognitive neuroscience en masse, the picture that has emerged is rather complex.
First, the specific regions that have been argued to support lexico-semantic vs. syntactic/combinatorial processing, and the construal of these regions’ contributions, differ widely across studies and proposals (e.g., Friederici, 2011, 2012; Baggio and Hagoort, 2011; Tyler et al., 2011; Bemis & Pylkkanen, 2011; Duffau et al., 2014; Ullman, 2004, 2016). Second, although a number of diverse paradigms have been used across studies to probe lexico-semantic vs. syntactic/combinatorial processing, any given study has typically used a single paradigm, raising the possibility that the results reflect paradigm-specific differences between conditions rather than a general difference between lexico-semantic and syntactic/combinatorial computations. Further, many studies that claimed to have observed a dissociation have not reported the region-by-condition interactions needed to argue for a functional dissociation (Nieuwenhuis et al., 2011). Third, many studies that have argued for syntax selectivity did not actually examine lexico-semantic processing, focusing instead on syntactic complexity manipulations (e.g., Stromswold et al., 1996; Ben-Shachar et al., 2003; Bornkessel et al., 2005; Fiebach et al., 2005; see Friederici, 2011, for a meta-analysis). Although such studies (may) establish sensitivity of a brain region to syntactic complexity, they say little about its selectivity for syntactic over lexico-semantic processing. Finally, a number of neuroimaging studies have failed to observe a dissociation between lexico-semantic and syntactic processing (e.g., Keller et al., 2001; Roder et al., 2002; Fedorenko et al., 2010; Fedorenko et al., 2012; Bautista & Wilson, 2016; Blank et al., 2016; Fedorenko et al., 2016).
Relatedly, studies of patients with brain damage have failed to consistently link syntactic deficits with a particular locus within the language network, leading some to argue that syntactic processing is supported by the language network as a whole, including regions that are implicated in lexico-semantic storage/processing (e.g., Caplan et al., 1996; Dick et al., 2001; Wilson and Saygin, 2004; Mesulam et al., 2015).
Here we build on prior neuroimaging studies to search for a potential dissociation between lexico-semantic and syntactic storage/processing within the left-lateralized high-level language network (Fedorenko et al., 2010) using fMRI. Critically, the current study goes beyond prior fMRI studies in two important respects. First, we adopt a powerful individual-subjects analytic approach, where all the key comparisons are performed within individual participants. This approach contrasts with traditional fMRI analyses, which average individual activation maps in a common anatomical space and assume voxel-wise functional correspondence across individuals (e.g., Holmes & Friston, 1998). These traditional analyses run the risk of missing dissociations between brain regions/voxels selective for lexico-semantic vs. syntactic processing even if these are present in each individual brain, because precise activation loci exhibit high inter-individual variability, especially in the lateral frontal and temporal cortex (e.g., Fischl et al., 2008; Frost & Goebel, 2011); this risk also characterizes meta-analyses of activation peaks (e.g., Rodd et al., 2015; Hagoort & Indefrey, 2016). The individual-level functional localization approach we adopt here has been shown to yield higher sensitivity and functional resolution (e.g., Saxe et al., 2006; Nieto-Castañon & Fedorenko, 2012; Glezer & Riesenhuber, 2013) and thus gives us the best chance to discover selectivity for lexico-semantic or syntactic processing if such exists within the language network.
And second, we systematically examine three paradigms from the literature: responses to lexico-semantic vs. syntactic violations (Experiment 1); recovery from neural suppression across pairs of sentences differing in lexical items vs. syntactic structure (Experiment 2); and same/different meaning judgments on such sentence pairs (Experiment 3). Each paradigm thus has a condition that targets lexico-semantic processing, and another condition that targets syntactic processing. Experiments 1 and 3 are designed to tax lexico-semantic vs. syntactic processing more strongly by having a critical word be incompatible with the context in terms of its meaning or structural properties (in Experiment 1), or by forcing participants to focus on the meanings of the critical words vs. the structure of sentences (in Experiment 3). Experiment 2 relies on the well-established neural adaptation to the repetition of the same information across stimuli: here, repeating the words vs. the structure. Given that all three manipulations target the same theoretical distinction, they should, in principle, yield similar patterns of response. And if the observed patterns are indeed stable across paradigms, we can more confidently take them to reflect lexico-semantic vs. syntactic processing demands, as opposed to potentially paradigm-specific differences between conditions. Note that each of these paradigms can be potentially criticized for some flaw(s). However, as discussed in more detail below, these are exactly the paradigms that have been used in prior studies to argue for a dissociation between lexico-semantic and syntactic processing. We thus follow the literature in adopting these paradigms.
To foreshadow the results, we find that every brain region in the language network supports both lexico-semantic and syntactic processing. No region (or even set of non-contiguous voxels within these regions) shows a consistent preference, in the form of a stronger response, for syntactic processing, although some regions show the opposite preference. These results are in line with current linguistic theorizing, psycholinguistic evidence, and computational modeling work that all suggest tight integration between the lexicon and grammar at the level of both knowledge representations and processing, and contrary to the view that syntactic storage/processing is a separable component within the language architecture.
Materials and Methods
The potential dissociation between lexico-semantic and syntactic processing was evaluated across three language comprehension paradigms. The first paradigm is commonly used in ERP investigations of language processing and relies on violations of expectations about an incoming word set up by the preceding context. In particular, the critical word does not conform to either the lexico-semantic or the syntactic expectations (e.g., Kutas & Hillyard, 1980; Osterhout & Holcomb, 1992; Hagoort et al., 1993). This paradigm has been used in a number of prior fMRI studies (e.g., Embick et al., 2000; Cooke et al., 2006; Friederici et al., 2010; Herrmann et al., 2012). The second paradigm relies on neural adaptation, wherein repeated exposure to a stimulus leads to a reduction in response, and a change in some feature(s) of the stimulus leads to a recovery of response (see e.g., Krekelberg et al., 2006, for a general overview of the approach). This paradigm has also been used in prior fMRI studies that examined adaptation, or recovery from adaptation, to the lexico-semantic vs. syntactic features of linguistic stimuli (e.g., Noppeney & Price, 2004; Santi & Grodzinsky, 2010; Menenti et al., 2012; Segaert et al., 2012). Finally, the third paradigm was introduced in a classic study by Dapretto & Bookheimer (1999): pairs of sentences vary in a single word vs. in word order / syntactic structure. Either of these manipulations can result in the same meaning being expressed across sentences (if a word is replaced by a synonym, or if a syntactic alternation, like active→passive, is used) or in a different meaning (if a word is replaced by a non-synonym, or if the thematic roles are reversed). Participants make same/different meaning judgments on the resulting sentence pairs. Note that all three paradigms use sentence materials, which necessarily require both lexico-semantic and syntactic processing.
As a result, all conditions are expected to elicit reliable (above-baseline) responses throughout the language network. The critical question is whether we will observe a consistent preference (stronger responses) for lexico-semantic over syntactic conditions in some regions, and the opposite preference in other regions.
In an effort to maximize sensitivity and functional resolution (e.g., Nieto-Castañon & Fedorenko, 2012), we adopt an approach where all the key contrasts are performed within individual participants. We perform two kinds of analyses. In one set of analyses, we identify language-responsive cortex in each individual participant with an independent language localizer task based on a contrast between the reading of sentences vs. sequences of nonwords (Fedorenko et al., 2010), and examine the engagement of these language-responsive areas in lexico-semantic vs. syntactic processing in each critical paradigm. (The use of the same language localizer task across experiments allows for a straightforward comparison of their results, obviating the need to rely on coarse anatomy and reverse-inference reasoning for interpreting activations in functional terms (Poldrack, 2006, 2011).) To further ensure that we are not missing the critical dissociation by averaging across (relatively) large sets of language-responsive voxels, we supplement these analyses with analyses where we only use data from the critical paradigms. In particular, we use some of the data from a given critical task to search for the most lexico-semantic- vs. syntactic-selective voxels (e.g., in Experiment 1, voxels that respond more strongly to lexico-semantic than syntactic violations), and then test the replicability of this selectivity in left-out data (as detailed below).
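The logic of searching for selective voxels in one portion of the data and testing them in left-out data can be sketched as follows. This is only an illustration on synthetic data, not the analysis pipeline used in the study: the leave-one-run-out scheme, the 10% selection threshold, the array shapes, and the function name `left_out_selectivity` are all assumptions made for the example.

```python
import numpy as np

def left_out_selectivity(cond_a, cond_b, top_fraction=0.10):
    """Select the voxels with the largest a > b difference in all-but-one run,
    then measure the a - b difference for those voxels in the held-out run.

    cond_a, cond_b: arrays of shape (n_voxels, n_runs) holding one response
    estimate per voxel and run. The 10% threshold is an assumed value.
    """
    n_vox, n_run = cond_a.shape
    n_top = max(1, int(round(top_fraction * n_vox)))
    held_out_effects = []
    for test_run in range(n_run):
        train = [r for r in range(n_run) if r != test_run]
        train_diff = (cond_a[:, train] - cond_b[:, train]).mean(axis=1)
        top = np.argsort(train_diff)[-n_top:]  # most "selective" voxels
        test_diff = cond_a[top, test_run] - cond_b[top, test_run]
        held_out_effects.append(test_diff.mean())
    return float(np.mean(held_out_effects))

# Synthetic example: both conditions are drawn from the same distribution,
# so no true selectivity is built into the data.
rng = np.random.default_rng(0)
lexsem = rng.normal(1.0, 1.0, size=(500, 4))
syntax = rng.normal(1.0, 1.0, size=(500, 4))
null_effect = left_out_selectivity(lexsem, syntax)
```

Because the voxels are chosen and evaluated on independent portions of the data, an apparent preference that arises only from noise in the selection data is not expected to replicate in the held-out run.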
Participants
Forty-nine individuals (age 19-32, 27 females) participated for payment (Experiment 1: n=22; Experiment 2: n=14; and Experiment 3: n=15; 2 participants overlapped between Experiments 1 and 3). Forty-seven were right-handed, as determined by the Edinburgh handedness inventory (Oldfield, 1971), or self-report; the two left-handed individuals showed typical left-lateralized language activations in the language localizer task (see Willems et al., 2014, for arguments for including left-handers in cognitive neuroscience experiments). All participants were native speakers of English from Cambridge, MA and the surrounding community. One additional participant was scanned (for Experiment 2) but excluded from the analyses due to excessive sleepiness and poor behavioral performance. All participants gave informed consent in accordance with the requirements of MIT’s Committee on the Use of Humans as Experimental Subjects (COUHES).
Design, stimuli, and procedure
Each participant completed a language localizer task (Fedorenko et al., 2010) and a critical task (35 participants performed the localizer task in the same session as the critical task, the remaining 14 performed the localizer in an earlier session; see Mahowald & Fedorenko, 2016, for evidence that localizer activations are highly stable across scanning sessions). Most participants completed one or two additional tasks for unrelated studies. The entire scanning session lasted approximately 2 hours.
Language localizer task. The task used to localize the language network is described in detail in Fedorenko et al. (2010). Briefly, we used a reading task that contrasted sentences and lists of unconnected, pronounceable nonwords in a standard blocked design with a counterbalanced order across runs (for timing parameters, see Table 1). This contrast targets higher-level aspects of language, including lexico-semantic and syntactic/combinatorial processing, to the exclusion of perceptual (speech or reading-related) and articulatory processes (see e.g., Fedorenko & Thompson-Schill, 2014, for discussion). Stimuli were presented one word/nonword at a time. For 19 participants, each trial ended with a memory probe and they had to indicate, via a button press, whether or not that probe had appeared in the preceding sequence of words/nonwords. The remaining 30 participants read the materials passively and performed a simple button-pressing task at the end of each trial (included in order to help participants remain alert). Importantly, this localizer has been shown to generalize across different versions: the sentences > nonwords contrast, and similar contrasts between language and a degraded control condition, robustly activate the fronto-temporal language network regardless of the task, materials, and modality of presentation (Fedorenko et al., 2010; Fedorenko, 2014; Scott et al., 2016).
Timing parameters for the different versions of the language localizer task.
Critical experiments. The key details about all three experiments are presented in Figure 2 (sample stimuli), Figure 3 (trial structure), and Table 2 (partitioning of stimuli into experimental lists and runs). To construct experimental stimuli, we first generated for each experiment a set of “base items” and then edited each base item to create several, distinct versions corresponding to different experimental conditions. The resulting stimuli were divided into several experimental lists following a Latin Square design, such that in each list (i) stimuli were evenly split across experimental conditions, and (ii) only one version of each base item was used. Each participant saw materials from a single list, divided into a few experimental runs. All experiments used an event-related design. Condition orders were determined with the optseq2 algorithm (Dale, 1999), which was also used to distribute inter-trial fixation periods so as to optimize our ability to de-convolve neural responses to different experimental conditions. The materials for all experiments are available from OSF (link to be added).
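The two Latin Square constraints described above (conditions evenly split within a list, and only one version of each base item per list) can be illustrated with a short sketch. The rotation rule, the use of as many lists as conditions, and the function name `latin_square_lists` are assumptions for illustration; the actual partitioning of stimuli into lists is the one given in Table 2. The item count and condition labels follow Experiment 1.

```python
from collections import Counter

def latin_square_lists(n_items, conditions):
    """Build one list per condition; in list k, base item i appears in
    condition (i + k) mod n_conditions, so each list uses exactly one
    version of every base item."""
    n_cond = len(conditions)
    lists = []
    for list_idx in range(n_cond):
        assignment = {
            item: conditions[(item + list_idx) % n_cond]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists

conditions = ["lexico-semantic", "syntactic", "font", "control"]
lists = latin_square_lists(240, conditions)

# Number of items per condition within each list.
per_list_counts = [Counter(lst.values()) for lst in lists]
```

Under this rotation, every base item also appears in each condition exactly once across the set of lists, so no participant sees two versions of the same item.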
Stimulus and Procedure details for each experiment
Sample stimuli for each condition in Experiments 1-3. Two examples are provided for each condition in each experiment. For Experiment 1, the top row shows the beginning of a sentence, and the next rows show different possible continuations. For Experiments 2-3, the top row shows one sentence from a pair, and the next rows show different possibilities for the other sentence in that pair. For Experiment 2, three versions of each base item are presented for illustrative purposes (corresponding to the Lexico-semantic, Syntactic, and Global meaning conditions). However, in the original stimuli set, each base item only had one of these versions (and, thus, belonged to only one of the three conditions). Red: Lexico-semantic condition; Blue: Syntactic condition; Green: other experimental conditions; Black: control condition.
Trial structure for Experiments 1-3. One sample trial is shown for each experiment.
Experiment 1: Lexico-semantic vs. syntactic violations. Participants passively read stimuli, and their expectations were violated in several ways. Base items were 10-word sentences, and each item had four versions (Figure 2) that differed in whether the critical verb (i) resulted in a lexico-semantic violation (stimuli that typically elicit an N400 component in ERP studies; see Kutas & Federmeier, 2011, for a review); (ii) resulted in a morpho-syntactic violation (stimuli that typically elicit a P600 component in ERP studies; e.g., Osterhout & Holcomb, 1992; Hagoort et al., 1993); (iii) was presented in a different font (i.e., a low-level oddball violation, included as a baseline for the two previous conditions); or (iv) contained no violations (control condition). Lexico-semantic violations were created by shuffling the critical verbs across base items. Syntactic violations were created by either omitting a required morpheme (30%) or adding an unnecessary morpheme (70%).
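As a rough illustration of shuffling critical verbs across base items, the sketch below reassigns verbs so that no item keeps its own verb. Whether the original shuffle enforced this constraint, and whether the resulting sentences were additionally checked by hand, is not stated here, so both the constraint and the function name `shuffle_verbs` are assumptions; the verb labels are placeholders rather than actual stimuli.

```python
import random

def shuffle_verbs(verbs, seed=0):
    """Reassign verbs across items so that no item keeps its original verb."""
    if len(verbs) < 2:
        raise ValueError("need at least two items to swap verbs")
    rng = random.Random(seed)
    shuffled = list(verbs)
    # Reshuffle until every position holds a verb from a different item.
    while any(new == old for new, old in zip(shuffled, verbs)):
        rng.shuffle(shuffled)
    return shuffled

# Placeholder labels stand in for the critical verbs of five base items.
original = ["verb_0", "verb_1", "verb_2", "verb_3", "verb_4"]
violating = shuffle_verbs(original)
```

A reassignment of this kind guarantees only that each sentence receives a verb from a different item; it does not by itself guarantee that the resulting sentence is semantically anomalous.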
The materials consisted of 240 base items. They included 139 base items with a sentence-final critical verb, taken (and sometimes slightly edited) from Kuperberg et al. (2003), as well as 61 additional items (to increase power) constructed in a similar manner. Further, to render the timing of violations less predictable, we adapted another 40 base items from Kuperberg et al. such that the critical verb appeared before the final (10th) word: 6 items had the verb in each of the 3rd through 8th positions, and 4 items had it in the 9th position. Critical verbs were not repeated across the 240 base items, with two exceptions (“practice” and “read” were used twice each). For each participant, 10 additional sentences were included in each of the four conditions to serve as fillers. These fillers were followed by a memory-probe task (deciding whether the probe word appeared in the preceding sentence; Figure 3) to ensure that participants paid attention to the task; they were excluded from data analysis.
Experiment 2: Recovery from adaptation to word meanings vs. syntactic structure. Participants were asked to attentively read pairs of sequentially presented sentences and perform a memory probe task at the end of each pair (i.e., decide whether a probe word appeared in either of the two sentences). Base items were pairs of simple transitive sentences consisting of an agent, a verb, and a patient. Because of constraints on these materials (as elaborated below), we constructed three sets of base items (Figure 2): (i) sentence pairs that differed only in lexical items (but had the same syntactic structure and global meaning), created by replacing the verb and the agent and patient noun phrases with synonyms or words closely related in meaning; (ii) pairs that differed only in their syntactic structure (but had the same lexical items and global meaning), created by using the Active / Passive alternation; and (iii) pairs that differed only in the global meaning (but had the same lexical items and syntactic structure), created by switching the two noun phrases, leading to opposite thematic meanings. The third set was included in order to examine sensitivity to overall propositional meaning and to probe combinatorial semantic processing. Overall, there were 432 base items (144 per set).
In each set, each sentence pair {A,B} had six versions (Table 2): sentence A followed by sentence B, and sentence B followed by sentence A (“Critical” condition); sentence A followed by sentence A, and sentence B followed by sentence B (“Same” condition); and, finally, sentence A followed by a completely different sentence X (lexical items, syntactic structure, and global meaning were all different), and sentence B followed by a completely different sentence Y, where the pair {X,Y} was taken from another base item (“Different” condition). Every sentence was used once in the Different condition of some other base item. Therefore, within each of the three sets of base items, every sentence appeared twice in each condition (Critical, Same, Different). Across the three sets, there were overall 5 experimental conditions: Critical Lexico-semantic, Critical Syntactic, Critical Global meaning, Same, and Different. In order to clearly mark the distinctness of the two identical sentences in the Same condition, trials across all conditions included a brief visual mask between the two sentences.
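The six versions of a sentence pair, and the resulting count of two appearances per sentence in each condition, can be checked with a small sketch. The sentence labels are placeholders, and the rule for choosing the pair {X,Y} (borrowing from the next base item, wrapping around) is an assumption made for illustration only.

```python
from collections import Counter

def pair_versions(a, b, x, y):
    """Return the six (first, second, condition) versions for base pair {a, b};
    x and y come from another base item and serve the Different condition."""
    return [
        (a, b, "Critical"),
        (b, a, "Critical"),
        (a, a, "Same"),
        (b, b, "Same"),
        (a, x, "Different"),
        (b, y, "Different"),
    ]

# Toy set of three base items with placeholder sentence labels.
base_items = [("A0", "B0"), ("A1", "B1"), ("A2", "B2")]

trials = []
for i, (a, b) in enumerate(base_items):
    x, y = base_items[(i + 1) % len(base_items)]  # assumed pairing rule
    trials.extend(pair_versions(a, b, x, y))

# Count how often each sentence occurs (in either position) per condition.
counts = Counter()
for first, second, cond in trials:
    counts[(first, cond)] += 1
    counts[(second, cond)] += 1
```

In the Different condition, each sentence occurs once as the first sentence of its own base item and once as the borrowed second sentence of another item, which is what yields the count of two.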
To keep the materials diverse, items in the first two sets were constructed to be evenly distributed among three types of agent-patient relationships: (1) animate agent + inanimate patient; (2) animate agent + animate patient, where the relationship is biased so that one of the noun phrases is much more likely to be the agent (e.g., The hit man killed the politician); and (3) animate agent + animate patient, where the two nouns are equally likely to be the agent (e.g., The protestor quoted the leader). By virtue of the manipulation of global meaning in the third set, all items had to be semantically reversible (i.e., of the third type).
Experiment 3: Same-different meaning judgment on sentences that differ in word meanings vs. syntactic structure. This experiment was adapted from Dapretto & Bookheimer (1999). Participants were asked to decide whether or not a pair of sequentially presented sentences had roughly the same meaning. Base items were 80 sentence pairs, and each pair had four versions (Figure 2; Table 2): two versions in which the sentences differed in a single word (Lexico-semantic condition), replaced by either a synonym (Same meaning version) or a non-synonym (Different meaning version); and two versions (Syntactic condition) in which the sentences were either syntactic alternations differing in both structure and word order (Same meaning version), or in only structure / only word order (Different meaning version). Half of the items used the Active / Passive constructions (as in Dapretto & Bookheimer), and half – the Double Object (DO) / Prepositional Phrase Object (PP) constructions.
A number of features varied and were balanced across stimuli (Figure 2). First, the construction was always the same across the two sentences in the Lexico-semantic condition (balanced between active and passive for the Active / Passive items, and between DO and PP for the DO / PP items). However, in the Syntactic condition, the construction was always different in the Same-meaning version because this is how the propositional meaning was preserved (again, balanced between active and passive for the Active / Passive items, and between DO and PP for the DO / PP items). For the Different-meaning version, the construction could either be the same (in which case the order of the two relevant nouns was switched) or different (in which case the order of the two relevant nouns was preserved). In cases where the construction differed across the two sentences, we balanced whether the first sentence was active vs. passive (for the Active / Passive items), or whether it was DO vs. PP (for the DO / PP items). The second feature that varied across the materials was whether the first-mentioned noun was a name or an occupation noun. All base items contained one instance of each, with order of presentation balanced across stimuli. And third, for the Lexico-semantic condition, we varied how exactly the words in the second sentence in a pair differed from the words in the first. (This does not apply to the Syntactic condition because the content words were identical across the two sentences within each pair.) In particular, for the Active / Passive items, either the occupation noun or the verb could be replaced (by a synonym or a word with a different meaning); and for the DO / PP items, either the occupation noun or the direct object (inanimate) noun could be replaced.
Data acquisition, preprocessing, and first-level modeling
Data acquisition. Whole-brain structural and functional data were collected on a whole-body 3 Tesla Siemens Trio scanner with a 32-channel head coil at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at MIT. T1-weighted structural images were collected in 176 axial slices with 1mm isotropic voxels (repetition time (TR) = 2,530ms; echo time (TE) = 3.48ms). Functional, blood oxygenation level-dependent (BOLD) data were acquired using an EPI sequence with a 90° flip angle and using GRAPPA with an acceleration factor of 2; the following parameters were used: thirty-one 4.4mm thick near-axial slices acquired in an interleaved order (with 10% distance factor), with an in-plane resolution of 2.1mm × 2.1mm, FoV in the phase encoding (A >> P) direction 200mm and matrix size 96 × 96 voxels, TR = 2000ms and TE = 30ms. The first 10s of each run were excluded to allow for steady state magnetization.
Preprocessing. Data preprocessing was carried out with SPM5 (using default parameters, unless specified otherwise) and supporting, custom MATLAB scripts. (Note that SPM was only used for preprocessing and basic modeling – aspects that have not changed much between versions; we used an older version of the SPM software because we have projects that use data collected over the last 15 years, and we want to keep preprocessing and first-level analysis the same across the ~800 subjects in our database, which are pooled in the analyses for many projects. For several datasets, we have directly compared the outputs of data preprocessed and modeled in SPM5 vs. SPM12, and the outputs are nearly identical.) Preprocessing of anatomical data included normalization into a common space (Montreal Neurological Institute (MNI) template) and resampling into 2mm isotropic voxels. Preprocessing of functional data included motion correction (realignment to the mean image of the first functional run using 2nd-degree b-spline interpolation), normalization (estimated for the mean image using trilinear interpolation), resampling into 2mm isotropic voxels, smoothing with a 4mm FWHM Gaussian filter and high-pass filtering at 200s.
Data modeling. For both the language localizer task and the critical tasks, a standard mass univariate analysis was performed in SPM5 whereby a general linear model (GLM) estimated the effect size of each condition in each experimental run. These effects were each modeled with a boxcar function (representing entire blocks/events) convolved with the canonical Hemodynamic Response Function (HRF). The model also included first-order temporal derivatives of these effects, as well as nuisance regressors representing entire experimental runs and offline-estimated motion parameters.
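To make the construction of a single condition regressor concrete, the following sketch convolves a boxcar with a double-gamma response function. It is not a reimplementation of SPM: the shape parameters (6 and 16) and the undershoot ratio (1/6) are assumptions chosen to resemble a typical canonical HRF, the onset and block duration are invented, and temporal derivatives and nuisance regressors are omitted. Only the TR of 2 s comes from the acquisition parameters above.

```python
import math


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t)) / math.gamma(shape)


def double_gamma_hrf(tr, duration=32.0):
    """Double-gamma response sampled every `tr` seconds (assumed parameters)."""
    n = int(duration / tr) + 1
    hrf = [gamma_pdf(i * tr, 6.0) - gamma_pdf(i * tr, 16.0) / 6.0 for i in range(n)]
    total = sum(hrf)
    return [h / total for h in hrf]  # normalize to unit sum


def boxcar(n_scans, tr, onsets, block_dur):
    """1 while a block/event is on, 0 elsewhere (times in seconds)."""
    return [
        1.0 if any(on <= i * tr < on + block_dur for on in onsets) else 0.0
        for i in range(n_scans)
    ]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j, k in enumerate(kernel):
            if i - j >= 0:
                out[i] += signal[i - j] * k
    return out


TR = 2.0  # seconds, as in the acquisition parameters
hrf = double_gamma_hrf(TR)
# Hypothetical design: one 18 s block starting 10 s into a 40-scan run.
regressor = convolve(boxcar(40, TR, onsets=[10.0], block_dur=18.0), hrf)
```

The resulting `regressor` is zero until the block starts, rises over the following scans, and decays after the block ends; in a GLM, one such column per condition would be fit to each voxel's time course.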
Definition of language-responsive functional regions of interest (fROIs)
For each participant (in each experiment), we defined a set of language-responsive functional ROIs using group-constrained, participant-specific localization (Fedorenko et al., 2010). In particular, each individual map for the sentences > nonwords contrast from the language localizer task was intersected with a set of six binary masks. These masks were derived from a probabilistic activation overlap map for the language localizer contrast in a large set of participants (n=220) using the watershed parcellation, as described in Fedorenko et al. (2010), and corresponded to relatively large areas within which most participants showed activity for the target contrast. These masks covered the fronto-temporal language network: three in the left frontal lobe falling within the IFG, its orbital portion, and the MFG, and three in the temporal and parietal cortex (Figure 5). In each mask, a participant-specific language fROI was defined as the top 10% of voxels with the highest t values for the localizer contrast. This top n% approach ensures that fROIs can be defined in every participant and that their sizes are the same across participants, allowing for generalizable results (e.g., Nieto-Castañón and Fedorenko, 2012).
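The top-n% selection step can be sketched as follows, assuming a participant's localizer t values and a binary mask are available as flat lists over voxels. The function name `define_froi` and the toy values are hypothetical.

```python
def define_froi(t_values, mask, top_fraction=0.10):
    """Return sorted indices of the top `top_fraction` of in-mask voxels,
    ranked by t value for the localizer contrast."""
    in_mask = [i for i, inside in enumerate(mask) if inside]
    n_select = max(1, round(len(in_mask) * top_fraction))
    ranked = sorted(in_mask, key=lambda i: t_values[i], reverse=True)
    return sorted(ranked[:n_select])


# Toy example: 22 voxels, of which the first 20 fall inside the mask.
t_values = [0.5, 3.1, 2.2, 4.0, 1.0, 0.2, 5.5, 1.9, 2.8, 0.1, 9.9,
            3.3, 0.7, 1.2, 2.5, 4.4, 0.3, 1.1, 0.9, 2.0, 8.0, 7.0]
mask = [True] * 20 + [False] * 2
froi = define_froi(t_values, mask)
```

Because the number of selected voxels depends only on the mask size, every participant yields an fROI of the same size in a given mask, even if no voxel passes a fixed significance threshold. Voxels outside the mask are never selected, however high their t values.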
Critical analyses
Before examining the data from the critical experiments, we ensured that the language fROIs show the expected signature response (i.e., that the greater response to sentences than nonwords is reliable). To do so, we used an across-runs cross-validation procedure (e.g., Nieto-Castañón & Fedorenko, 2012), where one run of the localizer is used to define the fROIs, and the other run to estimate the responses (e.g., Kriegeskorte et al., 2009).
We then estimated the responses in the language fROIs to the conditions of the critical experiment: the Control condition, Lexico-semantic violations, Syntactic violations, and Font violations in Experiment 1; the Same condition, Different condition, and three Critical conditions (different in lexical items, syntactic structure, or global meaning) in Experiment 2; and the Lexico-semantic and Syntactic conditions (each collapsed across same and different pairs) in Experiment 3. Statistical comparisons were then performed on the estimated percent BOLD signal change (PSC) values.
In Experiment 1, in each region, we first used two-tailed paired-samples t-tests to compare the response to each critical violation (lexico-semantic or syntactic) against a) the Control condition with no violations, and, as an additional baseline, b) the Font violations condition. These results were corrected for the number of regions (six) using the False Discovery Rate correction (Benjamini & Yekutieli, 2001). We then directly contrasted the Lexico-semantic and Syntactic violations conditions. If a brain region is preferentially engaged in lexico-semantic processing, then we would expect to observe a reliably stronger response to the Lexico-semantic violations condition compared to the Control condition and the Font violations condition, and, critically, compared to the Syntactic violations condition. Similarly, if a brain region is preferentially engaged in syntactic processing, we would expect to observe a reliably stronger response to the Syntactic violations condition compared to the Control condition and the Font violations condition, and, critically, compared to the Lexico-semantic violations condition. These results were not corrected for the number of regions because we wanted to give the dissociation between lexico-semantic and syntactic processing the best chance to reveal itself.
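For illustration, a paired-samples t statistic and the Benjamini-Yekutieli adjustment across regions can be sketched as below. The function names and all numbers are hypothetical; the t statistic would still need to be converted to a p-value with a t distribution (n-1 degrees of freedom), which is omitted here.

```python
import math


def paired_t(x, y):
    """Paired-samples t statistic for two equal-length lists."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values in the original order."""
    m = len(p_values)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction term
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-participant responses in one region (two conditions).
t_stat = paired_t([1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 1.0, 2.0])

# Hypothetical uncorrected p-values, one per region (six regions).
raw_p = [0.001, 0.004, 0.01, 0.02, 0.03, 0.2]
adj_p = benjamini_yekutieli(raw_p)
```

With six regions the harmonic term is 2.45, so the smallest raw p-value is multiplied by 6 × 2.45 = 14.7. This makes the procedure more conservative than the Benjamini-Hochberg variant, but it remains valid under arbitrary dependence between tests.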
In Experiment 2, in each region, we first used two-tailed paired-samples t-tests to compare the responses to the Same and Different conditions (a reality check to test for recovery from adaptation in the language regions when all the features of the sentence change). We also compared each of the Critical conditions to the Same condition to test for recovery from adaptation when only one of the features (critically, lexical items or syntactic structure) changes. If a brain region is sensitive to lexical information, then we would expect to observe a reliably stronger response to the Lexico-semantic condition than the Same condition. Similarly, if a brain region is sensitive to syntactic information, then we would expect to observe a reliably stronger response to the Syntactic condition than the Same condition. All of these results were corrected for the number of regions (six) using the False Discovery Rate correction (Benjamini & Yekutieli, 2001). Finally, we directly contrasted the Lexico-semantic and the Syntactic conditions. If a region is preferentially sensitive to lexical information, then the Lexico-semantic condition should elicit a stronger response than the Syntactic condition. Similarly, if a region is preferentially sensitive to syntactic information, then the Syntactic condition should elicit a stronger response than the Lexico-semantic condition. As in Experiment 1, these latter results were not corrected for the number of regions because we wanted to give the dissociation between lexico-semantic and syntactic processing the best chance to reveal itself.
Finally, in Experiment 3, in each region, we first used two-tailed paired-samples t-tests to compare the response to each condition (Lexico-semantic and Syntactic) against the low-level fixation baseline (a reality check to ensure robust responses in the language regions to sentence comprehension). (Note that fixation was used here because, unlike in the other two experiments, there was no other baseline condition following Dapretto & Bookheimer’s (1999) design.) These results were corrected for the number of regions (six) using the False Discovery Rate correction (Benjamini & Yekutieli, 2001). We then directly contrasted the Lexico-semantic and Syntactic conditions. If a brain region is preferentially engaged in lexico-semantic processing, then we would expect to observe a reliably stronger response to the Lexico-semantic condition than the Syntactic condition. Similarly, if a brain region is preferentially engaged in syntactic processing, we would expect to observe a reliably stronger response to the Syntactic condition than the Lexico-semantic condition. As in the other two experiments, these latter results were not corrected for the number of regions because we wanted to give the dissociation between lexico-semantic and syntactic processing the best chance to reveal itself.
One potential concern with the use of language fROIs is that each fROI is relatively large and the responses are averaged across voxels (e.g., Friston et al., 2006). Thus, fROI-based analyses may obscure underlying functional heterogeneity and potential selectivity for one or the other component of language processing. For example, if a fROI contains one subset of voxels that show a stronger response to lexico-semantic than syntactic processing, and another subset of voxels that show a stronger response to syntactic than lexico-semantic processing, we may not detect a difference at the level of the fROI as a whole. To address this concern, we supplemented the analyses of language fROIs with analyses that i) use some of the data from each critical experiment to directly search for voxels – within the same broad masks encompassing the language network – that respond more strongly to lexico-semantic than syntactic processing (i.e., top 10% of voxels based on the Lexico-semantic>Syntactic contrast), or vice versa, and then ii) examine the replicability of this pattern of response in a left-out portion of the data. We performed this analysis for each of Experiments 1-3. If any (even non-contiguous) voxels with reliably stronger responses to lexico-semantic or to syntactic processing exist anywhere within the fronto-temporal language network, this analysis should be able to detect them. For these analyses, we used one-tailed paired-samples t-tests because the hypotheses were now directional. In particular, when examining voxels that show stronger responses to lexico-semantic than syntactic processing to test whether this preference is replicable in left-out data, the critical contrast was Lexico-semantic>Syntactic, and when examining voxels that show stronger responses to syntactic than lexico-semantic processing to test whether this preference is replicable in left-out data, the critical contrast was Syntactic>Lexico-semantic.
These results were not corrected for the number of regions because we wanted to give the dissociation between lexico-semantic and syntactic processing the best chance to reveal itself.
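The logic of selecting voxels in one portion of the data and testing them in a left-out portion can be sketched as follows. This is a simplified leave-one-run-out scheme under assumed inputs: per-run lists of per-voxel responses within one mask. The function names and toy values are hypothetical.

```python
def top_voxels(contrast, fraction=0.10):
    """Indices of the top `fraction` of voxels by contrast value."""
    n_select = max(1, round(len(contrast) * fraction))
    ranked = sorted(range(len(contrast)), key=lambda i: contrast[i], reverse=True)
    return ranked[:n_select]


def held_out_effect(lexsem_runs, syn_runs, fraction=0.10):
    """Mean Lexico-semantic minus Syntactic response in held-out runs.

    For each run, voxels are selected using the contrast averaged over
    the remaining runs, and the effect is estimated in the held-out run
    only, so that selection and estimation never share data.
    """
    n_runs = len(lexsem_runs)
    n_vox = len(lexsem_runs[0])
    fold_effects = []
    for held_out in range(n_runs):
        train = [r for r in range(n_runs) if r != held_out]
        train_contrast = [
            sum(lexsem_runs[r][v] - syn_runs[r][v] for r in train) / len(train)
            for v in range(n_vox)
        ]
        selected = top_voxels(train_contrast, fraction)
        test_effect = sum(
            lexsem_runs[held_out][v] - syn_runs[held_out][v] for v in selected
        ) / len(selected)
        fold_effects.append(test_effect)
    return sum(fold_effects) / n_runs


# Toy data: 2 runs x 10 voxels; voxel 3 prefers the Lexico-semantic condition.
lexsem = [[1, 1, 1, 3.0, 1, 1, 1, 1, 1, 1], [1, 1, 1, 2.5, 1, 1, 1, 1, 1, 1]]
syn = [[1] * 10, [1] * 10]
effect = held_out_effect(lexsem, syn)
```

Swapping the two arguments gives the Syntactic>Lexico-semantic version of the same analysis. If a preference is driven by noise in the selection data, the held-out estimate should fall toward zero, which is why replication in left-out data is informative.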
Results
Behavioral results
Accuracies and reaction times (RTs) in each of the three experiments are summarized in Figure 4. Performance on the memory probe task in the filler trials in Experiment 1 was close to ceiling (between 95.4% and 96.6% across conditions), with no reliable difference between the critical (Lexico-semantic and Syntactic) conditions (t(21)<1, n.s.). In Experiment 2, performance on the memory probe task varied between 72.6% and 95.7% across conditions. As expected, participants were faster and more accurate in the Same condition, where the same sentence was repeated, than in the Different condition, where the two sentences in the pair differed in all respects (ts(13)>15.1, ps<0.001). Furthermore, participants were faster and more accurate in the Syntactic condition than in the Lexico-semantic condition (ts(13)>3.55, ps<0.005), plausibly because the lexico-semantic content was repeated between the two sentences in the pair in the Syntactic, but not the Lexico-semantic, condition. However, the difference was small, with high performance (>89.7%) in both conditions. Finally, in line with the behavioral results in Dapretto & Bookheimer (1999), in Experiment 3, performance on the meaning judgment task did not differ between the Lexico-semantic and Syntactic conditions (t(14)<1, n.s.). In summary, in each of the three experiments, performance was high across conditions, with no reliable differences between the Lexico-semantic and Syntactic conditions in Experiments 1 and 3, and only a small difference in Experiment 2. We can thus proceed to examine neural differences between lexico-semantic and syntactic processing without worrying about those differences being driven by systematic differences in processing difficulty.
Summary of the behavioral results from Experiments 1-3. Significant differences between the lexico-semantic and syntactic conditions are marked with *’s.
Responses in language fROIs to the conditions in Experiments 1-3. Responses are measured as PSC relative to the fixation baseline. Significant differences between lexico-semantic and syntactic conditions are marked with *’s. The brain images show the broad masks used to constrain the selection of the individual fROIs.
fMRI results
Validating the language fROIs. As expected, and replicating prior work (e.g., Fedorenko et al., 2010; Fedorenko et al., 2011; Blank et al., 2016; Mahowald & Fedorenko, 2016), the language fROIs showed a robust sentences > nonwords effect (ts(48)>8.44; ps<0.0001, FDR-corrected for the six regions).
Responses of the language fROIs to the conditions of the critical experiments. The results for the three experiments are summarized in Figure 5 and Table 3.
Responses of language fROIs in Experiments 1-3: mean PSC with standard error (by participants), effect size (Cohen’s d), t-value, and p-value.
Experiment 1. The Lexico-semantic violations condition elicited a reliably stronger response than the Control (no violations) condition in each of the six language fROIs (ts(21)>2.77, ps<0.05), and than the Font violations condition in five of the six fROIs (ts(21)>3.43, ps<0.005), with the MFG fROI not showing a significant effect (t=1.81, p=0.08). The Syntactic violations condition elicited a reliably stronger response than the Control (no violations) condition in two language fROIs: IFGorb and IFG (ts(21)>2.84, ps<0.01). However, the Syntactic violations condition did not reliably differ from the Font violations condition in any of the fROIs (ts(21)<1.45, ps>0.10), suggesting that language regions are not recruited more strongly when people encounter syntactic violations (at least, these local morpho-syntactic violations; but see Mollica et al., submitted) than they are when people encounter low-level perceptually unexpected features in the linguistic input (see also Vissers et al., 2006; van de Meerendonk et al., 2011).
Critically, a direct comparison between the Lexico-semantic and Syntactic conditions revealed reliably stronger responses to the Lexico-semantic condition in all language fROIs except for the MFG fROI (ts(21)>2.12, ps<0.05).
Experiment 2. The Different condition – where the two sentences in a pair differed in lexical items, syntactic structure, and global meaning – elicited a reliably stronger response than the Same condition, where the two sentences in a pair were identical, in four language fROIs: IFG, MFG, AntTemp, and PostTemp (ts(13)>3.45, ps<0.005); the effect was not reliable in the IFGorb and AngG fROIs. Further, the Lexico-semantic condition elicited a response that was reliably stronger than the Same condition in all language fROIs, except for the AngG fROI (ts(13)>2.71, ps<0.05), and the Syntactic condition elicited a response that was reliably stronger than the Same condition in all language fROIs (ts(13)>2.33, ps<0.05), except for the AngG fROI (t=2.16, p=0.05).
Critically, no language fROI showed reliably stronger recovery from adaptation in the Lexico-semantic than Syntactic condition or vice versa (ts(13)<1.44, n.s.).
It is worth noting that, similar to the Lexico-semantic and Syntactic conditions, the Global meaning condition also elicited a response that was reliably stronger than the Same condition in all language fROIs (ps<0.01). This effect provides evidence that language regions are sensitive to subtle differences in complex meanings above and beyond the meanings of individual words (given that the only thing that differs between the sentences in a pair in the Global-meaning condition is word order).
Experiment 3. Both experimental conditions elicited responses that were reliably above the fixation baseline (Lexico-semantic: ts(14)>4.58, ps<0.001; Syntactic: ts(14)>2.66, ps<0.05).
Critically, in two language fROIs – IFGorb and AntTemp – we observed a stronger response to the Lexico-semantic than Syntactic condition (ts(14)>2.27, ps<0.05). No language fROI showed the opposite pattern.
Searching for voxels selective for lexico-semantic or syntactic processing. The results for the three experiments are summarized in Figure 6 and Table 4.
Replicability (in left-out data) of the critical contrasts in Experiments 1-3.
Responses in language fROIs defined by the Lexico-semantic>Syntactic contrast (L-S) or by the Syntactic>Lexico-semantic contrast (S-L) to the critical conditions in Experiments 1-3.
Responses are measured as PSC relative to the fixation baseline. Significant differences between lexico-semantic and syntactic conditions are marked with *’s; suggestive effects (0.05<p<0.10) are marked with *s above tildes.
Experiment 1. When we defined the individual fROIs by the Lexico-semantic>Syntactic contrast, we found a replicable (across runs) Lexico-semantic>Syntactic effect within all the language masks (ts(21)>2.75, ps<0.01), except for the MFG mask, consistent with finding stronger responses for the Lexico-semantic than Syntactic condition in the language fROIs defined by the language localizer contrast. When we defined the fROIs by the Syntactic>Lexico-semantic contrast, we did not find a replicable Syntactic>Lexico-semantic effect within any of the masks.
Experiment 2. Neither the fROIs defined with the Lexico-semantic>Syntactic contrast nor those defined with the Syntactic>Lexico-semantic contrast showed replicable selectivity for lexico-semantic or syntactic processing.
Experiment 3. Similar to Experiment 1, when we defined the individual fROIs by the Lexico-semantic>Syntactic contrast, we found a replicable Lexico-semantic>Syntactic effect within all the language masks, except for the MFG mask (ts(14)>2.14, ps<0.05). When we defined the fROIs by the Syntactic>Lexico-semantic contrast, we found a small Syntactic>Lexico-semantic effect within the PostTemp mask (t=1.84, p<0.05).
Discussion
We conducted three fMRI experiments to search for a dissociation between lexico-semantic and syntactic storage/processing, a distinction that continues to permeate proposals of the neural architecture of language (e.g., Friederici, 2012; Baggio and Hagoort, 2011; Tyler et al., 2011; Duffau et al., 2014; Ullman, 2016). Each used a paradigm from the prior literature that included conditions targeting lexico-semantic vs. syntactic processing. Our results can be summarized as follows. First, as expected given the use of sentence-level materials, both lexico-semantic and syntactic processing elicited robust responses throughout the language network across experiments. Further, in Experiment 1, we found sensitivity to both lexico-semantic and syntactic violations, although the latter did not elicit a response stronger than that elicited by low-level font violations. In Experiment 2, we found that changes in either the lexical items or syntactic structure led to recovery from adaptation. Second, in Experiments 1 and 3, we found a bias (i.e., stronger responses) for lexico-semantic processing across language fROIs (defined using a language localizer; Fedorenko et al., 2010), with no fROIs showing the opposite pattern (i.e., stronger responses for syntactic processing). And third, when we searched for the most lexico-semantic-selective and syntactic-selective voxels, we again found replicably stronger responses to lexico-semantic than syntactic processing (in Experiments 1 and 3), but not the opposite pattern (except a small effect in one region in Experiment 3).
The (in-)separability of lexico-semantic and syntactic processing
As discussed in the introduction, two distinctions have been prominent in discussions of human linguistic architecture: a distinction between the lexicon and grammar, and a distinction between linguistic knowledge and its online access and combinatorial processing (e.g., Fig. 1). Current linguistic theorizing, psycholinguistic evidence, and computational modeling work all suggest tight integration between the lexicon and grammar for both knowledge representations and processing (see e.g., Snider & Arnon, 2012, for a review). Yet, all prominent proposals of the neural architecture of language (e.g., Friederici, 2012; Baggio and Hagoort, 2011; Tyler et al., 2011; Duffau et al., 2014; Ullman, 2016) continue to postulate a separate syntactic/combinatorial component. This component is argued to store our grammatical knowledge, and/or access structural representations from memory, and/or combine words, phrases, and clauses in the course of sentence comprehension, but critically not to support the storage or processing of individual word meanings, as illustrated in the architectures in Fig. 1a-d.
Any proposal that postulates a distinct syntactic/combinatorial component predicts that the brain region(s) associated with this component should exhibit a functional profile different from other regions of the language network (e.g., from regions that support the storage/processing of individual word meanings). As discussed in the Introduction, some neural evidence had already put into question the existence of such a component. In particular, both lateral temporal and inferior frontal language regions are robustly sensitive to both individual word meanings (e.g., they respond more strongly to real words than nonwords) and syntactic/combinatorial information (e.g., they respond more strongly to structured representations, like phrases/sentences, than lists of unconnected words, and even to meaningless Jabberwocky sentences compared to lists of nonwords; e.g., Keller et al., 2001; Fedorenko et al., 2010; Pallier et al., 2011; Bedny et al., 2011; Mollica et al., submitted), including when using temporally-sensitive methods like ECoG (Fedorenko et al., 2016; Nelson et al., 2017). In line with those earlier studies that manipulated the presence/absence of different kinds of information in the signal, here – across three paradigms each of which included a condition targeting lexico-semantic vs. syntactic processing more strongly, and using robust individual-subject analyses – we found that all language regions are sensitive to both kinds of manipulations, and no region shows stronger responses to syntactic manipulations than lexico-semantic ones. In other words, no language region, or even a set of non-contiguous voxels anywhere within the language network, shows a response profile consistent with selective or preferential engagement in syntactic/combinatorial processing.
How do we reconcile our results with prior neuroimaging studies that have reported a dissociation between lexico-semantic and syntactic processing? Most prior studies suffer from limitations that undermine their conclusions. First, many prior studies have relied on observing an effect (e.g., sensitivity to syntactic processing) in brain region x, and not brain region y, to argue that the former but not the latter region supports the relevant mental process. Such reasoning has been used to argue that syntactic processing is localized to one particular region within the language network (see Blank et al., 2016, for discussion). Similarly, to argue that some brain region is sensitive to one manipulation (e.g., a syntactic one) but not another (e.g., a lexico-semantic one), prior studies have relied on observing a reliable effect for the former but not the latter. However, inferences of this kind are not valid (Nieuwenhuis et al., 2011). For example, to argue that one region and not another is sensitive to the manipulation of interest, a region by condition interaction is required, and such tests are rarely, if ever, reported. Second, some of the earliest studies (e.g., Dapretto & Bookheimer, 1999) appear to have relied on fixed-effects analyses, which means the results cannot be generalized beyond the sample tested (e.g., Holmes & Friston, 1998). Indeed, our recent attempt to replicate Dapretto & Bookheimer’s study was not successful (Siegelman et al., 2017). Third, to the best of our knowledge, all prior studies that have argued for a dissociation between lexico-semantic and syntactic processing have each relied on a single paradigm. However, to compellingly argue that a brain region is selective for one or the other mental process, it is important to generalize beyond a single paradigm to ensure that the effects are not driven by paradigm-specific between-condition differences.
In summary, we argue that no prior study has convincingly established that some language region selectively supports lexico-semantic processing, whereas some other language region selectively supports syntactic processing. In the current experiments, we also did not observe such a dissociation across three paradigms, in spite of our use of sensitive analysis methods (searching for selectivity within individual participants, which may have been missed in prior group studies or meta-analyses).
Two other lines of research deserve discussion. First, the early ERP literature on language processing appeared to have provided evidence of distinct components associated with lexico-semantic processing (N400; Kutas & Hillyard, 1980) vs. with syntactic processing (P600; Osterhout & Holcomb, 1992; Hagoort et al., 1993). However, the interpretation of the P600 as an index of syntactic processing has long been challenged (e.g., Coulson et al., 1998), and the current dominant interpretation of this component is as a domain-general error detection or correction signal (e.g., Kolk & Chwilla, 2007; Vissers et al., 2007; van de Meerendonk et al., 2010). The N400 component plausibly arises within the language-selective network (e.g., Lau et al., 2008), in line with the bias for lexico-semantic processing we observed in the current study and earlier studies.
And second, syntactic priming – the re-use of a syntactic frame based on recent linguistic experiences (Bock, 1986; see Pickering and Ferreira, 2008; Branigan & Pickering, 2016 for reviews) – has often been cited as evidence of abstract syntactic representations independent of meaning (e.g., Bock & Loebell, 1990), including in relatively recent cognitive neuroscience papers (Pallier et al., 2011). However, a large body of work has now established that the effect is strongly modulated by lexical overlap (e.g., Mahowald et al., 2016; Scheepers et al., 2017) and driven by the meaning-related aspects of the utterance (e.g., Hare & Goldberg, 1999; Cai et al., 2012; Ziegler & Snedeker, 2018; Ziegler et al., 2018).
Finally, it is worth saying a few words about language production. In the current study, we focused on language comprehension. Although we plausibly access the same knowledge representations to interpret and generate linguistic utterances, the computational demands of language production differ substantially from those of language comprehension. In particular, the goal of comprehension is to infer the intended meaning from the linguistic signal, and abundant evidence now suggests that the representations we extract and maintain during comprehension are probabilistic and noisy (e.g., Ferreira et al., 2002; Levy et al., 2008; Gibson et al., 2013). In contrast, in production, the goal is to express a particular meaning, and to do so, we have to utter a precise sequence of words where each word takes a particular morpho-syntactic form, and the words appear in a particular order. This pressure for linearization of words, morphemes, and sounds might lead to a clearer temporal, and perhaps spatial, segregation among the different stages of the production process. Indeed, recent evidence from intracranial stimulation suggests that a small region in posterior superior temporal cortex may be selective for encoding and enacting morpho-syntactic inflections (Lee et al., 2018; see Fedorenko et al., 2018, for further discussion). It is therefore possible that some aspects of language production are implemented in focal and functionally selective regions. However, this conjecture remains to be evaluated further in future work.
The bias toward lexico-semantic processing
In two of our experiments, lexico-semantic conditions elicited numerically, and sometimes reliably, stronger responses than syntactic conditions. In contrast, no language fROIs showed consistently (across paradigms) stronger responses during syntactic than lexico-semantic processing. This result is in line with two prior findings. First, using multivariate analyses, we have previously found that lexico-semantic information is represented more robustly than syntactic information in the language system (Fedorenko et al., 2012). In particular, pairs of conditions that differ in whether or not they contain lexico-semantic information (e.g., sentences vs. Jabberwocky sentences, or lists of words vs. lists of nonwords) are more robustly dissociable in the fine-grained patterns of activity than pairs of conditions that differ in whether or not they are structured (e.g., sentences vs. lists of words, or Jabberwocky sentences vs. lists of nonwords). Further, Frankland & Greene (2015; see also Wang et al., 2016) found that activation patterns in temporal cortex distinguish thematic roles (agent/patient) but not grammatical positions (subject/object). And second, in ECoG, we observed reliably stronger responses to conditions that only contain lexico-semantic information (word lists) than conditions that only contain syntactic information (Jabberwocky) in many language-responsive electrodes (Fedorenko et al., 2016), but no electrodes showed the opposite pattern. Along with the current study, these results demonstrate that the magnitude and spatial organization of responses in the human language network are determined more by meaning than structure. Thus, language mechanisms may be primarily concerned with extracting meaning from the linguistic signal (see also Mollica et al., submitted).
This bias toward lexico-semantic processing fits with the view that the goal of language is communication, i.e., the transfer of meanings (e.g., Hurford, 1998, 2007), and with the fact that most information in language is carried by content words rather than structural information (e.g., Shannon & Weaver, 1949; Mollica & Piantadosi, submitted). And it is not consistent with syntax-centric views of language (e.g., Chomsky and DiNozzi, 1972; Pinker, 1995; Hauser et al., 2002; Friederici et al., 2006; Berwick et al., 2013; Friederici et al., 2017; Friederici, 2018). One important implication of these, and earlier behavioral, results is that artificial grammar learning and processing paradigms (e.g., Reber, 1967) – where structured sequences of meaningless units (e.g., syllables) are used in an attempt to approximate human syntax (e.g., Friederici et al., 2006; Petersson et al., 2010; Wang et al., 2015) – may have limited utility for understanding human language, given that syntactic representations and processing seem to be inextricably linked with representations of linguistic meaning (see also Fedor et al., 2012).
Beyond lexico-semantic and syntactic processing
Language processing encompasses a broad array of computations in both comprehension and production. Here, we have argued that during language comprehension the same mechanisms process the meanings of individual words and the structure of sentences. However, other aspects of language are clearly dissociable. For example, lower-level speech perception and reading processes as well as speech production (articulation) recruit areas that are robustly distinct from the high-level areas that we focused on here. In particular, speech perception recruits parts of the auditory cortex in the superior temporal gyrus and sulcus (e.g., Scott et al., 2000; Mesgarani et al., 2014; Overath et al., 2015), and these areas are highly selective for speech over many other types of auditory stimuli (Norman-Haignere et al., 2015). Reading recruits a small area on the ventral surface of the temporal lobe (see McCandliss et al., 2003, for a review), and this “visual word-form area” is highly selective for letters in a familiar script over a broad range of other visual stimuli (Baker et al., 2007; Hamame et al., 2013). And articulation draws on a set of areas, including portions of the precentral gyrus, supplementary motor area, inferior frontal cortex, superior temporal cortex, and cerebellum (e.g., Wise et al., 1999; Bohland and Guenther, 2006; Eickhoff et al., 2009; Basilakos et al., 2017). Moreover, discourse-level processing appears to draw on areas distinct from those that support word and sentence-level comprehension (e.g., Ferstl & von Cramon, 2001; Lerner et al., 2011; Jacoby et al., 2018), and aspects of non-literal language have been argued to draw on brain regions in the right hemisphere (e.g., Joanette et al., 1990). Thus, many aspects of language are robustly dissociable, in line with distinct patterns of deficits reported in the aphasia literature (e.g., Goodglass, 1993). However, lexico-semantic and syntactic processing do not appear to be separable during language comprehension.
Conclusions
To conclude, across three fMRI experiments, we found robust responses to both lexico-semantic and syntactic processing throughout the language network, with generally stronger responses to lexico-semantic processing, and no regions, or even sets of non-contiguous voxels within those regions, that respond reliably more strongly to syntactic processing. These results constrain the space of possible neural architectures of language. In particular, they rule out architectures that postulate a distinct region (or set of regions) that selectively supports syntactic/combinatorial processing (i.e., architectures shown in Fig. 1a-d). These constraints on neural architectures, in turn, inform cognitive theories. Of course, the lack of regional, or even voxel-level, dissociations between mental processes need not imply cognitive inseparability – there may exist neuronal assemblies selective for syntactic or combinatorial storage/processing at the sub-voxel level (e.g., the architecture in Fig. 1e). However, to the extent that prior fMRI studies have been taken as establishing syntax-selective mechanisms, those findings do not seem to be robust given our data.
We have here focused on cortical mechanisms. Might syntax selectivity be present at the level of white-matter tracts? Indeed, some have argued that the arcuate / superior longitudinal fasciculus (the dorsal tracts connecting posterior temporal and inferior frontal language areas) may be selective for syntactic processing (e.g., Friederici, 2009; Brauer et al., 2011; Papoutsi et al., 2011; Wilson et al., 2011). However, others have implicated this tract in non-syntactic linguistic computations, including, most commonly, articulation (e.g., Duffau et al., 2003; Hickok & Poeppel, 2007; Rauschecker & Scott, 2009), but also aspects of semantic processing (e.g., Glasser & Rilling, 2008) and even reading (Yeatman et al., 2011). Thus, at present, no clear evidence of syntax selectivity exists for white-matter tracts either.
In summary, taking all the available data into consideration, it seems that a cognitive architecture whereby the processing of individual word meanings is not separable from syntactic processing is most likely. The fact that connectionist networks, especially deep neural nets, have been shown to achieve remarkable performance on a wide variety of tasks (e.g., Mikolov et al., 2010; Sutskever et al., 2014; Bahdanau et al., 2016), including those that involve complex syntactic phenomena (e.g., Linzen et al., 2016; Gulordava et al., 2018; Futrell et al., 2018), may be taken to further support this kind of architecture.
Author contributions
EF conceived and designed the study. All authors collected the data and performed data analyses. MS and IB created the figures. EF drafted the manuscript and IB provided critical revisions, with MS and ZM providing additional comments. All authors approved the final version of the manuscript.
Conflict of interest
The authors declare no competing financial interests.
Acknowledgements
We would like to acknowledge the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at MIT, and its support team (Steve Shannon, Atsushi Takahashi, and Sheeba Arnold). We thank former and current EvLab members (especially Zuzanna Balewski and Brianna Pritchett) for their help with data collection, Gina Kuperberg for providing the materials used in adapted form in Experiment 1, Michael Behr for creating the script for Experiment 1, Zuzanna Balewski for help with creating the materials and script for Experiment 2, Nancy Kanwisher for discussions of the experimental design for all three experiments, and Ted Gibson, Adele Goldberg, Inbal Arnon, Jayden Ziegler, and Yonatan Belinkov for comments on the manuscript. We also thank the audience at the 2017 CUNY Sentence Processing conference (Cambridge, MA) for feedback. EF was supported by award R01-DC-016607 from NIH and by a grant from the Simons Foundation to the Simons Center for the Social Brain at MIT.