Abstract
The functional neuroimaging literature has become increasingly complex and thus difficult to navigate. This complexity arises in part from the rate at which new studies are published, but also from heterogeneous terminology that varies widely from study to study and even more so from discipline to discipline. Therefore, it is important to clearly identify the primary research domains of neuroimaging and their most commonly reported brain regions. To do so, we analyzed the multivariate semantic structure of abstracts in Neurosynth and found six primary domains of the functional neuroimaging literature, each with its own preferentially reported brain regions. Furthermore, these domains appear to be influenced by time (because of, e.g., changes in terminology or in the “popularity” of domains). Finally, we note that our techniques and results form the basis of a “recommendation engine” that could help navigate the neuroimaging literature.
Introduction
Because terminology varies widely from study to study, and even more so from discipline to discipline, the neuroimaging literature is particularly difficult to synthesize. For example: (1) terminology changes over time (e.g., alcoholism to alcohol use disorders), (2) a single term can have many (or even amorphous) definitions (e.g., MVPA as multi-voxel or multivariate pattern analysis, which itself spans numerous different techniques), and (3) multiple terms describe the same concept (e.g., in vision studies “striate cortex,” “calcarine sulcus,” “V1,” “primary visual cortex,” and “Brodmann area 17” all describe, essentially, the same brain region in functional neuroimaging). Such a diversity of terminologies makes interpretations, meta-analyses, and even reviews of the literature difficult to perform and consume.
In recent years, several meta-analytic approaches, such as coordinate-based meta-analysis (CBMA), have been developed that link reported brain activations with keywords and topics. CBMA was specifically developed for aggregating and synthesizing neuroimaging data reported in a standard format (P. T. Fox et al. 2014). Some of the most prominent CBMA tools used in research are BrainMap (Laird et al. 2005), SumsDB (Van Essen et al. 2009), Brede (Nielsen 2003), and Neurosynth (Yarkoni et al. 2011), the database of interest in this paper. The main functionalities of many CBMA tools are to (a) store coordinate information by study, and (b) provide spatially consistent meta-analytic activation maps. For example, Nielsen and colleagues (Nielsen et al. 2004) analyzed 121 neuroimaging papers with 2,655 reported activation loci using probability models followed by a non-negative matrix factorization-latent semantic analysis to associate brain coordinates with the words used in the papers’ abstracts (e.g., “pain” was strongly associated with the anterior cingulate). More recently, Poldrack and colleagues (Poldrack et al. 2012) analyzed more than 5,800 papers to model the associations between topics derived from the full text of studies and the reported peak coordinates via “topic mapping.” Poldrack and colleagues showed that, with a “topic mapping” approach to semantic analysis, CBMA could reveal new relationships between brain activation and cognitive processes or psychiatric disorders (for different flavors of CBMA, see e.g., Rubin et al. 2016; de la Vega et al. 2016). Furthermore, some extensions of CBMA can link additional data types (e.g., gene expression) with brain regions and keywords (Mesmoudi et al. 2015; A. Fox et al. 2014; Gorgolewski et al. 2014; Rizzo et al. 2016).
Although the main functionalities of many CBMA tools are to (a) store coordinate information by study, and (b) provide meta-analytic activation maps (often based on terminology usage, e.g., which regions are most associated with “vision” or “anger”), CBMA tools fall short of revealing the primary domains of neuroimaging and the brain regions most associated with these domains (even though CBMA tools are designed exactly for that goal, i.e., meta-analyses).
In this work, we define the primary domains of neuroimaging based on the semantics of the literature (i.e., abstracts), an approach that is a necessary step towards the definition of formal brain or cognitive ontologies. Our study is designed to achieve three major goals: (1) define the “semantic space” of the neuroimaging literature, (2) identify semantically defined domains within the literature, and (3) map brain activations onto these domains. To do so, we used correspondence analysis (CA), a technique similar to principal components analysis (PCA) that was originally created for analyzing the co-occurrences of words in a corpus (Lebart et al. 1998; Benzecri 1976; Escofier-Cordier 1965; Abdi & Béra 2014), to identify neuroimaging domains from co-occurrences of words in the neuroimaging literature as identified in the Neurosynth database (Yarkoni et al. 2011).
First, we applied CA to a 10,898 studies × 3,114 words matrix; because CA on this matrix generates thousands of components, we used split-half resampling (SHR) to identify the most reliable and replicable components. Next, we applied hierarchical clustering (HC) within the subspace (i.e., components) identified by SHR to identify the primary subdomains in functional neuroimaging. We then investigated how these clusters change over time. Finally, we generated brain maps (in MNI space) conditional to both the components and clusters, which highlight the brain regions most commonly associated with the components and clusters we identified. We also include a comparison of our brain maps with recent maps from Yeo et al. (2015).
METHODS
Data and preprocessing
Neurosynth is an open source and open science initiative, hosted via the website www.neurosynth.org, that facilitates meta-analyses and reviews of the functional neuroimaging literature (Yarkoni et al. 2011). Neurosynth, at the time of this writing, contains more than 11,406 articles from the functional neuroimaging literature. When we began this work, Neurosynth contained 10,903 articles (from 43 journals), which were the basis of this study. As an aside, some articles in our data set no longer appear in Neurosynth because Neurosynth periodically updates its content for exclusion (e.g., to remove structural-only studies) and public release. See http://github.com/neurosynth/neurosynth-data and http://www.neurosynth.org/ for details.
Neurosynth uses automated webcrawlers to fetch data (e.g., abstract text, peak coordinates) and metadata (e.g., PubMed ID, title, year published, journal) of neuroimaging studies. For our study, we created and analyzed two data tables derived from Neurosynth data: (1) a “studies × words” matrix and (2) a “studies × voxels” matrix, where studies are identified by their PubMed ID (PMID) number. To achieve our three goals, our study comprised three major parts that correspond to each goal, wherein each major part has several steps. All analyses were conducted with a variety of publicly available packages (noted in relevant sections) or in-house scripts written in MATLAB (MathWorks, Natick, MA), R (R Core Team 2014), and Python (Python Software Foundation, https://www.python.org/) languages and environments.
To create a studies-by-words matrix for analysis, we acquired information from Neurosynth and PubMed (http://www.ncbi.nlm.nih.gov/pubmed/). With PMIDs from the Neurosynth database, we obtained from PubMed the text of all abstracts associated with the studies in the Neurosynth database. Next, we used the tm package in R (Feinerer 2011) to conduct several preprocessing steps that were used in previous works (Ailem et al. 2016) and that consisted of: (1) converting all text to lower case, (2) removing all punctuation, numbers, emails, and web addresses, (3) removing all words of length one or two, (4) removing stop words, meta-words, and words that describe numbers, quantities, nationalities, cities, or names (e.g., “publisher,” “article,” “date,” “ten,” “zero,” “weeks,” “european,” “montreal,” “welcome”), (5) converting British English to American English (e.g., “behaviour” to “behavior”), and finally (6) stripping out white spaces. Once the data were cleaned, some words with different meanings were updated so they did not have the same “stems.” For example, “posit,” “positive,” “positively,” “position,” and “positioning” would all correspond to the same stem “posit”; therefore, some words were altered so they would only have the same stem if they had (in general) the same meaning. In a final step, we went through all remaining words individually to identify words that were potentially missed in the previous steps (e.g., “science,” “academic,” “publishing”). With a final set of words, we created a matrix of studies (rows) by words (columns). Each cell of this matrix contained the frequency of a word used in the abstract of a study. For example, the abstract of PMID 17360197 used the word “cold” 28 times. Finally, the studies-by-words matrix went through one more cleaning step: we eliminated infrequent words. Words with frequencies below the 3rd quartile (in our case: fewer than 16 occurrences) were removed.
This step was followed by the removal of two studies that had withdrawal notices from the publisher. The final studies-by-words matrix contained 10,898 studies and 3,114 words (the full data table is provided in Supplemental Material and at https://github.com/fahd09/neurosynth_semantic_map).
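The cleaning and counting steps above can be sketched in a few lines. This is only an illustrative sketch in Python, not the pipeline used in the study (which relied on the tm package in R plus manual curation): the function names `tokenize` and `studies_by_words`, the tiny `STOP_WORDS` set, and the `min_total_count` parameter are hypothetical, and stemming and British-to-American conversion are omitted.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "and", "was", "were", "with", "for", "that", "this"}

def tokenize(abstract):
    """Lower-case, drop e-mails/web addresses, punctuation, digits, short words, stop words."""
    text = abstract.lower()
    text = re.sub(r"\S+@\S+|https?://\S+|www\.\S+", " ", text)  # e-mails and web addresses
    text = re.sub(r"[^a-z\s]", " ", text)                       # punctuation and numbers
    return [w for w in text.split() if len(w) > 2 and w not in STOP_WORDS]

def studies_by_words(abstracts, min_total_count=1):
    """Build a studies (rows) x words (columns) count matrix from {pmid: abstract}."""
    counts = {pmid: Counter(tokenize(text)) for pmid, text in abstracts.items()}
    totals = Counter()
    for per_study in counts.values():
        totals.update(per_study)
    # Keep only words whose corpus-wide frequency reaches the threshold.
    vocab = sorted(w for w, n in totals.items() if n >= min_total_count)
    pmids = sorted(counts)
    matrix = [[counts[p][w] for w in vocab] for p in pmids]
    return pmids, vocab, matrix
```

In this sketch the frequency cut-off is passed in directly; in the study it was derived from the 3rd quartile of word frequencies.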
Correspondence Analysis
Because our data are numbers of occurrences (i.e., counts), we decided to use correspondence analysis (CA), a technique designed specifically to analyze co-occurrences and often described as a version of PCA tailored for qualitative data. Like PCA, CA decomposes a matrix into orthogonal components rank-ordered by explained variance (Greenacre 1984; Abdi & Béra 2014; Lebart et al. 1984). CA is a bi-factor analysis that analyzes the relationship between the rows and the columns of a contingency table. Each row (study) and column (word) item is assigned a “component score” (a.k.a. factor score) that reflects the amount of variance this item contributes to a given component. CA places emphasis on rare items so that they contribute a high amount of variance, while frequent items contribute little variance (see Greenacre 2017); this is particularly useful for our study because frequent words (e.g., “brain”) will be near the origin (i.e., zero) of the components whereas rare words (e.g., “polymorphism”) will be far from the origin and thus are the sources of variance for the components. CA operates under the same assumptions as the χ² test; that is, it analyzes how far the observed data deviate from independence. In addition, because both rows and columns are represented in the same space (with the same variance), we can interpret the relationships within row items and within column items as well as the relative relationships between row and column items. Finally, because we wanted to identify brain regions most associated with semantically defined domains, we used a technique called supplementary projection (Greenacre 2017; Abdi & Béra 2014) that allows us to predict a supplementary data set (i.e., studies × voxels) from the component structure of the active data set (i.e., studies × words).
We applied CA to the studies-by-words matrix to define the semantic space of the neuroimaging literature. We used in-house MATLAB code, as well as the ExPosition (Beaton et al. 2014) and ggplot2 (Wickham 2009) packages in R, to perform CA and the additional analyses (i.e., visualizations, resampling-based inference tests, clustering, and supplementary projections; see following sections).
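For readers who want to see the mechanics, CA can be written as a singular value decomposition of the standardized deviations from independence. The following is a minimal NumPy sketch under that textbook formulation, not the in-house MATLAB or ExPosition code used here; the function name `correspondence_analysis` is hypothetical, and the sketch assumes that no row or column of the count matrix sums to zero.

```python
import numpy as np

def correspondence_analysis(N):
    """Textbook CA of a count matrix N (rows = studies, columns = words).

    Returns row scores F, column scores G (principal coordinates), and eigenvalues.
    Assumes every row and every column has a non-zero total.
    """
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # values expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U * sv) / np.sqrt(r)[:, None]   # component scores for rows
    G = (Vt.T * sv) / np.sqrt(c)[:, None]    # component scores for columns
    return F, G, sv ** 2                 # eigenvalues = inertia per component
```

The sum of the eigenvalues multiplied by the grand total of the table equals the χ² statistic of independence, which is one way to check an implementation; at most min(rows, columns) − 1 eigenvalues are non-zero.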
Split-half Resampling
Split-half resampling (SHR; Strother et al. 2002; Churchill et al. 2012) is a cross-validation (CV) technique that evaluates the stability of the results of a statistical analysis performed on a data set by randomly splitting this data set into two (approximately) equal-sized, non-overlapping data sets, and then performing the same analysis on each data set. The similarity (e.g., correlation) between the results obtained from these two data sets is then used to evaluate the reliability of the results (i.e., replicable effects). SHR is performed many times to create a distribution of reliability estimates.
We used SHR to identify the most replicable components in two ways: (1) split the data by rows, and (2) split the data by columns; in both approaches, we performed CA on each split set, and then computed the absolute correlation¹ between the component scores of each split. SHR was performed 1,000 times to create a distribution of (absolute) correlations between components for both (1) the row component scores conditional to the columns, and (2) the column component scores conditional to the rows. We then computed the average (absolute) correlations to detect which components (after 1,000 splits) were most replicable between splits in order to identify a low-rank approximation of the semantic space (i.e., component selection via SHR).
¹ We used absolute correlation because there can be trivial sign flips between sub-samples of data, so the sign is irrelevant but the magnitude of the correlation is relevant.
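One of the two splitting schemes can be sketched as follows. This is a simplified illustration, not the code used in the study: it assumes that when studies (rows) are split, the word (column) scores of the two halves are the quantities being correlated, and the function names, the `n_comp`, `n_splits`, and `seed` parameters are hypothetical. Every study is assumed to contain at least one retained word.

```python
import numpy as np

def ca_column_scores(N, n_comp):
    """Column (word) component scores from a textbook CA of the count matrix N."""
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    return (Vt.T * sv)[:, :n_comp] / np.sqrt(c)[:, None]

def split_half_reliability(N, n_comp=3, n_splits=100, seed=0):
    """Mean absolute correlation between word scores of two random halves of the studies.

    Entry [k, l] compares component k of one half with component l of the other.
    """
    rng = np.random.default_rng(seed)
    N = np.asarray(N, dtype=float)
    n_rows = N.shape[0]
    total = np.zeros((n_comp, n_comp))
    for _ in range(n_splits):
        perm = rng.permutation(n_rows)
        A, B = N[perm[: n_rows // 2]], N[perm[n_rows // 2:]]
        # Guard: only keep words that occur in both halves.
        keep = (A.sum(axis=0) > 0) & (B.sum(axis=0) > 0)
        G_a = ca_column_scores(A[:, keep], n_comp)
        G_b = ca_column_scores(B[:, keep], n_comp)
        # Cross-correlation block between components of the two halves.
        corr = np.corrcoef(G_a.T, G_b.T)[:n_comp, n_comp:]
        total += np.abs(corr)   # absolute value: component signs are arbitrary
    return total / n_splits
```

A replicable component shows a high value on the diagonal of the returned matrix; the column-wise split is the same procedure applied to the transposed table.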
Clustering of Studies and Assignment of Words
We performed hierarchical clustering (HC), with squared Ward linkage (Murtagh & Legendre 2014), on the component scores for the studies (rows). We chose squared Ward linkage because its objective function minimizes the error sums of squares (and thus aims for an optimal ANOVA-like configuration). The component scores take into account the explained variance per component (i.e., Component 1 explains more variance than Component 2). After HC, we performed cluster stability analysis via the Calinski-Harabasz index (Calinski & Harabasz 1974) in order to identify a stable number of clusters. Once studies were divided into clusters, we used distance-based classification to assign each word (column) to a study cluster. Hierarchical clustering and cluster stability analysis were conducted in R via hclust and clusterCrit (Desgraupes 2015), respectively.
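The clustering and word-assignment steps can be sketched in Python as below. This is an approximation of the procedure rather than the R code used in the study: SciPy's `ward` method stands in for hclust and may not match the exact Ward variant, the Calinski-Harabasz index is computed by hand instead of with clusterCrit, and the nearest-barycenter rule in `assign_words` is one assumed reading of “distance-based classification.” All function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def calinski_harabasz(X, labels):
    """Ratio of between- to within-cluster dispersion, each scaled by its degrees of freedom."""
    ids = np.unique(labels)
    n, k = X.shape[0], len(ids)
    grand = X.mean(axis=0)
    between = within = 0.0
    for lab in ids:
        members = X[labels == lab]
        centroid = members.mean(axis=0)
        between += len(members) * np.sum((centroid - grand) ** 2)
        within += np.sum((members - centroid) ** 2)
    return (between / (k - 1)) / (within / (n - k))

def cluster_studies(F, k_range=range(2, 11)):
    """Ward clustering of study scores F; pick the cut with the highest index."""
    Z = linkage(F, method="ward")
    scored = {}
    for k in k_range:
        labels = fcluster(Z, t=k, criterion="maxclust")
        scored[k] = (calinski_harabasz(F, labels), labels)
    best_k = max(scored, key=lambda k: scored[k][0])
    return best_k, scored[best_k][1]

def assign_words(G, F, labels):
    """Assign each word (row of G) to the study cluster with the nearest barycenter."""
    ids = np.unique(labels)
    centroids = np.stack([F[labels == i].mean(axis=0) for i in ids])
    d = ((G[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return ids[d.argmin(axis=1)]
```

Here `F` and `G` would be the study and word component scores restricted to the replicable components.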
Producing Brain Maps
Activation maps are represented in Neurosynth as peak activations of individual studies, each taken as the center of a sphere with a radius of 6 mm. Voxels inside the sphere have a value of 1 and the other voxels have a value of 0. The studies-by-voxels matrix then uses a vectorized (flattened) version of the peak activation maps with reference to a 3D brain. The studies-by-voxels matrix initially contained 10,898 studies and 228,453 voxels (i.e., voxels within MNI space). For our analyses, infrequently reported voxels (i.e., reported in less than 10% of studies) were removed. The final studies-by-voxels matrix contained 10,898 studies and 206,077 voxels. We computed two different brain activation maps from the semantic space. The first type of activation map was a component-wise map. Brains were projected onto (i.e., predicted by) the semantic space (per replicable component) via supplementary projections. The second type of activation map was simply the sum of peak activations per study cluster.
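The binary sphere encoding and the cluster-wise sums can be illustrated with a toy grid. This sketch is not Neurosynth's implementation: it assumes a simplified grid in which voxel index times voxel size gives millimeter coordinates (no MNI affine), and the names `study_map`, `cluster_maps`, and `voxel_size_mm` are hypothetical.

```python
import numpy as np

def study_map(peaks_mm, shape, voxel_size_mm=2.0, radius_mm=6.0):
    """Binary 3D map: 1 for voxels within radius_mm of any reported peak, else 0."""
    axes = [np.arange(s) for s in shape]
    # Millimeter coordinates of every voxel center (simplified: index * voxel size).
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1) * voxel_size_mm
    out = np.zeros(shape, dtype=np.uint8)
    for peak in np.atleast_2d(peaks_mm):
        inside = np.linalg.norm(grid - peak, axis=-1) <= radius_mm
        out |= inside.astype(np.uint8)
    return out

def cluster_maps(studies_by_voxels, labels):
    """Sum the flattened binary study maps within each study cluster."""
    return {lab: studies_by_voxels[labels == lab].sum(axis=0)
            for lab in np.unique(labels)}
```

Each study's map is flattened with `ravel()` to form one row of the studies-by-voxels matrix, and each cluster map can be reshaped back to the 3D grid for display.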
Supplementary Projections
Supplementary (a.k.a. out-of-sample) observations (or variables) can be integrated into an existing analysis performed on a different set of observations (or variables) referred to as the active data set. Supplementary data are assigned component scores by computing the least squares projection of the observations (or variables) onto the space defined by the active observations (or variables). We used supplementary projection to predict component scores for voxels from the component scores of studies defined in the semantic space (i.e., CA of studies × words). Predicted activation maps (from the supplementary scores) were projected back into MNI space. The predicted maps were then thresholded to 2% (to only show extreme voxels on either side of the distribution of factor scores) and smoothed with a Gaussian 10 mm kernel (FWHM). Smoothing and glass brain visualizations were conducted with nilearn (Abraham et al. 2014) in Python. All resulting maps (5 component maps and 6 cluster maps) are shared publicly in a Neurovault repository (http://neurovault.org/collections/2002/, last accessed June 12, 2017; Gorgolewski et al. 2016).
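In CA, a supplementary column can be projected with the standard transition formula: its profile over the rows is multiplied by the row component scores and divided by the singular values. The sketch below illustrates that formula with NumPy; it is not the code used in the study, the function names are hypothetical, and it assumes that every supplementary column (voxel) has a non-zero total and that only components with non-zero singular values are kept.

```python
import numpy as np

def ca_scores(N, n_comp):
    """Textbook CA; returns row scores F, column scores G, and singular values."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    U, sv, Vt = U[:, :n_comp], sv[:n_comp], Vt[:n_comp]
    F = U * sv / np.sqrt(r)[:, None]
    G = Vt.T * sv / np.sqrt(c)[:, None]
    return F, G, sv

def supplementary_columns(F, sv, N_sup):
    """Project supplementary columns (e.g., voxels) onto the active components.

    N_sup has the same rows (studies) as the active table; each column is
    converted to a profile (sums to 1) before applying the transition formula.
    """
    N_sup = np.asarray(N_sup, dtype=float)
    profiles = N_sup / N_sup.sum(axis=0, keepdims=True)
    return profiles.T @ F / sv
```

A convenient sanity check is that projecting the active columns themselves as if they were supplementary returns their original column scores.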
RESULTS
Correspondence Analysis (CA) was applied to a 10,898 studies × 3,114 words matrix and produced 3,112 components (see Figure 1a for the Scree plot). Split-half resampling (SHR) was applied 1,000 times to split the studies and then applied again 1,000 times to split the words of the studies × words matrix. SHR of both sets revealed that the first three components were highly replicable (Figure 1b and c) whereas Components 4 and 5 were moderately-to-highly replicable (see Figure 1b). Hierarchical clustering and cluster stability analysis were performed only on the first five (i.e., replicable) components. Cluster stability analysis revealed 6 reliable study clusters (see Supplemental Figure 1). Words were then assigned to the closest study cluster.
Figure 1. Variance explained and reproducibility (via split-half resampling; SHR) of latent semantic components. (a) The Scree plot shows the explained variance per component for all 3,112 components. Figures 1 (b) and (c) show heatmaps of correlations between component scores after SHR where (b) shows the average correlation after SHR for the words component scores and (c) shows the average correlation after SHR for the studies component scores; only components 1 through 20 are shown. Both the Scree plot and the heatmap for the studies component scores suggest three high-variance and highly reproducible components. The heatmap for the words component scores also shows that the first three components are highly reproducible, but also that Components 4 and 5 are reproducible in the words component scores.
Table 1. Total number of studies and terms per cluster, with a brief description that helps characterize the contents (i.e., studies and terms) of each cluster. Note: the cluster solution indicated six clusters.
To help interpret the components, we used the words and studies at the extremes (i.e., highest contributing variance) for each component (Figure 3 for extreme words; Supplemental tables 1–5 for extreme studies). Table 1 shows the total and relative number of studies and words per cluster. As with the components, the most (and least) frequent words within each cluster help us interpret the cluster’s meaning (Supplemental table 6). Furthermore, we identified the words that were closest to the barycenter (i.e., multidimensional mean; the origin of the axes in Figure 2) of each cluster (across all five dimensions; Supplemental table 7). We also provide the titles and PubMed IDs of the twenty studies closest to the barycenter of each cluster in Supplemental tables 8–13, as well as the overall “most average” and “most unique” studies and terms in Supplemental table 14. Component maps, which present two components at a time, are presented in Figure 2. We present component maps of the words and studies separately. In each map, we color each dot (i.e., a study or word) by its associated cluster. Components 1, 2, and 3 are visualized in Figures 2a–d. We show Components 4 and 5 separately from the other components (Figures 2e and f) because studies on Components 4 and 5 constitute a single cluster (see next section). Brain maps for the components are presented in Figure 3, and brain maps for clusters are presented in Figure 4. In Results, the components and clusters are first referred to by numbers: the component number reflects its rank order (by variance), but the cluster numbers are arbitrary. We provide interpretations of components and names for clusters right after their descriptions.
Figure 2. Visualization of component scores for both the words and the studies on Components 1 through 5. Axes are components and individual dots represent either a particular study or a particular word. Words and studies are colored by the cluster they belong to and thus illustrate the large subdomains within fMRI. Figures 2 (a) and (b) show the words and studies (respectively) component scores for Components 1 (horizontal) and 2 (vertical). Figures 2 (c) and (d) show the words and studies (respectively) component scores for Components 1 (horizontal) and 3 (vertical). Figures 2 (e) and (f) show the words and studies (respectively) component scores for Components 4 (horizontal) and 5 (vertical). While most words and studies form large groups within the axes, Components 4 and 5 show a highly specific subset of words and studies, of which nearly all are assigned to Cluster 6 (typically fMRI studies that include genetic and molecular terms).
Figure 3. Visualization of brain maps, per component, as predicted (via supplementary projections) by the words × studies component scores (left) and a word cloud that shows some words that load on either the positive or the negative side of the axis (right). (a) The projected map for Component 1 shows that the positive side (marked in red) is associated with the left temporal lobe, bilateral occipito-temporal, and parietal regions, while the negative side (marked in blue) is associated with many subcortical structures. (b) The projected map for Component 2 shows that the positive side is associated with bilateral somatosensory areas and the right cerebellum, while the negative side is associated with subcortical structures as well as medial prefrontal cortex. (c) The projected map for Component 3 shows that the positive side is associated with the left-lateralized language-related areas while the negative side is associated with somatosensory cortex in addition to the brainstem. (d) The projected map for Component 4 shows that the positive side is associated with medial structures of parietal areas (precuneus) while the negative side is associated with bilateral somatosensory areas as well as the insular cortex and brainstem. (e) The projected map for Component 5 shows that the positive side is associated with the posterior cingulate and medial prefrontal cortices while the negative side is associated with bilateral somatosensory areas and the brainstem.
Components
Component 1 separates words associated with cognitive (especially linguistic) studies from words associated with clinical or translational studies (see Fig. 2a–b). Words at the extreme positive and negative sides of Component 1 are visualized in Fig. 3a. The projected brain map for Component 1 (Fig. 3a) shows that: (1) the positive side of Component 1 is associated with the left temporal lobe, bilateral occipito-temporal, and parietal regions, and (2) the negative side of Component 1 is associated with many subcortical structures.
Component 2 separates words often associated with diffusion imaging studies from words often associated with affective and decision-making studies (see Fig. 2a–b, and Fig. 3b for words at positive and negative extremes). The projected brain maps for Component 2 (shown in Fig. 3b) show that: (1) the positive side of Component 2 is associated with bilateral somatosensory areas and the right cerebellum, and (2) the negative side of Component 2 is associated with subcortical structures as well as medial prefrontal cortex.
Component 3 separates words associated with language studies from words associated with somatosensory studies and pain (see Fig. 2c–d, and Fig. 3c for words at positive and negative extremes). The projected brain maps for Component 3 (shown in Fig. 3c) show that: (1) the positive side of Component 3 is associated with the left-lateralized language-related areas (e.g., temporal and frontal areas known as Broca’s Area and Wernicke’s Area), and (2) the negative side of Component 3 is associated with somatosensory cortex in addition to the brainstem.
Components 4 and 5 (together) identify a highly distinct set of words and studies related to molecular, genetic, and genomic neuroimaging studies (i.e., “imaging genetics;” see Fig. 2e–f, and Figs. 3d-e for words at positive and negative extremes). The brain maps for Component 4 (Fig. 3d) show that: (1) the positive side of Component 4 is associated with medial structures of parietal areas (precuneus), and (2) the negative side of Component 4 is associated with bilateral somato-sensory areas as well as the insular cortex and brainstem. The brain maps for Component 5 (in Fig. 3e) show that: (1) the positive side of Component 5 is associated with the posterior cingulate and medial prefrontal cortices, and (2) the negative side of Component 5 is associated with bilateral somato-sensory areas and the brainstem.
Component 1 reflects the research spectrum of neuroimaging: from basic science research on the positive side to clinical/translational neuroimaging research on the negative side (Figs. 2a–b). While the basic science research is more associated with cortical structures, the clinical/translational research is more linked with subcortical structures (Fig. 3a). Component 2 reflects a methodological spectrum that ranges from cognitive tasks on the negative side to multi-modal imaging (i.e., structural-functional association) on the positive side. The negative (and more densely populated) side of Component 2 is more associated with studies on affect and emotion (Fig. 2a–b) with dense projections to subcortical and prefrontal cortex, whereas the positive side of Component 2 is linked to studies that rely on multi-modal MRI and other related methodologies with somatosensory, temporal, and cerebellar projections (Fig. 3b). Component 3 reflects a low-to-high spectrum of cognition; studies about higher-order cognitive processes (especially linguistics) are on the positive side of Component 3 (with cortical projections to the bilateral temporal and frontal regions; Figs. 2c–d; Fig. 3c), whereas studies on lower-order cognitive processes (such as sensation, perception, and direct sensory stimulation) are on the negative side of Component 3 (with projections to the bilateral somatosensory areas and the brainstem in addition to the left cerebellum; Figs. 2c–d; Fig. 3c). Finally, Components 4 and 5 showed a distribution of words and studies that extended out from the origin at a nearly perfect 45° angle between the two components. These words and studies were almost entirely molecular, genetic, and genomic neuroimaging (i.e., “imaging genetics”) studies.
While studies on Component 4 were more associated with genetic contributions to cognition in healthy populations, with projections to parietal and frontal regions, studies on Component 5 were more associated with genetic contributions in disordered populations, with projections to bilateral somatosensory regions.
Figure 4. Visualization of brain maps, per cluster, computed as the sum of peak activations for all studies within a particular cluster. (a) Summed peak activations of studies in Cluster 1 (Knowledge Representation & Language Processing) showed bilateral peaks in the frontal and temporal lobes, which are often associated with language and knowledge representation, in addition to a peak area in the anterior cingulate cortex. (b) Summed peak activations of studies in Cluster 2 (Development, Lifespan and Disorders) showed a diffuse pattern of reported activations in frontal and parietal areas, as well as subcortical regions. (c) Summed peak activations of studies in Cluster 3 (Sensation, Movement and Action) appeared in bilateral somatosensory areas and the thalamus. (d) Summed peak activations of studies in Cluster 4 (Cognition & Psychology) appeared in the midline and bilateral areas in frontal regions, in addition to bilateral occipito-temporal regions. (e) Summed peak activations of Cluster 5 (Decision, Emotion, & Substance Use) appeared in subcortical areas and medial frontal regions. (f) Summed peak activations of Cluster 6 (Imaging Genetics) appeared almost entirely in subcortical areas.
Clusters
Cluster 1 contains words primarily associated with language/speech production, comprehension, and disorders, as well as knowledge processing. Some examples include: decod, word, superior, auditori, languag, semant, percept, speech, recognit, complex (see Supplemental tables 6, 7, and 8). Figures 2a–b show that this cluster primarily lies on the positive sides of Components 1 and 3. Summed peak activations of individual studies in this cluster appeared in the bilateral frontal and temporal regions, which are often associated with language and knowledge representation, in addition to the anterior cingulate cortex (see Fig. 4a).
Cluster 2 contains words associated with developmental, lifespan, and aging studies, as well as their respective disorders. Some examples include: patient, differ, chang, healthi, age, structur, breakdown, degen, ecnp, epidemiolog (see also Supplemental tables 6, 7, and 9). Figs. 2a–b show that this cluster primarily loads on the negative side of Component 1 and on the positive side of Component 2. Summed peak activations of studies in this cluster (see Fig. 4b) showed a diffuse pattern of activations in the frontal and parietal areas, as well as in the subcortical regions.
Cluster 3 contains words primarily associated with sensation (cutaneous and olfaction) and movement. Some examples include: motor, pain, movement, hand, stimul, sensori, thalamus, somatosensori, reflex, anesthet (see also Supplemental tables 6, 7, and 10). Figs. 2c–d show that this cluster loads primarily on the negative side of Component 3. Summed peak activations of studies in this cluster (Fig. 4c) appeared in the bilateral somatosensory areas and the thalamus.
Cluster 4 contains words associated with more “traditional” aspects of human cognitive neuroscience: those rooted in cognitive and experimental psychology (i.e., they rely primarily on behavioral tasks to examine neural correlates). Some examples include: activ, function, task, area, fmri, network, memori, effect, visual, decay (see also Supplemental tables 6, 7, and 11). Figures 2a–c show that Cluster 4 is closest to the origin point across all components with no apparent trend toward any axis. Summed peak activations of studies in this cluster (shown in Fig. 4d) appeared in the midline and bilateral frontal regions, in addition to bilateral occipito-temporal regions.
Cluster 5 contains studies/words that describe affective processes, such as emotional responses and decision-making, but also includes a number of studies and words related to substance use disorders and mood disorders. Cluster 5 includes the words: emot, prefront, reson, cingul, examin, medial, amygdala, negat, social, diminish, take (see also Supplemental tables 6, 7, and 12). Figs. 2a–b show that this cluster lies mostly on the negative side of Component 2. Summed peak activations of Cluster 5 (Fig. 4e) appeared in subcortical areas and medial frontal regions.
Cluster 6 loads almost entirely and exclusively on both Components 4 and 5 (see Figs. 2e–f). Cluster 6 contains words such as: variat, genet, dopamin, gene, carrier, allel, genotyp, receptor, polymorph, dopaminerg, comt, serotonin, apo, norepinephrine (see also Supplemental tables 6, 7, and 13). Summed peak activations of Cluster 6 (Fig. 4f) appeared almost entirely in the subcortical areas.
Clustering revealed semantically homogeneous subgroups within neuroimaging. Cluster 1 (henceforth: Knowledge Representation & Language Processing) represents studies that mainly investigate knowledge representation and language processing. Cluster 2 (henceforth: Developmental, Lifespan, and Disorders) represents studies that investigate developmental and adult lifespan research in addition to brain disorders. Cluster 3 (henceforth: Sensation, Movement, and Action) represents studies that investigate sensation, movement, and action. Cluster 4 (henceforth: Cognition and Psychology) represents the majority of cognition and psychological research. Cluster 5 (henceforth: Decision, Emotion, and Substance Use) represents studies that investigate decision-making, emotions, and substance use (or abuse). Cluster 6 (henceforth: “Imaging Genetics”) represents a unique dimension (i.e., Components 4 and 5) of molecular, genetic, and genomic neuroimaging (“imaging genetics”) studies.
Follow-up Analyses
Upon completion of the analyses, two clusters stood out: (1) Cluster 4 (Cognition and Psychology), which is essentially the “average” neuroimaging study because it is centered roughly on the origin of the components, and (2) Cluster 6 (“Imaging Genetics”), which comprises the studies that define Components 4 and 5. Notably, Cluster 4 (Cognition and Psychology) reflects the origins of neuroimaging use (i.e., cognitive psychology), whereas Cluster 6 (“Imaging Genetics”) reflects the current state-of-the-art (i.e., translational and interdisciplinary work).
Furthermore, we wanted to compare our (components and clusters) brain maps with some existing maps, specifically Yeo et al.’s 12-component ontology (Yeo et al. 2015; derived from thousands of fMRI studies). In Yeo et al. (2015), a hierarchical Bayesian model was applied to 10,449 experimental contrasts in the BrainMap database in order to estimate the probability that each pre-defined task category would engage a specific cognitive component, and the probability that each cognitive component would engage brain regions (represented by voxels).
Therefore, we followed up with two additional sets of analyses: (1) a breakdown of our clusters over time, and (2) a comparison of our predicted brain maps to those in Yeo et al. (2015).
Figure 5 shows the relative frequency of studies in each cluster by year. Cluster 4 (Cognition and Psychology) accounts for a substantial number of studies in the earlier years. For example, in the year 2000, approximately 50% of all neuroimaging studies (in Neurosynth) were in Cluster 4 (Cognition and Psychology). On the other hand, Cluster 5 (Decision, Emotion, & Substance Use) started as a small proportion of all neuroimaging studies in the earlier years, but now accounts for nearly 33% of all studies. We discuss the temporal properties of these clusters further in the Discussion.
Figure 5. Proportion of studies within each cluster over time. Cluster 4 (Cognition and Psychology) was, and generally still is, the core of fMRI research and as such comprises a substantial proportion of the literature. Though Cluster 4 (Cognition and Psychology) remains very large, it has decreased over time. Both Cluster 5 (Decision, Emotion, Substance Use) and Cluster 2 (Developmental, Lifespan, Disorders) have shown considerable increases over time and now each comprise a proportion of the literature comparable to that of Cluster 4 (Cognition and Psychology).
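As a rough illustration of how per-year cluster proportions of this kind could be tabulated, the following minimal Python sketch counts studies per cluster within each year and normalizes by the yearly total. The function name `cluster_proportions_by_year` and the toy data are hypothetical and are not taken from the original analysis; the sketch only assumes that each study carries a publication year and a cluster label.

```python
from collections import Counter, defaultdict


def cluster_proportions_by_year(studies):
    """Return {year: {cluster: proportion}} from (year, cluster) pairs."""
    counts = defaultdict(Counter)
    for year, cluster in studies:
        counts[year][cluster] += 1

    proportions = {}
    for year, per_cluster in counts.items():
        total = sum(per_cluster.values())
        # Normalize within each year so that proportions sum to 1.
        proportions[year] = {c: n / total for c, n in per_cluster.items()}
    return proportions


# Hypothetical toy data: one (publication year, cluster label) pair per study.
toy_studies = [
    (2000, 4), (2000, 4), (2000, 1), (2000, 5),
    (2010, 4), (2010, 5), (2010, 5), (2010, 2),
]
toy_proportions = cluster_proportions_by_year(toy_studies)
print(toy_proportions[2000])
print(toy_proportions[2010])
```

Normalizing within each year, rather than over the whole corpus, is what makes early and late years comparable even though the total number of published studies differs between them.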
Figure 6. Correlations between the maps generated by Yeo et al. (2015) and (a) our components or (b) our clusters. There are some notably high similarities between our brain maps (which were generated conditional to the latent semantic space) and the Yeo et al. (2015) maps, such as the Yeo components 11 and 12 with our Component 2 (see a), and the Yeo components 1 and 7 with our Cluster 3 (see b).
Figure 6 shows the correlations between our maps and the maps from Yeo et al. (2015). To use only valid voxels that lie within the brain, we included only non-zero voxels from the component maps. Correlations were performed with Yeo et al.’s (2015) maps available from NeuroVault (http://neurovault.org/collections/866/, last accessed June 7, 2017). We refer to Yeo et al.’s (2015) components as, e.g., Yeo Component 1 (YC1) or Yeo Component 6 (YC6) while we refer to our own components as “Component 1” or “Component 6.” There were several correlations of note for both the components (Figure 6a) and the clusters (Figure 6b). Note that although the magnitudes of those correlations are interpretable, the sign (or direction) of each correlation is not easily interpretable.
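The masking step described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the pipeline actually used: it assumes both maps have already been flattened to equal-length lists in the same voxel order, it takes "non-zero voxels" to mean voxels where our own map is non-zero, and it uses a plain Pearson correlation. The function names and toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / math.sqrt(var_x * var_y)


def masked_map_correlation(our_map, reference_map):
    """Correlate two flattened maps over voxels where our_map is non-zero."""
    if len(our_map) != len(reference_map):
        raise ValueError("maps must have the same number of voxels")
    # Assumption: the mask is defined by non-zero voxels of our own map.
    pairs = [(a, b) for a, b in zip(our_map, reference_map) if a != 0]
    if len(pairs) < 2:
        raise ValueError("need at least two non-zero voxels")
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Hypothetical toy maps: zeros in our map stand for out-of-brain voxels.
our_component = [0.0, 1.0, 2.0, 3.0, 0.0, 4.0]
yeo_component = [9.0, 2.0, 4.0, 6.0, -5.0, 8.0]
r = masked_map_correlation(our_component, yeo_component)
print(round(abs(r), 3))
```

Because the sign of such a correlation is not easily interpretable here, the usage line reports only the magnitude `abs(r)`.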
Figure 6a shows strong correlations between our Component 2 and YC11 and YC12. Both maps show strong association with bilateral subcortical structures (e.g., amygdala and striatum) in addition to their relationship with subcortical-related functions such as emotions and affect. Also, there is a strong correlation between our Component 3 and YC5 because both maps show strong association with temporal and frontal activations in addition to their relationships with semantic knowledge and language processing. Furthermore, there is a strong correlation between our Component 4 and YC6 because both maps show strong associations with medial parietal and frontal areas (commonly known as the frontal-parietal network; Smith et al. 2009).
Similar to the components, correlations between our clusters and Yeo components are illustrated in Figure 6b. Our Cluster 1 is most correlated with YC5 followed by YC3, all of which have activations in the temporal and frontal regions and are generally involved with knowledge representation and higher-order semantic processing. Our Cluster 3 is correlated with YC1 followed by YC7, both of which have activations in somatosensory areas and are involved in sensation and movement processing. Our Cluster 5 is correlated with both YC11 and YC12, all of which are associated with activations in subcortical structures and with tasks that involve some aspect of affective or emotional processing. Although our Cluster 6 also shares the same regions (and is most similar to YC11 and YC12), it comes mostly from molecular and genetic studies.
Discussion
In recent years, there have been many meta-analyses, mega-analyses (analyses of pooled data across many studies), and other large-scale analyses of data within neuroimaging. In general, the aims of such analyses are to (1) test or refute findings and hypotheses (Wager et al. 2007), (2) build a consensus around particular models, hypotheses, or theories (Salimi-Khorshidi et al. 2009), (3) estimate consistency of findings (Wager et al. 2009), (4) help define related brain regions and networks (Toro et al. 2008; Mesmoudi et al. 2013), (5) interpret functional maps (Laird et al. 2011), or (6) segment the brain in new ways with resting-state fMRI measurements (Yeo et al. 2011; Power et al. 2011) or using massive multi-modal data (Glasser et al. 2016).
Using the meta-analytic cognitive component maps from Yeo et al. (2015), we showed a substantial overlap between many of our maps and those of Yeo et al. The major difference is that our meta-analytic maps were defined conditionally through the language used (in abstracts) in the functional neuroimaging literature, whereas other authors took a more brain-centric approach, for example, network- and meta-maps generated with resting-state fMRI (Yeo et al. 2011; Power et al. 2011) or via meta-analysis of data from hundreds or even thousands of studies (Yang et al. 2016; Yeo et al. 2015; Poldrack et al. 2012; de la Vega et al. 2016).
Our components explain the primary sources of variance of language used: in the field at large (i.e., Component 1), for methodological tools (i.e., Component 2), in various aspects of cognition (i.e., Component 3), and in relatively new studies with highly-specific terminology (i.e., Components 4 and 5). With supplementary projections we also showed that these language-based components are frequently associated with particular brain regions. While the components indicate language variation and gradients, our clusters define the boundaries of functional neuroimaging into specific—albeit large—sub-domains. Furthermore, our analyses revealed that there are, perhaps, biases or preferentially studied brain areas based on domains (i.e., clusters).
Some of our components also reflect, to a degree, current debates such as the distinctions between “neurological vs. psychiatric brain disorders” (White et al. 2012). For example, Crossley et al. (2015) recently used CBMA of voxel-based morphometric (VBM) studies to provide “neuroimaging-based” evidence for the biological distinctions between neurological and psychiatric disorders. However, our components show that neuroimaging studies in neurology and psychiatry do not use the same terminology, and this difference in terminology could itself be a source of the “versus” argument. As an illustration of this contention, we have selected some of the same neurology- and psychiatry-related terms used by Crossley et al. (2015) to highlight particular features of our components. First, all the words related to psychiatric or neurological disorders (Supplemental Table 15) appear on the negative side of Component 1, a configuration that supports our interpretation of a spectrum from basic science to applied and clinical neuroimaging. Furthermore, the neurological and psychiatric terms from Crossley et al. (2015) load in opposite directions on both Components 2 and 4 (Supplemental Table 15): a configuration that reflects overall differences in patterns of terminology between neurological and psychiatric studies and thus expresses a dissociation of neurological studies and their regions (such as sensorimotor cortices and insula; in red) from psychiatric studies and their regions (such as limbic and prefrontal areas; in blue), as seen in Figure 3b.
Furthermore, the positions and contents of our clusters reveal a broad configuration of the neuroimaging literature. Cluster 4 (Cognition and Psychology) is the closest to the barycenter (origin of the axes across all components) and thus represents the average or most common neuroimaging study. This interpretation is supported by the fact that Cluster 4 (Cognition and Psychology) contains a substantial proportion of words and studies (~33% of words and ~29% of studies, see Table 1). Thus, much of the neuroimaging literature has been, and appears to still be, rooted in the approaches from cognitive and psychological domains. Summed peak activations of studies in this cluster (shown in Fig. 4d) show a high association with a wide set of cortical areas in the medial and bilateral frontal, occipital and subcortical regions that are associated with task performance. We also see opposition of clusters, which suggests that these clusters are the sources of variance for our components. For example, Cluster 5 (Decision, Emotion and Substance Use) opposes all other clusters on Component 2 (Figs. 2a–b), a pattern that further supports the neurological vs. psychiatric dissociation of Component 2. Summed peak activations of studies in this cluster (shown in Fig. 4e) show high association with the subcortical areas and medial frontal regions that are generally associated with emotional processing and decision-making processes. Similarly, Cluster 3 (Sensation, Movement and Action) is opposed to all other clusters on Component 3 (Figs. 2c–d), a component that, as we previously noted, expresses a spectrum from low-to-high level processing. Summed peak activations of studies in this cluster (shown in Fig. 4c) show high association with the bilateral somatosensory areas and the thalamus. Furthermore, Cluster 6 (“Imaging Genetics”) is almost entirely defined by the unique configuration of both Components 4 and 5 (Figs. 2e–f).
Not only does Cluster 6 reflect a unique subfield of neuroimaging, but it also indicates that “imaging genetics” uses an almost exclusive set of words, different from the vocabulary of the rest of neuroimaging (cf. the 45-degree angle from Components 4 and 5). Summed peak activations of Cluster 6 (Fig. 4f) are almost entirely associated with subcortical areas. Finally, Clusters 4 (Cognition and Psychology) and 5 (Decision, Emotion and Substance Use) together account for over half of the literature at any given time (Fig. 5).
Our clusters and their respective brain maps are consistent with the results of other meta-analyses. The activation map of Cluster 1 (Knowledge Representation and Language Processing; Fig. 4a) is similar to other published meta-analytic maps and reviews of language processing and semantic representation (Binder et al. 2009; Price 2010; Price 2012; Fedorenko & Thompson-Schill 2014; Bookheimer 2002). The activation map of Cluster 3 (Sensation, Movement and Action; Fig. 4c) is similar to other maps from studies investigating pain localization (Perini et al. 2013; Amanzio et al. 2013; Schomers & Pulvermüller 2016; Friebel et al. 2011; Vierck et al. 2013) in addition to the somatosensory co-activation network (Smith et al. 2009). Finally, the activation map of Cluster 5 (Decision, Emotion, and Substance Use; Fig. 4e) is also highly similar to the map of the structures involved in different aspects of emotional processing and decision-making (Bartra et al. 2013; Lindquist 2010; Etkin & Wager 2007; Buhle et al. 2014; Phan et al. 2002).
Many meta-analyses and meta-analytic tools for neuroimaging have a common (even if unstated) goal: to help homogenize our understanding of the literature and through this homogenization help define ontologies (Poldrack & Yarkoni 2016; Poldrack et al. 2011) so that we can relate brain function to cognition. However, even with many tools at our disposal, there are known biases in neuroimaging (Jennings & Van Horn 2012), and the language we use can make building such ontologies difficult. With well-defined language and homogenized reporting of results, fields such as genomics can provide a more robust assessment of the relationship between studies and the roles of particular genetic effects (Ailem et al. 2016).
Based on the analysis of term co-occurrences in the abstracts of 10,898 neuroimaging articles, we have identified a highly reliable set of dimensions and subfields that define the underlying semantic space of the neuroimaging field. Most researchers tend to stay within their specialized domain (by using specific key terms common to their field) and this behavior may restrict what they can conclude and how they report their findings, because they use a preferred or required terminology. In fact, Clusters 2 (Developmental, Lifespan, and Disorders) and 5 (Decision, Emotion and Substance Use), as well as Components 1 and 2, show that there are language barriers between different types of clinical and experimental studies that could preclude thorough reviews of relevant literature (see examples in Supplemental Table 16).
Because such diverse terminologies and highly specialized fields could cause researchers to overlook relevant work in domains related and unrelated to their own, several recent tools have been proposed, including one that uses Neurosynth. Recently, Yuan et al. (2017) introduced MAPBOT, which helps researchers navigate relevant studies in Neurosynth, but conditional to a region of interest. For example, in their paper, Yuan et al. (2017) use a thalamic mask to generate a study × term matrix and then decompose that matrix with non-negative matrix factorization. MAPBOT helps researchers navigate the literature and relevant terms conditional to a priori masks. Additionally, a tool called Papr (Chawla 2017; Leek 2017) was recently released to help researchers find preprints on bioRxiv that may be of interest to them. Papr is based on an approach similar to the one that we took here: in Papr, users can move through the components-based subspace to find articles whose abstracts are similar to that of a target paper, as well as locate other users with similar interests. Papr provides for bioRxiv some of the mechanisms that our approach could provide for the neuroimaging field, namely the ability to function as a recommendation engine and to allow users to interact with other users that have similar profiles. While both Papr and MAPBOT provide some tools to better navigate and search the literature (especially MAPBOT with Neurosynth), both lack the key features and information we provided here. We believe that our approach to structuring the functional neuroimaging literature is critical both to help organize the field and to help researchers navigate the literature.
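To make the idea of a recommendation engine in a components-based subspace more concrete, the following minimal Python sketch ranks studies by their similarity to a target study. It is only an illustration under stated assumptions: each study is assumed to be represented by a short vector of component scores, cosine similarity is assumed as the similarity measure (the text above does not specify one), and the function names, study identifiers, and scores are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a study at the origin has no defined direction
    return dot / (norm_u * norm_v)


def recommend(target_id, scores, top_n=2):
    """Return the top_n study ids most similar to target_id."""
    target = scores[target_id]
    ranked = sorted(
        ((cosine_similarity(target, vec), sid)
         for sid, vec in scores.items() if sid != target_id),
        reverse=True,
    )
    return [sid for _, sid in ranked[:top_n]]


# Hypothetical component scores for four studies on three components.
toy_scores = {
    "study_a": [0.9, 0.1, 0.0],
    "study_b": [0.8, 0.2, 0.1],
    "study_c": [-0.7, 0.6, 0.0],
    "study_d": [0.0, 0.1, 0.9],
}
recommended = recommend("study_a", toy_scores, top_n=2)
print(recommended)
```

In this toy example the study whose scores point in nearly the same direction as the target is ranked first, and the study loading on the opposite side of the first component is ranked last.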
To conclude, our work shows that different domains use different patterns of words, and that studies within these domains also report (or perhaps only study) specific but common brain areas. We believe that neuroimaging (and all of the domains that use and contribute to neuroimaging) would benefit from a broader harmonization of terminology (à la the COBIDAS appendix on how to report routine fMRI analyses; Nichols et al. 2016) to put the field on the path towards formal ontologies (Poldrack & Yarkoni 2016). However, there are barriers to achieving such ontologies (see examples in Supplemental Table 16). One such barrier is time, which poses difficult questions such as whether we should go back to older papers and “correct” terminology (e.g., addiction vs. substance use disorder).
Another barrier is language itself, because many terms have a variety of uses across disciplines (e.g., to recollect) and the same concept can have multiple terms that are used in different ways depending on factors such as stylistic choices by the authors (e.g., marijuana and cannabis). Another limitation is that some of the automated language tools commonly used (including by us) cannot always detect that certain stems have the same meaning (e.g., hippocampi vs. hippocampus). Formal and more rigorous ontologies, such as those in genomics, as well as tools more sensitive to the peculiarities of language will be required as our field moves forward and connects brain imaging to a variety of other modalities (e.g., genetics; Cioli et al. 2014; Rizzo et al. 2016), but will require effort from a variety of disciplines to harmonize and standardize terminology.