A Meta-Analysis of fMRI Studies of Semantic Cognition in Children

Our capacity to derive meaning from things that we see and words that we hear is unparalleled in other animal species and current AI systems. Despite a wealth of functional magnetic resonance imaging (fMRI) studies on where different semantic features are processed in the adult brain, the development of these systems in children is poorly understood. Here we conducted an extensive database search and identified 50 fMRI experiments investigating semantic world knowledge, semantic relatedness judgments, and the differentiation of visual semantic object categories in children (total N = 1,018, mean age = 10.1 years, range 4–15 years). Synthesizing the results of these experiments, we found consistent activation in the bilateral inferior frontal gyri (IFG), fusiform gyri (FG), and supplementary motor areas (SMA), as well as in the left middle and superior temporal gyri (MTG/STG). Within this system, we found little evidence for age-related changes across childhood and high overlap with the adult semantic system. In sum, the identification of these cortical areas provides the starting point for further research on the mechanisms by which the developing brain learns to make sense of its environment.

FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 5 additional clusters of consistent activation in the right-hemispheric homologues of these regions. This is because the left-lateralization of the language comprehension network, despite being present from newborn age onwards, continues to fully develop until early adolescence (especially in the IFG; Berl et al., 2014;Enge et al., 2020;Holland et al., 2007).

Literature Search
The search terms "(child OR children OR childhood OR pediatric) AND (brain mapping OR brain scan OR functional magnetic resonance imaging OR functional MRI OR fMRI OR neuroimaging) AND (semantics OR category OR categorization OR conceptual knowledge OR semantic knowledge OR semantic memory OR semantic feature OR semantic category OR semantic categorization OR semantic comprehension OR visual semantics OR visual categorization OR object categorization)" were entered into three online databases (PubMed/MEDLINE, PsycInfo, and Scopus). As of July 2020, this search yielded a total of 1,095 articles. Of these, 895 remained after removing duplicate articles and were subsequently evaluated for eligibility (see Figure 1). We prespecified ten inclusion criteria, ensuring that all articles to be included (1) were written in English, (2) reported original results from a group study (excluding review articles, meta-analyses, surveys, and case studies), (3) tested at least one group of children with a mean age of 3-12 years (range 3-15 years), (4) tested a typically developing, non-clinical sample (including healthy control groups from clinical studies), (5) performed task-based fMRI (excluding resting-state fMRI and other imaging modalities), (6) had children engage in a task probing semantic cognition (i.e., semantic world knowledge, semantic relatedness, or visual object semantics), (7) analyzed the fMRI data within the framework of the general linear model (GLM), (8) applied the same statistical threshold across the whole brain (excluding ROI analyses and partial brain coverage; Müller et al., 2018), (9) reported results as peak coordinates in standard space (Talairach or MNI), and (10) reported peaks for the within-group contrast of two semantic conditions and/or one semantic and one control condition.
Initially, this led to the inclusion of 34 articles. We consulted the introduction and reference sections of these articles as well as relevant review papers on children's semantic and language processing (Antonucci & Alt, 2011;Barquero et al., 2014;Enge et al., 2020;Leach & Holland, 2010;Martin et FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 6 al., 2015;O'Shaughnessy et al., 2008;Schlaggar & McCandliss, 2007;Skeide & Friederici, 2016;Weiss-Croft & Baldeweg, 2015) to identify additional articles not covered by our database search. Following this procedure, we identified 23 additional articles, seven of which fulfilled all inclusion criteria.
In addition to those articles fulfilling all ten inclusion criteria, 37 articles met all but the last criteria-that is, the relevant within-group peak coordinates were not reported in the published article.
In these cases, we contacted the corresponding authors to request the missing information. This led to the inclusion of three additional articles, resulting in a total of 45 articles being included in the metaanalysis.

Literature Search and Selection Workflow
Whenever one of these articles reported multiple contrasts based on the same sample of children, the coordinates from all of these contrasts were treated as a single experiment (note that FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 7 experiment is the term we use whenever we refer to our primary unit of analysis; Turkeltaub et al., 2012). This is considered good practice in order to minimize within-group effects and avoid inflating the number of independent data points included in the meta-analysis Turkeltaub et al., 2012). Conversely, whenever an article reported multiple contrasts from two or more independent samples of children, these were treated as separate experiments. This led to a final metaanalytic sample of n = 50 experiments. While each of these experiments targeted some aspect of general semantic cognition, they could further be subdivided into more homogeneous groups of experimental tasks, probing (a) semantic world knowledge (e.g., naming an object after hearing its description; n = 21), (b) semantic relatedness (e.g., hearing two words and deciding if they are related or not; n = 16), and (c) visual semantic object categories (e.g., passively viewing faces as compared to other visual stimuli; n = 13). The experiments belonging to these three task categories as well as the entire data set (probing general semantic cognition across all task categories) were meta-analyzed using two different algorithms: activation likelihood estimation and seed-based d mapping.

Activation Likelihood Estimation
Activation likelihood estimation (ALE) is the most frequently used algorithm to perform coordinate-based meta-analyses of neuroimaging experiments (Acar et al., 2018). It estimates the degree to which peak coordinates taken from independent MRI experiments, all investigating the same task and/or participant population, spatially converge to form non-random clusters of activation (Eickhoff et al., 2009Turkeltaub et al., 2002). To this end, the algorithm first recreates a modeled activation map for each of the input experiments. All voxels for which the experiment reports a peak coordinate are assigned a value of 1, whereas all of the other voxels within the gray matter mask are assigned a value of 0. Because these peaks entail spatial uncertainty and are assumed to be part of larger clusters of activation, their values are smoothed across the neighboring voxels by convolving them with a Gaussian kernel. The width of this kernel is chosen to be inversely proportional to the sample size of the experiment, reflecting the fact that larger sample sizes provide stronger evidence for the true location of any peak of activation (Eickhoff et al., 2009). When two (or more) peaks in the experiment are reported in close proximity of one another, the voxels at which their Gaussians overlap are assigned the maximum-rather than the sum-of their respective values.

FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 8
This prevents any meta-analytic cluster from receiving artificially high likelihood values merely because a large number of subpeaks has been reported for this cluster in the original article.
The modeled activation maps of the individual experiments are then combined into a single meta-analytic map by assigning an ALE value to each voxel. This ALE value is computed as the union of the modeled activation values for this voxel across the modeled activation maps for all experiments (Acar et al., 2018;Eickhoff et al., 2012): This hierarchical procedure treats the included experiments as a random subsample of all possible experiments and therefore allows the generalization across the population of possible fMRI experiments on the topic of interest (Eickhoff et al., 2009). The statistical significance of these voxelwise ALE values is determined by comparing them to an analytically derived null distribution as described by Eickhoff et al. (2012). To correct for multiple comparisons, a cluster-level family-wise error (FWE) correction procedure has been shown to offer an excellent trade-off between control over the Type I error rate and statistical power (Eickhoff et al., 2016).
For the present analysis, these steps were performed using the NiMARE package (Version 0.0.7; Salo et al., 2021) in Python (Version 3.8.8; Van Rossum & Drake, 2009). If necessary, coordinates were transformed from Talairach to MNI space using the icbm2tal transform function (Lancaster et al., 2007). The modeled activation maps were rendered in MNI152 space at 2×2×2 mm resolution (Fonov et al., 2011). For statistical thresholding, a voxel-level cluster-forming threshold of p < .001 (uncorrected) and a cluster-level threshold of p < .01 (FWE-corrected) was used. This cluster-level threshold was determined by comparing the observed cluster size to an empirical distribution built from 1,000 iterations of drawing random peak locations from the gray matter template and recording the maximal cluster size. The Nilearn package (Version 0.7.1; Abraham et al., 2014) was used for image processing and plotting while the anatomic automatic labeling atlas (AAL2; Rolls et al., 2015) as implemented in the AtlasReader package (Version 0.

Seed-Based d Mapping
While ALE estimates the spatial convergence of reported activation peaks, an alternative approach is to use the effect sizes (if available) of these peaks to infer a meta-analytic effect size for each gray matter voxel. This approach more closely resembles traditional meta-analyses of behavioral or clinical outcomes and is used by the seed-based d mapping (SDM) algorithm (Albajes-Eizagirre, Solanes, Vieta, et al., 2019). In short, it determines lower and upper bounds for possible effect size images based on the peak coordinates and their reported effect sizes (t scores or z scores). Then, a meta-analytic method correcting for non-statistically significant unreported effects (MetaNSUE; Albajes-Eizagirre, Solanes, & Radua, 2019) is used to infer the most plausible effect size and its standard error based on multiple imputations of censored information. Subsequently, all imputed data sets are meta-analyzed separately and combined using Rubin's rules. For statistical thresholding, the resulting meta-analytic map is FWE-corrected by comparing the voxel-wise observed effect size against an empirical null distribution of effect sizes built from random permutations.
The SDM algorithm was used on the same experiments and peak coordinates as for the ALE analysis but adding, if available, their reported t scores or z scores (the latter being converted to t scores with df = nchildren -1). The data were preprocessed with the SDM-PSI software (Version 6.21, https://www.sdmproject.com), using its default gray matter correlation template with a voxel size of 2×2×2 mm and a Gaussian smoothing kernel (anisotropy α = 1.0, FWHM = 20 mm). This effect-size based SDM algorithm made it possible (a) to probe the robustness of the ALE results against a change in the meta-analytic approach and (b) to statistically control for systematic differences between the included experiments by means of a covariate analysis. Note that the latter type of analysis is impossible in an approach like ALE because it disregards the effect sizes of the reported peaks and instead treats them as binary. Two separate models were computed to meet these two objectives: (a) a mean-based meta-analysis without any covariates or predictors (as in the ALE analysis) and (

Differences Between Semantic Task Categories
Beyond mapping the cortical network associated with general and task-specific semantic cognition in children, a meta-analytic subtraction analysis (Laird et al., 2005) was carried out to test for reliable differences between the three different task categories (i.e., knowledge, relatedness, and objects were considered as showing reliable differences between task categories. In addition, a meta-analytic conjunction analysis was performed to identify areas where cognitive processing was shared across all three task categories. This was done by taking the minimum ALE value at each voxel across all three task-specific ALE maps-but only for those voxels that were statistically significant in each of the three (Nichols et al., 2005).

Age-Related Changes
The same approach as just described was used to compare semantic cognition in older versus younger children. To this end, the original sample was split into equally sized groups at the median of the (mean) sample ages across experiments. These two groups of experiments were compared using the same subtraction procedure and statistical threshold as for the semantic task categories.
Additionally, we also tested for a linear influence of age by means of a meta-regression using SDM (see Section 2.3). In this linear model, the outcome of interest was not the voxel-wise effect size across experiments (as in the main SDM analysis) but those voxels whose effect size showed significant covariation with the (mean) age of the sample(s) of children contributing to it.

Comparison With Semantic Cognition in Adults
The meta-analytic results of semantic cognition in children were also compared to semantic cognition in adults as reported in a recent meta-analysis on semantic control (Jackson, 2021). To this end, we recreated their ALE analysis of general semantic cognition in adults (n = 415 experiments; see their Figure 3 and Table 3) and compared it to our child-specific ALE analysis by means of a meta-analytic subtraction and conjunction analysis as described above (see Section 2.4).

Evaluation of Robustness
Meta-analyses reflect the state of the published literature on a given topic and are therefore subject to the same biases as the original studies (e.g., small sample bias, selective reporting, file drawer problem). For behavioral and clinical meta-analyses, a range of standard tools has been developed to assess the risk of these biases as well as the robustness of the meta-analytic results against them. Some but not all of these tools can be carried over to the meta-analysis of neuroimaging data (Acar et al., 2018). For instance, it is possible to assess the degree to which the meta-analytic results depend on any individual study, which may or may not have reported false positive findings (e.g., due to low statistical power; Button et al., 2013;Ioannidis, 2005;Thirion et al., 2007).  (Samartsidis et al., 2020). While these cannot be factored into the metaanalysis directly, the simulation of imaginary file drawers with different numbers of null experiments is informative regarding the robustness of the results against this type of bias. In this context, the failsafe N (FSN) metric has been defined as the number of null experiments that can be added to the original meta-analysis without rendering its meta-analytic effect size statistically non-significant FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 12 (Rosenthal, 1979). If FSN exceeds the upper bound of experiments that can realistically be expected to be inside the file drawer, one can conclude that the file drawer problem does not suffice to explain the meta-analytic result. This logic can be extended to meta-analyses of neuroimaging studies by simulating null experiments with peaks of activation at random rather than spatially converging locations across the brain (Acar et al., 2018). Such null experiments were simulated as to resemble the original experiments in terms of their individual sample sizes and numbers of reported peak coordinates but had their peak locations drawn randomly from all possible voxels within the gray matter template. They were then added iteratively to the original experiments. At each step, the ALE analysis was repeated, recording for every voxel if it had remained part of a statistically significant cluster or not. This was repeated up to a maximum of five times the number of experiments in the original sample (e.g., FSNmax = 150 for our main analysis with n = 50 experiments). The whole procedure was repeated for 10 different (random) file drawers of null experiments.
Both of these approaches (leave-one-out and fail-safe N analysis) were performed separately for our main ALE analysis (including all 50 experiments) as well as for the task category-specific subanalyses. They were expected to be especially informative for the category-specific analyses because the low number of experiments in these sub-analyses might have reduced the robustness of the metaanalytic results (Eickhoff et al., 2016).

Literature Search
As of July 2020, a total of 45 articles reporting 50 independent fMRI experiments of semantic cognition in children were obtained by searching online literature databases (see Figure 1 and Table   1). These experiments could be grouped further into experiments probing semantic world knowledge (n = 21), semantic relatedness judgments (n = 16), and the discrimination of visual semantic object categories (n = 13). Together, they comprise fMRI data of 1,018 children (m = 20.4 per experiment, md = 15.5, range 5-67; see Figure 2) with a mean age of 10.1 years (range of mean ages 5.5-12.8 years, total age range 4-15 years). According to the original articles, 54.4% of these children were boys (45.6% were girls) and 98.4% were right-handed (1.6% were left-handed). From these experiments, a total of 687 peaks of activation were reported (m = 13.7 per experiment, md = 12, FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 13 range 1-47) and entered into the meta-analysis. Of these peaks, 400 (58.2%) were in the left hemisphere (xMNI < 0), indicating a slight degree of lateralization. There were no reliable associations across experiments between sample size, the mean age of the children under study, and the number of peaks reported (see Figure 2). Additional descriptive information about the experiments can be found in the appendix.

Distributions and Bivariate Associations of Experiment-Level Characteristics
Note. Histograms on the main diagonal show the number of experiments in the meta-analysis, binned according to their sample size, number of peaks reported, and mean age of the children under study.
Scatterplots and correlation coefficients on the off-diagonals show the bivariate associations between these experiment-level characteristics. The gray lines show the linear regression trends with their 95% confidence interval. Note. Multiple experiments were derived from the same original article if and only if they were testing independent groups of children (see Section 2.1 and Turkeltaub et al., 2012). Additional information about the experiments can be found in the Appendix. # = experiment ID, n = sample size, F = female, M = male, R = right-handed, L = left-handed, no. = number, NA = not available.

Activation Likelihood Estimation
For general semantic cognition in children, the meta-analysis using ALE revealed spatial convergence of activation across experiments in eight clusters distributed across different regions of children's cortex (see Table 2 and Figure 3). Ordered by cluster size, they were located in the left inferior frontal and precentral gyri (Clusters #1 and #8), the bilateral supplementary motor areas (Cluster #2), the left fusiform gyrus (Cluster #3), the right insular and inferior frontal cortices (Cluster #4), the left middle and superior temporal cortices (Cluster #5), the right inferior occipital gyrus and calcarine sulcus (Cluster #6), and the right fusiform gyrus (Cluster #7).

Seed-Based d Mapping
Meta-analytic effect size maps were created based on the test statistics (z scores or t scores) of the reported peak coordinates. These were available for 461 (67.1%) of all peak coordinates, whereas the test statistics for the remaining 226 peak coordinates were inferred via multiple imputations as described in Albajes-Eizagirre et al. (2019). This alternative meta-analytic approach yielded qualitatively similar results as ALE: The largest clusters (and highest effect sizes) were again observed in the left inferior frontal gyrus, the bilateral supplementary motor areas, and the left middle and superior temporal gyri (see Figure 3C and Table 2). Three noteworthy differences between the results from these two different meta-analytic algorithms were (a) that the size of the significant clusters was larger overall for SDM as compared to ALE, (b) that one cluster in the right visual cortex was observed with ALE but not with SDM, and (c) that one cluster in the left angular gyrus was observed with SDM but not with ALE.
The effect size-based approach not only served as a robustness check for the main (ALE) analysis but also made it possible to re-assess the results while controlling for four different linear covariates of no interest (namely the mean age of the sample, the modality of stimulus presentation, the modality of children's response, and the statistical software package used for data analysis). This again yielded qualitatively similar results, although cluster sizes were larger than in the original analysis without covariates (see Figure 3D and Table 2). The same SDM analysis and statistical thresholding but controlling for four linear covariates of no interest, namely the mean age of the sample, the modality of stimulus presentation, the modality of children's responses, and the statistical software package used in the original article.

Differences Between Semantic Task Categories
Task category-specific sub-analyses for experiments probing semantic world knowledge (e.g., naming a word after hearing its description) and for experiments probing semantic relatedness (e.g., hearing two words and deciding if they are related or not) both showed the largest clusters of activation in the bilateral supplementary motor areas, in the pars triangularis of the left inferior frontal gyrus, and in the right insular and inferior frontal cortices (see Figure 4 and Table 3). For experiments probing semantic relatedness, there was one additional cluster in the left middle temporal gyrus. The sub-analysis for experiments probing the discrimination of visual semantic object categories showed three clusters of consistent activation in the bilateral fusiform gyri as well as in the visual cortex of the right occipital lobe.

Table 3
Statistics for the Meta-Analytic Clusters Shown in Figure 4 Analysis  These task-specific meta-analytic maps were contrasted against one another to examine where they differed reliably from one another (see Figure 5 and Table 4). First, experiments probing semantic knowledge showed more consistent activation than the other two task categories in two small clusters in the left insular and middle frontal cortices. Second, tasks probing semantic relatedness showed more consistent activation than the other two task categories in the pars opercularis of the left inferior frontal gyrus and in the left middle temporal gyrus. Finally, tasks probing visual semantic object categories showed more consistent activation than the other two task categories in the bilateral occipital and fusiform cortices as well as in the right superior parietal cortex. They also showed reliably less activation than the other two task categories in one small cluster in the left medial frontal lobe (shown in blue in Figure 5C). The conjunction analysis revealed that there were no regions of activation that were shared by all three semantic task categories (as seen also in Figure 4B-D).

Figure 5
Differences Between Semantic Task Categories

FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 23
Note. For each task category shown in Figure 4, the meta-analytic ALE map was contrasted against the map for the experiments from the other two task categories. The resulting subtraction maps were thresholded at p < .001 (uncorrected) at the voxel level and k > 25 connected voxels (200 mm 3 ) at the cluster level.

Table 4
Statistics for the Meta-Analytic Clusters Shown in Figure 5 Analysis Note. Peak anatomical labels are based on the anatomic automatic labeling atlas (AAL2; Rolls et al., 2015). # = cluster ID, L = left, R = right, oper. = pars opercularis.

Age-Related Changes
Age-related changes in fMRI activation patterns across the 50 experiments were examined (a) by splitting the sample of experiments in half at the median of the (mean) sample ages (mdmean age = 10.3 years) and performing an ALE subtraction analysis as described above (see Section 2.4) and (b) by entering mean sample age as a linear predictor in a meta-regression model using the effect sizebased SDM approach (see Section 2.3).
The median split-based approach showed more consistent activation in experiments with older (> 10.3 years) as compared to younger (< 10.3 years) children only at the right putamen and insula (see Figure 6A and Table 5). The effect size-based approach showed no age-related changes   Note. Peak anatomical labels are based on the anatomic automatic labeling atlas (AAL2; Rolls et al., 2015). # = cluster ID, R = right.
However, one should consider that meta-analytic null effects for study-level moderating variables may in part reflect the lack of statistical power for detecting them (Hempel et al., 2013). This is especially true when the variable of interest (here: mean sample age) has a restricted variance (see Figure 2 and Table 1). To mitigate this lack of statistical power in a post hoc fashion, we present the uncorrected and non-thresholded z score map from the effect size-based meta-regression in Figure   6B. This map suggests an age-related decrease of effect sizes in the left middle/superior temporal gyrus (peak z = -2.55) and an age-related increase of effect sizes in the left inferior frontal gyrus (peak z = 2.33). To a lesser extent, the increase in the inferior frontal gyrus is mirrored in the right hemisphere, consistent with the median split-based result from ALE. However, none of these peaks survived our initial cluster-forming threshold and therefore additional experiments will be needed to confirm if these age-related changes turn out to be reliable on a meta-analytic level.

Comparison With Semantic Cognition in Adults
A recent meta-analysis by Jackson (2021) used ALE to synthesize the fMRI literature on semantic control in adults. They also broadened their analysis to 415 studies of general semantic cognition and found wide-ranging clusters of consistent activation especially in the left hemisphere, spanning multiple areas in the temporal and inferior frontal lobes as well as the supplementary motor area (see Figure 7A and Table 6 for a reproduction of these results based on the peak coordinates kindly provided by the original author). This meta-analytic map of semantic cognition in adults was compared to the map of semantic cognition in children by means of a subtraction and conjunction analysis. This revealed more consistent activation in children as compared to adults in multiple posterior regions of the cortex, including the bilateral inferior temporal and right inferior parietal gyri (see Figure 7B and Table 6). In contrast, adults showed more consistent activation in the anterior part of the left middle and inferior temporal gyri as well as deep in the left calcarine sulcus. Finally, the conjunction analysis indicated large areas of overlap between the two groups in the left inferior frontal FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 26 gyrus, the supplementary motor area, the left middle and superior temporal gyri, the right insular and inferior frontal cortices, and the left fusiform gyrus (see Figure 7C and Table 6).

Evaluation of Robustness
The robustness of the meta-analytic results against two different types of publication bias -spurious findings and the file drawer problem-was assessed using a leave-one-out (jackknife) analysis and a fail-safe N analysis. Both of these analyses were conducted for the entire sample of all 50 semantic experiments (see Figure 3B and Table 2) and for each of the three task category-specific sub-analyses (see Figure 4B-D and Table 3). The leave-one-out procedure showed that all clusters detected in the main analysis were robust against the deletion of individual studies, with an average leave-one-out robustness of 96% across the eight clusters (range 84-100%; see Figure 8). For the sub-analysis of semantic relatedness experiments, the robustness of all four clusters was at 100%, whereas for semantic knowledge and #4 (left IFG, 81%). Finally, for visual semantic object category experiments, it was slightly reduced (85%) for all three clusters. Together, this reflects good overall robustness against spurious experiments in the meta-analysis, although this robustness was compromised slightly for the subanalyses that were run on fewer experiments (Eickhoff et al., 2016).
The fail-safe N analysis showed that most clusters were robust against the file drawer problem. This was indicated by the fact that in these cases, the number of (unpublished) null experiments that needed to be added until overturning the statistical significance of the cluster exceeded the number of (published) experiments in the original analysis (see Figure 9). The only clusters were this was not the case were Clusters #5 (right occipital lobe; FSN = 18), #6 (right fusiform gyrus; FSN = 15), and #8 (left inferior frontal gyrus; FSN = 1). Note, however, that Clusters #5 and #6 still marginally exceeded the desired value of 30% of the original sample size (see Section 2.7 and Samartsidis et al., 2020). For the task category-specific sub-analyses, FSN values were high overall, except for the knowledge-related Clusters #3 (right insula; FSN = 1) and #4 (left inferior frontal gyrus; FSN = 2) as well as the object-related Clusters #1 (right fusiform gyrus; FSN = 8), #2 (left fusiform gyrus; FSN = 5), and #3 (right occipital lobe; FSN = 11). All but the two knowledgerelated clusters exceeded the desired threshold of 30%, once more giving the overall impression of satisfactory robustness to publication bias. However, these two specific clusters need to be interpreted with caution and would require additional support by future fMRI experiments.

Discussion
Here we systematically localized the brain areas underlying semantic cognition in children by means of a coordinate-based meta-analysis of fMRI studies. We identified 50 individual experiments scanning children with a mean age of 3-12 years using a variety of semantic tasks. Pooling across the reported peak coordinates from all of these experiments, we found evidence for consistent activation in sub-regions of the left perisylvian language network associated with lexical processing (left MTG/STG and IFG) as well as in the bilateral SMA, the right insula, and more posterior brain regions in the bilateral fusiform and right occipital cortices. These areas were recruited to a different degree by different semantic task categories: Inferior frontal regions and the SMA were recruited preferentially during tasks tapping into semantic knowledge (e.g., naming an object after hearing its descriptions) and semantic relatedness (e.g., hearing two words and deciding if they are related or not), while posterior regions were recruited preferentially during tasks tapping into the differentiation of visual object categories (e.g., passively viewing faces as compared to other visual stimuli).
The left MTG/STG and the pars triangularis of the left IFG are known to be implicated in semantic processing from at least 3 years of age onwards (Skeide et al., 2014). There is also some evidence that children are able to process word meaning in the left MTG with as little as 2 years of age (Friedrich & Friederici, 2010;Travis et al., 2011). Within the left IFG, semantic processing especially recruits the more anterior parts (pars triangularis and pars orbitalis; Brauer & Friederici, 2007;Nuñez et al., 2011;Skeide et al., 2014;Skeide & Friederici, 2016) which also showed the strongest meta-analytic peaks in our study. These sub-areas seem to play an especially crucial role in children's language processing, as they allow them to successfully retrieve the semantic meaning of grammatically challenging sentences even though their syntactic abilities (localized in the pars opercularis of the left IFG in adults) are not yet fully developed (Skeide et al., 2014).
Just as the left IFG and MTG/STG, the bilateral SMA also showed meta-analytically robust activation in children performing semantic knowledge and relatedness tasks. This mirrors previous meta-analyses of semantic cognition in adults (e.g., Binder et al., 2009;Jackson, 2021) as well as meta-analyses of language comprehension in both children (Enge et al., 2020;Martin et al., 2015) and adults (e.g., Ferstl et al., 2008;Rodd et al., 2015). The premotor activation could reflect a grounding FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 32 of abstract semantic concepts in articulatory motor representations (Martin, 2016;Pulvermüller & Fadiga, 2010). Alternatively or in addition to this, activation in the anterior part of the SMA (pre-SMA) may also reflect higher-order cognitive control processes such as ambiguity resolution and the integration of semantic context (Hertrich et al., 2016). This is supported by our observation that the SMA showed no consistent activation when children performed visual object category tasks. These tasks would in many cases afford a similar degree of pre-motor response (e.g., when viewing tools; e.g., Dekker et al., 2014;Kersey et al., 2016) but arguably a lesser degree of cognitive control compared to tasks probing semantic knowledge or relatedness (see Binder et al., 2009, for similar findings focusing exclusively on experiments with linguistic stimuli in adults). Note, however, that only a limited number of visual object category tasks (n = 13) could be included in the present metaanalysis, presumably limiting statistical power (Eickhoff et al., 2016).
Finally, the ventral temporal cortex (fusiform gyrus and adjacent areas) is well-known to house category-selective neuronal populations that respond primarily to certain categories of visual stimuli (e.g., faces in the fusiform face area [FFA] or objects in the lateral occipital complex [LOC]; Grill-Spector & Weiner, 2014;Haxby et al., 2001). Accordingly, these regions showed consistent activation only for tasks in which children viewed different visual semantic object categories. A subset of small areas within this ventral temporal and occipital region also showed the most prominent increase of meta-analytic activation for children as compared to adults. This may reflect that children need to recruit these patches of cortex to a stronger degree than adults to distinguish between different kinds of visual stimuli (see also Antonucci & Alt, 2011). We cannot preclude, however, that this group difference could also be driven by differences in the kinds of tasks and baseline conditions chosen to investigate semantic cognition in children and adults. Furthermore, the semantic categorization of different kinds of visual objects is confounded with lower-level sensory differences between them. These visual confounds might at least partially explain our meta-analytic results for this task category (e.g., the activation of the right early visual cortex; see Figure 4D).
There was one region, namely the left ATL, that is oftentimes considered to be at the core of the semantic system (Lambon Ralph et al., 2017;Patterson et al., 2007) but did not show any metaanalytically consistent activation in children whatsoever. In adults, the ATL serves as an amodal FMRI STUDIES OF SEMANTIC COGNITION IN CHILDREN 33 "hub" connecting different modality-specific sites within the wider semantic network (e.g., speech processing in the IFG and visual semantics in the occipital and ventral temporal cortices).
Neuroimaging studies of semantic cognition in children seem to elicit significantly less of such ATL activation (see Figure 7B), suggesting that this semantic hub may need time to develop over childhood and into adolescence (Hwang et al., 2013, but also see Stevens et al., 2009). In contrast, most other regions of the semantic network showed at least some overlap between children and adults, especially in the left IFG, bilateral SMA, right insula, left MTG/STG, and left FG (see Figure 7C and Jackson, 2021).
Because meta-analyses depend critically on the quality of the underlying literature, they can be prone to a number of biases (e.g., positivity bias and selective reporting). Tools to detect the presence of such biases are less well developed in meta-analytic frameworks for neuroimaging as compared to clinical or behavioral outcomes (Acar et al., 2018). However, in the present study, the number of reported peak coordinates-a very rough analogue of an experiment-specific effect size for fMRI studies-was unrelated to the sample size of the experiments (see Figure 2). This is consistent with small sample bias and/or selective reporting in larger studies. To assess the robustness of our meta-analytic results against these kinds of biases, we first conducted a leave-one-out analysis which showed that all clusters were considerably invariant against the deletion of individual experiments from the sample. This means that we would have obtained identical results even if any of the original experiments reported only spurious activations. Second, we also conducted a fail-safe N analysis in which we estimated the number of unpublished null experiments (i.e., experiments without any consistent pattern of activation) that have to be added until the significance of any observed cluster is overturned. This number was larger than 30% of the meta-analytic sample size for almost all clusters.
It thereby exceeded the current most conservative estimate for the actual number of unpublished fMRI experiments that are hidden in the "file drawer" (Samartsidis et al., 2020). Thus, although our metaanalysis could not directly assess or correct for publication bias in the underlying literature, its results seem to be stable even if one accepts that such biases are present.

Conclusion
Conducting fMRI experiments with children is challenging and costly, which is why sample sizes are often lower than in behavioral experiments or in neuroimaging experiments with adult participants. Meta-analyses are therefore necessary to filter out spurious results and to uncover similarities and differences between different task paradigms. Regarding children's capacity to process semantic information, our coordinate-based meta-analysis showed reliable patterns of activation in the left IFG and MTG/STG, the bilateral SMA, the right insula and parts of the bilateral ventral temporal and occipital cortices. Within this network, tasks probing children's semantic world knowledge and semantic relatedness between stimuli showed overlapping spots of activation that were distinct from those seen in tasks probing the differentiation of visual semantic object categories.
A comparison to the adult semantic system revealed largely overlapping regions of activation but also more child-specific activation in bilateral inferior temporal and occipital regions as well as more adult-specific activation in the anterior portion of the left temporal lobe.      Note. For articles reporting fMRI data from multiple groups (e.g., children and adults or typically developing children and neurodiverse children), the groupidentifier indicates which group(s) were included as experiments in the meta-analysis. Please refer to Table 1