Abstract
Hypothesis testing and replication in neuroimaging studies both rely heavily on treating gross anatomical regions as unitary entities for inferential purposes, using them as implicit spatial models. However, data collection and analyses are conducted at the voxel level, and this discrepancy between the unit of analysis and the unit of inference leads to ambiguity and flexibility in which findings researchers interpret as replications and confirmations of a priori hypotheses. For example, hypothesizing effects on “amygdala activity” does not provide a falsifiable and reproducible definition of precisely which voxels or which patterns of activation should be observed; rather, it comprises a large number of unspecified sub-hypotheses, leaving room for flexible interpretation of findings, which we refer to as “model degrees of freedom.” From a survey of 135 functional magnetic resonance imaging (fMRI) studies in which researchers claimed replications of previous findings, we found that 42.2% of the studies did not report voxel-level evidence for replication at all. Only 14.1% of the papers used exact coordinate-based or a priori pattern-based models. When we compared the peak coordinates between the original and replication studies that reported peak information, 42.9% of the “replicated” findings had peak coordinates more than 15 mm apart, suggesting that these replications involve two maps that differ substantially at the voxel level. To reduce reliance on flexible, qualitative region-level tests in neuroimaging studies, we recommend adopting quantitative and specific spatial models and tests that are defined at the voxel level to assess replications and hypotheses. These include permutation tests on peak distance and a priori multivariate pattern-based models. These practices will help researchers establish precise and falsifiable spatial hypotheses, promoting a cumulative science of neuroimaging.
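
To make the idea of a permutation test on peak distance concrete, the sketch below shows one possible formulation. It is an illustration under stated assumptions, not necessarily the procedure used in this work: it assumes a set of matched original and replication peaks (one pair per contrast, in millimeter coordinates), uses the mean Euclidean distance between matched peaks as the test statistic, and builds the null distribution by shuffling which replication peak is paired with which original peak. The function name `peak_distance_permutation_test` and the example coordinates are hypothetical.

```python
import numpy as np


def peak_distance_permutation_test(orig_peaks, rep_peaks, n_perm=10000, seed=0):
    """Test whether matched peaks are closer than expected under random pairing.

    orig_peaks, rep_peaks: (n, 3) arrays of peak coordinates in mm, where row i
    of each array belongs to the same contrast (an assumption of this sketch).
    Returns the observed mean distance (mm) and a one-sided p-value.
    """
    orig = np.asarray(orig_peaks, dtype=float)
    rep = np.asarray(rep_peaks, dtype=float)
    if orig.ndim != 2 or orig.shape[1] != 3 or orig.shape != rep.shape:
        raise ValueError("expected two arrays of identical shape (n, 3)")

    rng = np.random.default_rng(seed)

    # Test statistic: mean Euclidean distance between matched peaks.
    observed = np.linalg.norm(orig - rep, axis=1).mean()

    # Null distribution: break the pairing by permuting replication peaks.
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rep[rng.permutation(len(rep))]
        null[i] = np.linalg.norm(orig - shuffled, axis=1).mean()

    # One-sided p-value: how often random pairing is at least as close.
    # The +1 terms count the observed pairing itself as one permutation.
    p_value = (1 + np.sum(null <= observed)) / (1 + n_perm)
    return observed, p_value


# Hypothetical peak coordinates (mm), for illustration only.
original = np.array([[-22, -4, -18], [24, -2, -20], [0, 52, 8],
                     [-40, 22, 2], [6, 20, 36]])
replication = np.array([[-20, -6, -16], [26, 0, -22], [2, 50, 10],
                        [-42, 24, 0], [4, 22, 34]])

observed_mm, p = peak_distance_permutation_test(original, replication, n_perm=2000)
print(f"mean peak distance: {observed_mm:.1f} mm, p = {p:.4f}")
```

A small p-value here indicates that the matched peaks lie closer together than arbitrary pairings of peaks from the same two studies. With only a handful of peaks the number of distinct permutations is small, which limits the attainable p-value, so this formulation is most informative when several contrasts are compared at once.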