Abstract
When fields lack consensus standard methods and accessible ground truths, reproducibility can be more of an ideal than a reality. Such has been the case for functional neuroimaging, where there exists a sprawling space of tools and processing pipelines. We provide a critical evaluation of the impact of differences across five independently developed minimal preprocessing pipelines for functional MRI. We show that even when handling identical data, inter-pipeline agreement was only moderate, shedding light on a factor that limits cross-study reproducibility. We show that low inter-pipeline agreement becomes appreciable mainly when the reliability of the underlying data is high, which is increasingly the case as the field progresses. Crucially, we show that when inter-pipeline agreement is compromised, so too is the consistency of insights from brainwide association studies. We highlight the importance of comparing analytic configurations, as both widely discussed and commonly overlooked decisions can lead to marked variation.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
We added an analysis to the main text showing that, in a BWAS experiment, comparable prediction accuracies can be obtained using differing minimal preprocessing pipelines, but the comparability of the insights obtained (i.e., which model features exhibit high predictive value) depends on inter-pipeline agreement: it is low when inter-pipeline agreement is low and high when inter-pipeline agreement is high.
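To make the footnote's point concrete, below is a minimal sketch (not the authors' analysis) of how one might quantify inter-pipeline agreement and the cross-pipeline consistency of BWAS model features. All data are synthetic, and the choices of Pearson correlation as the agreement metric and ridge-regression weights as "model features" are assumptions for illustration only.

```python
# Minimal illustrative sketch, assuming synthetic connectivity data,
# Pearson r as the agreement metric, and ridge weights as "insights".
# Not the authors' pipeline or analysis code.
import numpy as np
from numpy.random import default_rng
from sklearn.linear_model import Ridge

rng = default_rng(0)
n_subjects, n_edges = 200, 500

# Shared "true" connectivity plus pipeline-specific noise; the noise
# scale controls how well the two pipelines agree.
truth = rng.standard_normal((n_subjects, n_edges))

def pipeline_output(noise_scale):
    return truth + noise_scale * rng.standard_normal((n_subjects, n_edges))

pipe_a = pipeline_output(noise_scale=0.3)  # high-agreement regime
pipe_b = pipeline_output(noise_scale=0.3)

# Inter-pipeline agreement: mean within-subject correlation of the two
# pipelines' edge vectors.
agreement = np.mean(
    [np.corrcoef(a, b)[0, 1] for a, b in zip(pipe_a, pipe_b)]
)

# A toy BWAS: predict a phenotype from connectivity edges, fit
# separately on each pipeline's output.
phenotype = truth @ rng.standard_normal(n_edges) + rng.standard_normal(n_subjects)
w_a = Ridge(alpha=10.0).fit(pipe_a, phenotype).coef_
w_b = Ridge(alpha=10.0).fit(pipe_b, phenotype).coef_

# Consistency of "insights": do the two pipelines assign high predictive
# value to the same edges? Compared via correlation of weight vectors.
consistency = np.corrcoef(w_a, w_b)[0, 1]
print(f"agreement={agreement:.2f}, feature consistency={consistency:.2f}")
```

Raising the hypothetical noise_scale lowers inter-pipeline agreement; in this toy setup, out-of-sample prediction accuracy can remain similar for both pipelines even as the correlation between their weight vectors falls, mirroring the dissociation described in the footnote.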