Searching for Consistent Brain Network Topologies Across the Garden of (Shortest) Forking Paths

The functional interactions between regions of the human brain can be viewed as a network, empowering neuroscientists to leverage tools such as graph theory to obtain insight into brain function. However, obtaining a brain network from functional neuroimaging data inevitably involves multiple steps of data manipulation, which can affect the organisation (topology) of the resulting network and its properties. Test-retest reliability is a gold standard for both basic research and clinical use: a suitable data-processing pipeline for brain networks should recover the same network topology across repeated scan sessions of the same individual. Analyzing resting-state functional Magnetic Resonance Imaging (rs-fMRI) recordings from two test-retest studies across short-term (45 minutes), medium-term (2-4 weeks) and long-term (5-16 months) delays, we investigated the reliability of network topologies constructed by applying 576 unique pipelines to the same fMRI data, obtained from considering combinations of atlas type and size, edge definition and thresholding, and use of global signal regression. We adopted the portrait divergence, an information-theoretic criterion that measures differences in network topology across all scales, enabling us to quantify the influence of different pipelines on the overall organisation of the resulting network. Remarkably, our findings reveal that the choice of pipeline plays a fundamental role in determining how reproducible an individual’s brain network topology will be across different scans: there is large and systematic variability across pipelines, such that an inappropriate choice of pipeline can distort the resulting network more than an interval of several months between scans. Across datasets and time-spans, we also identify specific combinations of data-processing steps that consistently yield networks with reproducible topology, enabling us to make recommendations about best practices to ensure high-quality brain networks.

The study consisted of two visits (separated by 2-4 weeks). For each visit, resting-state fMRI was acquired for 5:20 minutes using a Siemens Trio 3T scanner (Erlangen, Germany). Functional imaging data were acquired using an echo-planar imaging (EPI) sequence with parameters TR 2,000 ms, TE 30 ms, flip angle 78°, FOV 192 × 192 mm², in-plane resolution 3.0 × 3.0 mm, 32 slices 3.0 mm thick with a gap of 0.75 mm between slices. A 3D high-resolution MPRAGE structural image was also acquired, with the following parameters: TR 2,300 ms, TE 2.98 ms, flip angle 9°, FOV 256 × 256 mm². Task-based data were also collected, and have been analysed before to investigate separate experimental questions (Vatansever et al.,

The simplest approach to decide which edges to retain is to accept or reject edges based on a

For all filtering schemes considered here, edges that were not selected were set to zero.
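The rule that unselected edges are set to zero can be illustrated with a minimal sketch. The selection criterion used here (keep an edge when its weight is at least a fixed `threshold`) is an assumption made purely for illustration and is not claimed to be one of the filtering schemes of this study; the function name `filter_edges`, the removal of self-connections, and the toy values are likewise hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def filter_edges(weights: Matrix, threshold: float) -> Matrix:
    """Return a copy of `weights` in which rejected edges are set to zero.

    Assumed (hypothetical) selection rule: an edge is kept only if its
    weight is at least `threshold`. Self-connections on the diagonal are
    always removed, which is also an assumption of this sketch.
    """
    n = len(weights)
    return [
        [
            weights[i][j] if i != j and weights[i][j] >= threshold else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy symmetric connectivity matrix for three regions (made-up values).
toy = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.4],
    [0.1, 0.4, 1.0],
]

# Edges weaker than 0.3 are rejected and therefore set to zero.
filtered = filter_edges(toy, threshold=0.3)
```

Whatever the selection rule, the output keeps the shape of the input matrix, so networks produced by different filtering schemes remain directly comparable node by node.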

However, edges that were included in the network could be weighted or unweighted. In the case of unweighted (binary) networks, we set all non-zero edges to unity. Otherwise, their original weight was retained.

For each subject, at each timepoint, we obtained one brain network following each of the

Being grounded in information theory, the portrait divergence between two networks can be interpreted as measuring how much information is lost when using one network to represent another. It ranges from 0 (no information loss) to 1 (complete information loss); in the present case, since the two networks are derived from different scans of the same individual, we aim to identify pipelines that minimise test-retest PDiv.

Second, our results indicate substantial consistency across the three time intervals considered here, in terms of which data-processing steps feature prominently among the pipelines that are best (and worst) at minimising the average within-subject PDiv. Additionally, correlation between all pipelines' ranks across time intervals indicated very high consistency for short-term NYU and long-term NYU (Spearman's rho = 0.94, p < 0.001). While these two timespans are the most different among those considered here, the data were acquired on the same subjects. However, the pipelines' ranks obtained from both short- and long-term NYU were also significantly and positively correlated with the ranks obtained in the independent Cambridge dataset (Cambridge vs NYU-short: Spearman's rho = 0.31, p < 0.001; Cambridge vs NYU-long: Spearman's rho = 0.32, p < 0.001) (Fig. 3), indicating that pipelines' suitability for network construction is not dataset-specific but rather can generalise to different groups of individuals.

consistently present across the best-performing pipelines (Fig. 4 indicates the relative prevalence of each pipeline option among the 20 overall best- and worst-performing pipelines).
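The kind of tally summarised in Fig. 4 (the prevalence of each pipeline option among the best- and worst-performing pipelines) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code of this study: the function `option_prevalence`, the option labels and the PDiv values are all hypothetical, and pipelines are assumed to be ranked by their mean within-subject PDiv, with lower values being better.

```python
from collections import Counter
from typing import Dict, Tuple

Pipeline = Tuple[str, ...]  # one label per data-processing step


def option_prevalence(
    mean_pdiv: Dict[Pipeline, float], n: int, best: bool = True
) -> Dict[str, float]:
    """Fraction of the n best (or worst) pipelines that use each option.

    Pipelines are ranked by mean PDiv; lower PDiv means more
    reproducible topology, so "best" selects the smallest values.
    """
    ranked = sorted(mean_pdiv, key=lambda p: mean_pdiv[p], reverse=not best)
    selected = ranked[:n]
    counts = Counter(option for pipeline in selected for option in pipeline)
    return {option: count / len(selected) for option, count in counts.items()}


# Made-up scores for four hypothetical pipelines (atlas, edge type, GSR).
scores = {
    ("atlas_A", "weighted", "GSR"): 0.20,
    ("atlas_A", "binary", "GSR"): 0.25,
    ("atlas_B", "weighted", "noGSR"): 0.40,
    ("atlas_B", "binary", "noGSR"): 0.45,
}

best2 = option_prevalence(scores, n=2)               # two lowest-PDiv pipelines
worst2 = option_prevalence(scores, n=2, best=False)  # two highest-PDiv pipelines
```

In this toy example the weighted and binary options each appear in half of both the best and the worst pipelines, whereas the atlas and GSR labels separate the two groups completely. An option that is equally prevalent at both ends of the ranking carries no information on its own.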

Importantly, there is a set of options that consistently feature as part of the pipelines that most

Interestingly, both the best- and worst-performing pipelines involve the use of weighted rather than binary edges. This shows that, when considered in isolation, this step is not indicative of whether the pipeline as a whole will produce good results. Indeed, although the case of binary vs weighted edges is especially obvious, systematically investigating each step revealed that there is large overlap between pipelines that differ in only one step, in terms of their distributions of PDiv values, with consistently large variance (Fig. S1-S6).

Therefore, a pipeline's performance is not solely attributable to any specific step: rather, it is due to the synergy between combinations of steps. Indeed, in addition to the consistent presence of many specific options among the best pipelines (as shown in Fig. 4), there are also systematic patterns in how these options are combined. This is especially evident when focusing on the

systematically investigated 576 unique pipelines that a neuroscientist could adopt to obtain brain networks from resting-state fMRI data, arising from the combination of several key data-processing steps. Rather than choosing any arbitrary graph-theoretical property for our comparisons, we focused on the networks' overall topology across all scales, which enabled us to compare the pipelines in terms of their ability to recover similar brain network topologies in data obtained from the same individual across short (minutes), medium (weeks) and long

In summary, our findings reveal large and systematic differences across pipelines in terms of

than focusing on topological test-retest reliability between different scans using the same pipeline, as we did here). In other studies, OMST has also outperformed alternative

datasets for individual participants, we expect that such differences should be randomly distributed and thus cancel out at the group level. Additionally, we note that none of these potentially relevant factors are likely to influence functional brain networks obtained within the same hour, as we studied in the short-term NYU dataset; and remarkably, pipelines that minimise topological distance in the short term tend to also do so over longer periods of time.

Nevertheless, it will be important for future work to demonstrate the validity of our

However, this issue highlights the broader point we raised above: test-retest reliability is only one of several aspects that should be taken into account before deciding whether to include a given step as part of one's chosen pipeline for network analysis, and study-specific considerations will also come into play.

Reliability across the lifespan should also be considered by comparing age groups, as early evidence has revealed age-related differences in test-retest reliability of rs-fMRI (Song et al., 2012). The choice of the optimal pipeline for test-retest reliability may therefore vary by clinical characteristics, which still remains to be ascertained using topology-based approaches such as portrait divergence. This is an important next step following the present work.