Tractography density affects whole-brain structural architecture and resting-state dynamical modeling

Dynamical modeling of the resting-state brain dynamics essentially relies on the empirical neuroimaging data utilized for the model derivation and validation. There is however still no standardized data processing for magnetic resonance imaging pipelines and the structural and functional connectomes involved in the models. In this study, we thus address how the parameters of diffusion-weighted data processing for structural connectivity (SC) can influence the validation results of the whole-brain mathematical models and search for the optimal parameter settings. On this way, we simulate the functional connectivity by systems of coupled oscillators, where the underlying network is constructed from the empirical SC and evaluate the performance of the models for varying parameters of data processing. For this, we introduce a set of simulation conditions including the varying number of total streamlines of the whole-brain tractography (WBT) used for extraction of SC, cortical parcellations based on functional and anatomical brain properties and distinct model fitting modalities. We observed that the graph-theoretical network properties of structural connectome can be affected by varying tractography density and strongly relate to the model performance. We explored free parameters of the considered models and found the optimal parameter configurations, where the model dynamics closely replicates the empirical data. We also found that the optimal number of the total streamlines of WBT can vary for different brain atlases. Consequently, we suggest a way how to improve the model performance based on the network properties and the optimal parameter configurations from multiple WBT conditions. Furthermore, the population of subjects can be stratified into subgroups with divergent behaviors induced by the varying number of WBT streamlines such that different recommendations can be made with respect to the data processing for individual subjects and brain parcellations. Author summary The human brain connectome at macro level provides an anatomical constitution of inter-regional connections through the white matter in the brain. Understanding the brain dynamics grounded on the structural architecture is one of the most studied and important topics actively debated in the neuroimaging research. However, the ground truth for the adequate processing and reconstruction of the human brain connectome in vivo is absent, which is crucial for evaluation of the results of the data-driven as well as model-based approaches to brain investigation. In this study we thus evaluate the effect of the whole-brain tractography density on the structural brain architecture by varying the number of total axonal fiber streamlines. The obtained results are validated throughout the dynamical modeling of the resting-state brain dynamics. We found that the tractography density may strongly affect the graph-theoretical network properties of the structural connectome. The obtained results also show that a dense whole-brain tractography is not always the best condition for the modeling, which depends on a selected brain parcellation used for the calculation of the structural connectivity and derivation of the model network. Our findings provide suggestions for the optimal data processing for neuroimaging research and brain modeling.

Introduction for the brain modeling as well as data analytics. 53

54
The workflow of the current study schedule is illustrated in Fig 1. To evaluate the impact of the WBT density as given by the 55 total number of WBT streamlines and the two considered brain atlases on the quality of the model validation, we first 56 investigated how the properties of the structural connectome get affected when the WBT resolution varies, which might 57 influence the dynamics of the whole-brain models. Then the distributions of the optimal parameters and the maximal 58 similarities between simulated and empirical data were studied for the considered simulation conditions. We applied three 59 criteria for the subject stratification, which provides an insight into the impact of the WBT density on the model performance 60 for individual subjects and suggests optimal configurations of the data processing parameters. 61 Impacts of WBT resolution on structural connectome 62 To evaluate the architecture of the structural networks, we calculated several main graph-theoretical network properties (see 63 Materials and methods) of the empirical SC and PL of individual subjects for all considered conditions of the data 64 processing (6 WBTs for the two atlases) and tested relationships between the network properties and the model performance 65 at the functional model fitting. Fig 2 illustrates the similarities between SC and PL (Fig 2 a and c) and behavior of the 66 weighted node degree, clustering coefficient, betweenness centrality, local and global efficiencies, and modularity over 6 WBT 67 conditions (10K, 50K, 100K, 500K, 2M, and 10M streamlines) for the Schaefer atlas and the Harvard-Oxford atlas (Fig 2 b   68 and d). The similarity of the eSC matrices to the 10M case remains relatively high except for a largest drop at 10K (Fig 2 a1 69 and c1). On the other hand, the PL matrices very quickly deviate from the 10M case, exhibit practically no correlation 70 already for 100K, and are weakly anti-correlated for 10K (Fig 2 a2 and c2). 71 By increasing the number of streamlines from 10K to 10M, the number of network edges increases, and the nodes become 72 densely connected, which results in monotonically increasing average binarized (discarded weights of edges) node degrees as 73 expected (S1 Fig in Supplementary materials). However, the weighted node degree based on the normalized count matrices 74 (SC divided by its mean) used in model (Eq 1) show relatively stationary behavior across the WBT conditions for dense WBT 75 conditions (Fig 2 b1 and d1 and Table 1 WD). Similar stationary behavior can also be observed for the average betweenness 76 centrality and the global efficiency for dense WBT conditions (Fig 2 b3, b5, d3, and d5 and Table 1 BC and GE). On the 77 other hand, the average clustering coefficient, local efficiency and their variances decrease for dense WBT (Fig 2 b2, b4, d2, 78 and d4 and Table 1 CC and LE), whereas the network modularity monotonically increase (Fig 2 b6 and d6). We performed a 79 non-parametric one-way analysis of variance (Kruskal-Wallis ANOVA) test over the WBT conditions (Table 1). In summary, 80 WBT density modulates the graph-theoretical network properties and results in similar tendencies at the group level through 81 varying WBT resolution for both the Schaefer and the Harvard-Oxford atlases. In particular, the clustering coefficient and 82 the local efficiency are significantly different across the WBT conditions, even between 2M and 10M (Table 1 CC and LE), 83 where very high similarities of SC can be observed (Fig 2 a1 and c1). 84 Impacts of WBT resolution on model fitting 85 The system of coupled phase oscillators (Eq 1) was simulated with varying the global delay τ and the global coupling C, and 86 the similarity (the two model fitting modalities) between sFC and empirical data (eFC and eSC) was calculated and assigned 87 to that parameter value, which resulted in (C, τ )-parameter planes of the model fitting modalities between simulated and plane demonstrating, however, different cluster shapes for the Schaefer and the Harvard-Oxford atlases. We also note here 93 that the latter atlas could lead to a stronger fit between the sFC and eFC, compare Fig 3 a7 and c7. In contrast, in the case 94 of the structure-functional model fitting between sFC and eSC (Fig 3 e-h), both atlases demonstrate a similar range of the 95 correspondence (correlation) between simulated and empirical data, however, the maximal similarity can also be attained for 96 large delay (Fig 3 f and h). The whole-brain tractography (WBT) was generated by an in-house pipeline. Structural connectivity (SC) and average path-length (PL) between brain regions were reconstructed based on a given brain parcellation/brain atlas (6 WBTs and 2 atlases). (b) The empirical BOLD signals were extracted for each brain region from the ICA-FIX preprocessed HCP data, and the empirical functional connectivity (FC) was calculated between BOLD signals by Pearson's correlation coefficient. (c) By using the empirical SC and PL matrices, the whole-brain network was reconstructed. The network nodes representing the brain regions were equipped by the phase oscillators (Eq 1) coupled with the coupling weights (Eq 2) and time delays (Eq 3) extracted from the empirical SC and PL matrices, respectively, and with the natural frequencies extracted from empirical BOLD signals. The model generated simulated BOLD signals used for the calculation of the simulated FC. (d) The simulated FC compared with empirical FC and SC, and the model was evaluated by optimizing its parameters for the best correspondence/fitting between the simulated and empirical data. At this, the impact of the data processing on the model validation was evaluated and described.
During the model validation for individual subjects under the 12 considered conditions (6 WBTs × 2 atlases), we also 98 searched for the optimal model parameter, where the maximal similarity between sFC and empirical data (eFC and eSC) was 99 achieved. The distributions of the optimal parameters are depicted in Fig 3 b, d, f, and h for the two fitting modalities and 100 the two brain atlases. In agreement with this results, the best fit between sFC and eFC is attained for small delays (Fig 3 b 101 and d), whereas the strongest structure-function correspondence between sFC and eSC can also be observed for large delays 102 (Fig 3 f and h). In the latter case, the parameter distributions apparently demonstrate a two-cluster shape of small and large 103 delays, which is addressed in detail below.

104
Together with the optimal model parameters for individual subjects, we also collected the corresponding maximal 105 similarities between the simulated and empirical data, which are illustrated in Fig 4 for the 12 simulated conditions and for 106 the two fitting modalities of the correspondence between sFC and eFC (Fig 4a) and between sFC and eSC (Fig 4b). Results 107 Variations of the network properties (1: average weighted node degree, 2: average clustering coefficient, 3: average betweenness centrality, 4: average local efficiency, 5: global efficiency, and 6: modularity) versus the streamline number. In each plot the thin gray lines depict the behavior of the illustrated quantities for individual subjects together with the box plots, where the red marks, blue boxes and red pluses indicate the medians, the interquartile ranges, and the outliers, respectively.  ), and the abbreviations in the upper rows denote the network properties. WD: average weighted node degree, CC: average clustering coefficient, BC: average betweenness centrality, LE: average local efficiency, GE: global efficiency, and MQ: modularity Q. of the functional model fitting in all conditions (Fig 4a) were not from the normal distributions, where the null hypothesis 108 was rejected by χ 2 goodness of fit test with p < 0.05. In the case of the structure-functional model fitting (Fig 4b), the 109 Fig 3. Parameter planes and the distributions of the optimal model parameters (C, τ ) for the two model fitting modalities between simulated and empirical data. Parameter planes are averaged (1-6) over all subjects (n = 351) separately for any of the simulation conditions (WBT resolutions and atlases) and (7) also over all considered WBT resolutions as indicated in the plots. The correspondence between the simulated and empirical data was calculated between (a-d) simulated FC and empirical FC and (e-h) simulated FC and empirical SC for (a, b, e, f) the Schaefer atlas and (c, d, g, h) the Harvard-Oxford atlas. The Pearson correlation between the connectivity matrices is depicted by color ranging from small (blue) to large (red) values. (b, d, f, h) Distributions of the optimal model parameters of the best model fitting calculated for all individual subjects and simulation conditions. conditions of 500K, 2M, and 10M WBT streamlines for the Schaefer atlas and 100K, 500K, 2M, and 10M streamlines for the 110 Harvard-Oxford atlas were also not from the normal distributions. Therefore, Kruskal-Wallis test was used for testing 111 significant difference in all conditions. Consequently, we performed Wilcoxon signed rank one-tail test to evaluate whether the 112 maximal similarities between the simulated and empirical data for one condition are significantly higher or lower than those 113 for the other conditions (see p values in Fig 4). For the functional model fitting (sFC versus eFC) and the Schaefer atlas (Fig 114  4a, blue violins), the models with 2M and 10M WBTs performed better than with the other WBTs, and the performance of 115 the model decreased when the number of streamlines decreased. On the other hand, the functional model fitting for the 116 Harvard-Oxford atlas revealed the optimal condition at 50K or 100K WBT (Fig 4a, orange violins). Furthermore, the model 117 could fit better to eFC for the Harvard-Oxford atlas, which was also observed in Fig 3. For the structure-functional model 118 fitting (sFC versus eSC), the situation is different, where 2M or 10M WBTs are preferable for the strongest correspondence 119 between the simulated and empirical data for both atlases demonstrating approximately similar extent of the maximal model 120 fitting (Fig 4b, see also Fig 3). The results of the pairwise comparisons between the conditions (Wilcoxon signed rank one-tail test) are also indicated with the corresponding p-values in the cases of statistically significant differences (Bonferroni corrected p < 0.05). Notation "SCH" on the horizontal axes is for the Schaefer atlas, and "HO" is for the Harvard-Oxford atlas, for instance, "SCH 10M" stands for the condition of Schaefer atlas with 10M streamlines. For the box plots the red marks, blue boxes and red pluses indicate the medians, the interquartile ranges, and the outliers, respectively.
Relationships between network properties and the functional model fitting 122 As discussed above, the WBT density modulates the structural connectome. Consequently, it can also influence the dynamics 123 of the model (see parameter planes over the simulation conditions in Fig 3). In this section, we investigate the effects of the 124 graph-theoretical network properties modulated by WBT density on the model performance.

125
For each of the considered 10 network properties, we tested the relationships between their values and the maximal 126 similarity between sFC and eFC as given by the Pearson's correlation across 6 WBT conditions for each individual subject. 127 The considered network properties demonstrate a pronounced agreement with the maximal goodness-of-fit values at the level 128 of individual subjects (Fig 5 a1 and b1), where the distributions of the correlation coefficients are significantly shifted from 129 zero (except for the averages of weighted node degree, the standard deviations of betweenness centrality, and global 130 efficiencies for the Schaefer atlas and the standard deviations of betweenness centrality for the Harvard-Oxford atlas). Relationships between the network properties and the results of the functional model fitting (maximal similarity between sFC and eFC) for individual subjects for (a) the Schaefer atlas and (b) the Harvard-Oxford atlas. (a1, b1) Distributions of the Pearson's correlation coefficients calculated across 6 WBT conditions for individual subjects between a given network property indicated on the horizontal axes and the maximal goodness-of-fit values. The gray dots represent the values for individual subjects, and the box plots illustrate the medians (red lines), the interquartile ranges (blue boxes) and the outliers (red pluses). The asterisks below the x-axes labels indicate statistically significant differences in the maximal goodness-of-fit values between the two subgroups of subjects with positive and negative correlations (p < 0.05 of two-sample one-tail t-test). (a2, b2) The results of the functional model fitting versus different numbers of the WBT streamlines for the two subject subgroups of pattern 1 and pattern 2 as indicated in the legends based on the statistically significant split of the subjects for the network properties marked by asterisks in plots a1 and b1, see the Materials and methods and text for details. The error bars indicate the standard error, and the asterisks denote the simulation conditions, where the pattern 1 and 2 exhibit significantly different extend of the similarity between simulated and empirical data.
The distributions of the correlation coefficients for the Harvard-Oxford atlas (Fig 5 b1) less deviate from zero than those 135 for the Schaefer atlas (Fig 5 a1) indicating a complex relationship between the network properties and the maximal respectively. Such an intersection of the subgroups resulted in a stratified pattern 1 containing 173 subjects complemented by 154 the others, i.e., 178 subjects into pattern 2. We applied the pattern 1 and 2 splitting for each atlas for the first criterion of the 155 stratification analysis. 156 We found that the two patterns of the split subjects subgroups demonstrate significantly different quality of the 157 goodness-of-fit of the model depending on the WBT conditions ( Fig 5 a2 and b2). We also found that the patterns 1 and 2 158 exhibit different behavior of the maximal goodness-of-fit values when the WBT resolution varies. In particular, for the 159 Schaefer atlas the quality of the model validation monotonically increases for both patterns for higher WBT resolution ( Fig 5 160 a2), whereas for the Harvard-Oxford atlas, the pattern 2 apparently demonstrates a non-monotonic behavior with an optimal 161 point at 50K of the WBT streamlines (Fig 5 b2). χ 2 goodness of fit test was used to test for a normal distribution for each 162 condition of pattern 1 and pattern 2, and the Wilcoxon rank sum one-tail test was then used for a non-parametric test of the 163 difference between the patterns if it is rejected by the χ 2 test. Otherwise, two-sample one-tail t-test was used for comparing 164 normal distributions of pattern 1 and pattern 2. We also tested the changes of the goodness-of-fit of the model for each 165 pattern when the WBT resolution varies by using Wilcoxon signed rank test. As a result, for the Schaefer atlas, 500K or more 166 streamlines of the pattern 1 and 2M or more streamlines of the pattern 2 showed significantly higher goodness-of-fit values 167 than for the sparser WBT conditions. For the Harvard-Oxford atlas, 100K or more streamlines of the pattern 1 showed 168 significantly higher goodness-of-fit values than sparser WBT conditions. However, 50K streamlines of the pattern 2 is the 169 optimal condition that shows significantly higher correspondence between the simulated and empirical data than for the other 170 conditions.

171
Based on the presented results, we can conclude that the optimal number of the WBT streamlines should be considered 172 large (∼500K-10M) for the Schaefer atlas (Fig 5 a2). Interestingly, the best goodness-of-fit of the model for the 173 Harvard-Oxford atlas can be reached for much sparser WBT at ∼50K streamlines for more than 50% of the subjects (Fig 5 174 b2).

175
Effects of time delay on model validation 176 Based on the clustered distributions of the optimal model parameters of the maximal structure-function similarity between 177 sFC and eSC (Fig 3 f and h), we divided the optimal parameter points and the corresponding subjects into two clusters (Fig 178  6). In such a way, the cluster of parameter points with small delay (cluster 1) was split from the other points characterized by 179 relatively large delay (cluster 2) based on their bimodal distributions (  These results also establish a connection between the two fitting modalities and the time delay, where the impact of the 187 latter was not observed in the distributions of the optimal parameters of the functional similarity between sFC and eFC (Fig 188  3 b and d) and can only be revealed by mediation of the structure-function correspondence. The broad distributions of global 189

194
When the number of the WBT streamlines varies, subjects may exchange their membership in the two clusters ( Fig 6, the 195 vertical alluvial plots). Interestingly, for the Schaefer atlas, the ratio of subjects in the two clusters is gradually changing 196 when WBT is getting sparser (from 10M to 10K), where more and more subjects move to cluster 1 approximately balancing 197 the subgroup sizes at 10K case (Fig 6a, the alluvial plot). In contrast, there are only small exchanges of the subject members 198 between clusters for the Harvard-Oxford atlas keeping the group sizes approximately constant for all WBT conditions (Fig 6b, 199 the alluvial plot), where cluster 2 contains most of the subjects as is for the Schaefer atlas for the case of 10M of the WBT 200 streamlines. We used the splitting of the subjects into the discussed two clusters as the second criterion of the stratification 201 analysis.

202
WBT-induced changes of model performance 203 In the previous sections we observed that the behavior of the maximal goodness-of-fit versus the WBT conditions is not akin 204 to each other for different atlases. For example, for the Schaefer atlas, the maximal similarity between sFC and eFC 205 monotonically increases when the WBT is getting denser (Fig 5 a2), and the corresponding curve had a positive slope. On the 206 other hand, the maximal goodness-of-fit for the Harvard-Oxford atlas may exhibit a non-monotonic behavior and attained the 207 maximal values at 50K WBTs (Fig 5 b2). We therefore explicitly searched for such a divergent dynamics, looked for the subjects with the best model performance for the most sparse or the most dense WBT, and divided the subjects into two subject population with the positive slope (n = 248) and the negative slope (n = 103) (Fig 7b). On the other hand, the 216 subjects split very unevenly for the Schaefer atlas, and most of them (n = 339) exhibited positive slope, where the similarity 217 between simulated and empirical data monotonically increases when the number of streamlines increases (Fig 7a). . The latter is based on the behavior (positive/negative slopes) of the maximal similarity versus the WBT conditions (see text for details) as indicated in the legends, where the number of subjects in each group is also pointed out. The asterisks indicate the statistically significant differences between the two subject groups (p < 0.05, two-sample one-tail t-test for normal distributions and Wilcoxon rank sum one-tail test for non-parametric test).
Each split subgroup was tested for a normal distribution by χ 2 goodness of fit test over WBT densities. The null 219 hypothesis of the χ 2 test for the Schaefer atlas was rejected for each subgroup and each condition, however, in the case of the 220 Harvard-Oxford atlas, the null hypothesis was not rejected. Consequently, we performed Wilcoxon signed rank test for the 221 Schaefer atlas and two-sample paired t-test for the Harvard-Oxford atlas. As a result, for the Schaefer atlas, the subgroup 222 with the positive slope of the 2M or more streamlines WBT conditions showed significantly higher goodness-of-fit of the 223 model than sparser WBT conditions (Fig 7a, red curve). In the case of the Harvard-Oxford atlas, the subgroup with the 224 positive slope showed significantly higher goodness-of-fit of the model with 100K or more WBT streamlines than sparser 225 WBT conditions (Fig 7b, red curve). On the other hand, the the subgroup with the negative slope showed significantly higher 226 goodness-of-fit of the model with 50K or less streamlines WBT conditions than denser WBT conditions (Fig 7b, blue curve). 227 We applied the discussed approach based on the model performance as the third criterion of the stratification analysis. As investigated in the previous sections, the entire subject population can be split into two groups based either on the two 230 patterns of the relationships between network properties and the functional model performance (Fig 5), the two clusters based 231 on the distributions of the optimal parameters of the maximal similarity between sFC and eSC (Fig 6), and the behavior conditions (Fig 7). By combining all three approaches, we illustrated a stratification result in the alluvial plots in Fig 8, which 234 show the proportions of the stratified subjects when the above stratifying criteria are consequently applied to the entire 235 subject population for each atlas. The stratified subjects in Fig 8 a1 and b1 show different extent and behavior of the maximal goodness-of-fit values of the 237 functional model fitting over the WBT conditions (Fig 8 a2 and b2). The first criterion provides a way to explain the changes 238 of the fitting results across WBT densities for individual subjects involving certain properties of the empirical structural 239 networks (Fig 5). According to the first criterion, we can expect that the subjects from pattern 2 will have the maximal 240 model performance for sparse WBTs for the Harvard-Oxford atlas (Fig 5 b2).

241
The second stratification step in Fig 8 a1 and b1, reflects the interchanging behavior between the parameter clusters 242 observed in Fig 6, which is notable for the Schaefer atlas, whereas it is not prominent for the Harvard-Oxford atlas. This is 243 also observed in Fig 8 a1 and b1, where the overwhelming majority of subjects for the latter atlas were sorted to the group of 244 persistent members of cluster 2, i.e., the split subgroup with large delay for the structure-functional model fitting. Finally, the 245 third criterion practically does not differentiate the subjects into positive and negative slopes for the Schaefer atlas, which is 246 also true for pattern 1 in case of the Harvard-Oxford atlas. However, the subjects in pattern 2 for the Harvard-Oxford atlas 247 can still be split into two subgroups with the inclining and the declining curves of the maximal goodness-of-fit by the third 248 criterion, which can further refine the differentiation of subjects of the best model performance at sparse WBT density (Fig 249  8b2, stratified groups 2 and 3, see also Fig 5). 250 The declining curves of the goodness-of-fit when the number of the WBT streamlines decreases imply that the optimal 251 number of the total streamlines for the simulation with the Schaefer atlas should be considered large, for example, more than 252 December 2, 2020 12/25 500K: 2M or 10M of the WBT streamlines (Fig 8a2). The model evaluation with the Harvard-Oxford atlas show different 253 optimal conditions than that for the Schaefer atlas (Fig 8b2). The optimal streamline number may depend on the 254 stratification subgroups to which the subject belongs, and which exhibited very different behavior of the goodness-of-fit when 255 the number of streamlines varied (Fig 8b2). For example, the optimal number of streamlines for a better model performance 256 could range from 10M to 100K for the subjects from subgroup 1 in Fig 8b2 (solid red curve). On the other hand, for more 257 than 20% of subjects (n = 80) of the entire subject population, i.e., for those from the stratified group 3 (Fig 8b2, dashed   258 blue curve), the optimal conditions are at ∼50K WBT streamlines, and more streamlines may lead to the degradation of the 259 quality of the model validation. For other 18% of subjects (n = 66, group 3 in Fig 8b2, solid blue curve) a sparse WBT can 260 also be a reasonable option.

262
The purpose of the current study was to explore how the processing of the neuroimaging data can influence the dynamics and 263 validation of the whole-brain mathematical dynamical models informed by the empirical data. We considered several 264 simulation conditions based on varying parameters of data processing, such as the number of total streamlines (or fibers in the 265 brain white-matter) of WBT and brain atlases. While the latter defined how the brain is parceled into several brain regions 266 that are considered as network nodes in the model, the former influenced the underlying SC and PL used for the calculation 267 of the coupling weights and time delays in the coupling between nodes. We discussed how the WBT resolution can influence 268 the structural information fed to the model and the corresponding modeling results for the considered brain atlases. We found 269 that the parcellation with different atlases showed similar changes of the architecture of the structural networks, but distinct 270 trends of the goodness-of-fit of the model to the empirical data across the numbers of WBT streamlines. Consequently, we 271 suggested optimal configurations of the considered data and model parameters for the best model fit at the group level as well 272 as for personalized models of individual subjects based on the properties of the empirical and simulated data.

273
The applied model-based approach followed the line of research suggested and developed in many modeling studies, see, for 274 example, the papers [18,19,23,33,[35][36][37] and references therein, where the potential of the whole-brain dynamical models to 275 explain the properties of the brain dynamics and structure-function relationship was demonstrated by a detailed investigation 276 of the correspondence between empirical and simulated brain connectomes. At this, the patterns of the underlying structural 277 network connectivity as related to the inter-node coupling strengths and delays can play a crucial role for observing a 278 pronounced structure-function agreement [39,40]. It is thus important to extract the empirical SC and PL used for evaluation 279 of parameters of the model connectivity as plausible as possible in order to obtain biologically realistic modeling results [41]. 280 Evaluating structural architecture for modeling 281 As discussed in Fig 2 and Table 1, the variation of the WBT streamline number affects the properties of the model networks 282 calculated from the structural connectome, especially, the PL matrices, where the edges with relatively small numbers of 283 streamlines are sensitive to reducing the total number of tracking trials. Therefore, SC extracted from relatively sparse WBT 284 with small number of streamlines may not guarantee a higher reproducibility with stable network properties, where some 285 edges will be disconnected or reconnected from time-to-time, when streamlines will be generated. 286 Within the framework of the modeling approach, the model parameters can be varied in a broad range and sense to 287 evaluate their impact on the simulated dynamics. As related to the discussed network topology, beyond the variation of the 288 global coupling strength, the network edges approximating the anatomical connections between brain regions can be removed 289 to obtain a better fit between simulated and empirical FC of schizophrenia patients [42]. Aiming at the best correspondence 290 between simulated and empirical data, new inter-region anatomical connections were allowed to be created, or existing 291 structural connections to be rewired according to algorithms based on the differences between the simulated and empirical FC 292 including the gradient-descent method [43,44]. The model connectivity can be composed of both empirical SC extracted from 293 dwMRI data and local intra-cortical connections incorporated into the model based on the distance-dependent 294 approximations [16]. Among many possible ways of SC variation for the best model fitting, which might also require 295 additional justifications, we propose to stay within the framework of realistically extracted signals from dwMRI data and 296 consider the well-established approaches for the data processing. In this study, we used state-of-the-art techniques for 297 calculation of WBT and SC [45] and investigated the impact of a constructive parameter for the structural connectome, the 298 number of extracted streamlines on graph-theoretical measures of SC, and their influence on the modeling results.
Effects of different parcellations on the modeling 300 For the extraction of the brain structural and functional connectomes and for setting up the model network, we used two 301 paradigmatically distinct brain atlases as represented by the Schaefer atlas [46] that is based on functional MRI data, and the 302 Harvard-Oxford atlas of anatomy-related parcellation [47] that is based on the landscape of gyri and sulci on the cortical 303 surface. We found that the graph-theoretical properties of the structural networks built based on these two parcellations and 304 used in the model are changing with similar tendencies across the considered WBT conditions for the both atlases (Fig 2 and 305 Table 1). Some of the considered network properties exhibit high sensitivity to the variations of the WBT density, for 306 example, the clustering coefficient (CC) or the local efficiency (LE), see Table 1. On the other hand, the weighted node 307 degree (WD) or the global efficiency (GE) manifested significant changes only when the number of the calculated WBT 308 streamlines was decreased from 10M to 100K or 50K, i.e., 100-200 times, and the sensitivity was stronger in the case of the 309 Schaefer atlas. These findings might be of importance when the discussed network properties influence the modeling results. 310 We also found that the mentioned network metrics (CC and LE) with sensitive dependence on the WBT density strongly 311 anti-correlate with the goodness-of-fit of the model for the Schaefer atlas, while the relationship is in average less pronounced 312 for the Harvard-Oxford atlas (Fig 5 a1 and b1). Furthermore, the situation is reversed for the insensitive network properties 313 (WD and GE), where the correlation is more enhanced for the Harvard-Oxford atlas. Given the impact of the WBT density 314 on the properties of the structural networks (Fig 2), this may explain the clear monotonic behavior of the goodness-of-fit for 315 the Schaefer atlas versus the number of streamlines (Fig 5a2) and apparently mixed behavior for the Harvard-Oxford atlas 316 (Fig 5b2). In summary, some of the network metrics are characterized by different relationships with the results of the model 317 validation for the varying number of the WBT streamlines between the parcellations, see also S3 Fig and S4 Fig in   318 Supplementary materials for the relationships of all considered network properties. Therefore, the tractography density 319 modulates the graph-theoretical network properties in similar changes for the considered atlases, however, it influences the 320 dynamics of mathematical models in different ways, in particular, depending on the used brain parcellation.

321
Role of time delay in the modeling 322 We also reported on the differences with the distributions of the optimal model parameters for the structure-functional model 323 fitting sFC-eSC and their behavior for the two considered brain atlases, where a parameter clustering with respect to small 324 and large delay in the coupling was observed (Fig 6). More detailed investigation revealed an apparent migration of the 325 optimal parameter points towards the parameter cluster of small delay when the number of streamlines decreased for the 326 Schaefer atlas. On the other hand, no such parameter (and subject) flow was detected for the Harvard-Oxford atlas, and the 327 parameter distribution remained relatively stable for any of the considered numbers of streamlines. Such a behavior of the 328 optimal parameters for the two considered atlases might be related to the performance of the model validation at the group 329 level when the WBT resolution varies, which can suggest a possible mechanism associated with the impact of time delay in 330 coupling on the model fitting results. Indeed, we observed that subjects from the parameter cluster with large delay 331 demonstrated better quality of the model validation for both functional and structure-functional model fittings (Fig 6 and S6 332  Fig). In other words, if the optimal parameters for the maximal sFC-eSC correspondence have a large delay, we might expect 333 a better correspondence between sFC and eFC. Accordingly, we might also expect that the group-averaged maximal 334 goodness-of-fit for the Schaefer atlas will decay faster than that for the Harvard-Oxford atlas when the streamline number 335 decreases as observed in Fig 4, because fewer optimal parameter points with large delay can be found for a sparser WBT for 336 the former atlas. It is also important to observe that the structure-functional similarity between the empirical connectomes 337 eFC and eSC used for the model derivation and validation exhibited weak opposite relationships between parameter clusters 338 and across the number of the WBT streamlines as compared to the correspondence between simulated and empirical data (see 339 S6 Fig a1 and b1 in Supplementary materials). This indicated a nontrivial character of the reported results that do not 340 directly follow from the empirical structure-function correspondence.

341
It is interesting to note here that the best agreement between simulated and empirical functional data (sFC and eFC) was 342 attained for the considered model at small (zero) delays (Fig 3). It is therefore safe to consider such a type of model 343 simulating ultra slow BOLD dynamics without delay in coupling [35,36,44]. However, if other fitting modalities are involved, 344 the delay in coupling can play an important role, for instance, for the structure-functional (sFC-eSC) model fitting. The latter 345 can be used to mediate the impact of non-zero delay on the functional model validation, which also plays an important role 346 for the higher correspondence between simulated and empirical functional data (Fig 6). The values of the optimal non-zero 347 delays for the latter fitting modality can be influenced by the natural frequencies of oscillators (Eq 1) demonstrating relatively 348 strong negative correlations with the structure-functional model fitting as illustrated in S7 Fig and S8 Fig in Supplementary 349 materials. Consequently, the optimal speed of the signal propagation in the brain as revealed by the modeling results can be 350 regulated by the mean intrinsic time scale of oscillatory activity of individual brain regions.

Stratification analysis and the optimal condition 352
The problem of the optimal number of the total WBT streamlines was also addressed in this study beyond the group-level 353 analysis and aimed at the best fitting of the personalized models for individual subjects. To investigate the impact of the 354 WBT resolution at the level of individual subjects, we stratified the entire subject population into smaller subgroups with 355 more homogeneous (heterogeneous) model dynamics within (between) subgroups. One of the stratification approaches is to 356 show the effect of the graph-theoretical network properties modulated by the WBT density on performance of the model. We 357 found that such correlations for individual subjects are well-pronounced for the Schaefer atlas, but they are relatively less 358 expressed for the Harvard-Oxford atlas (Fig 5 a1 and b1). Nevertheless, the stratification can be designed by combining the 359 splitting results for network properties, which resulted in a clear differentiation of the impact of the WBT streamline number 360 on the model validation across stratified subgroups and brain parcellations (Fig 5).

361
Another approach to stratification of the subjects was based on the global delay parameter clustering for the 362 structure-functional model fitting discussed above and can provide an informed view of the validation results for the 363 functional model fitting (Fig 6). One more stratification approach is illustrated in Fig 7, where the subjects were split into 364 two subgroups of qualitatively different individual behavior of the goodness-of-fit versus the streamline number. Based on the 365 obtained results, we can propose to use the large number (∼2M-10M) of the WBT streamlines for the best functional model 366 validation, if the Schaefer atlas was used for the brain parcellation. On the other hand, the recommendation is completely 367 opposite for more than 20% of subjects for the brain parcellation based on the Harvard-Oxford atlas (Fig 8 b2, blue dashed 368 curve 3). For such subjects, the large number of streamlines can lead to a lower quality of the model fitting as compared to 369 rather sparse WBT containing, for example, only 50K streamlines. Differentiating the subjects according to the discussed 370 stratification criteria can help to design an individual data processing workflow and configurations of parameters for the 371 optimal personalized modeling of the brain dynamics. In particular, based on the obtained results, we can suggest a 372 personalized optimal number of the WBT streamlines for the considered brain parcellation for the better model performance 373 at the modeling of the resting-state brain dynamics.

374
Summary and conclusion 375 We found that varying number of total streamlines for WBT affects the network properties of the structural connectome and 376 performance of the mathematical modeling of the resting-state brain dynamics. The results showed that a dense WBT is not 377 always the best condition for the whole-brain mathematical modeling represented by a system of interacting oscillators with 378 time delay in coupling. We also demonstrated that the optimal parameters of the data processing may be affected by the 379 utilized brain parcellation that should be taken into account already at early steps of the data processing workflow. The 380 present study did not aim to provide any quantitative conclusion concerning the optimal number of WBT streamlines, but 381 rather to illustrate possible qualitative effects caused by the varying WBT density on the structural connectome and modeling 382 results in combination with functional and anatomical brain parcellations. Our results can contribute to a better 383 understanding of the interplay between the data processing and model parameters and their influence on data analytics of 384 dwMRI and modeling of the resting-state fMRI data.

386
The current study considered 351 unrelated subjects (172 males, age 28.5 ± 3.5 years) from the Human Connectome Project 387 (HCP) S1200 dataset [48]. HCP data (https://www.humanconnectome.org/, [48]) were acquired using protocols approved by 388 the Washington University institutional review board (Mapping the Human Connectome: Structure, Function, and 389 Heritability; IRB #201204036). Informed consent was obtained from subjects. Anonymized data are publicly available from 390 ConnectomeDB (db.humanconnectome.org). 391 We reconstructed SC and PL by using six WBT densities and two atlases for individual subjects, then calculated 392 simulated FC from simulated signals by the computational model composed of coupled phase oscillators with delayed coupling. 393 We explored the two free parameters of the modeling for each subject and condition, and validated the model through the two 394 model fitting modalities. We also calculated graph-theoretical network properties of SC and PL over considered conditions 395 and compared the network properties with the maximal goodness-of-fit of the model. The individual subjects were stratified 396 into groups of different network properties and modeling results.

397
Preprocessing of MRI data and connectivity extraction 398 The current study used an in-house pipeline for the extraction of SC and PL matrices from the diffusion-weighted images 399 (DWI). The pipeline consists of four modules: preprocessing of MRI and DWI data, WBT calculation, atlas transformation 400 and connectivity reconstruction. It was optimized for parallel processing on high-performance computational clusters [49].

401
The pipeline was created with functions of Freesurfer [50], FSL [51], ANTs [52], and MRtrix3 [45]. Freesurfer was used for 402 processing the T1-weighted image (T1) including bias-field correction, tissue segmentation, cortical (surface) reconstruction, 403 volume-surface converting, and surface deformation for parcellation as well as for the correction of the eddy-current distortions 404 and head-motion in DWIs using the corresponding b-vectors and b-values. MRtrix3 performed de-noising and bias-field 405 correction on the DWIs. The pre-processed images were used for co-registration between the T1 and the DWIs and linear and 406 non-linear transformation by functions of FSL. Linear and non-linear transformation matrices and images for registration 407 from the standard MNI space to the native space and vice versa were estimated. Through the image registration, gray matter, 408 white matter, cortical/subcortical, cerebellar, and cerebrospinal fluid masks were generated in the native DWI space.

409
The WBT calculation module included only MRtrix3 functions, where the response functions for spherical deconvolution 410 were estimated using multi-shell-multi-tissue constrained deconvolution algorithm [53], fiber oriented distributions (FOD) 411 were estimated from the DWIs using spherical deconvolution, and the WBT was created through the fiber tracking by the 412 second-order integration over the FOD by a probabilistic algorithm [54]. In the latter step, we used six different numbers of 413 total streamlines for varying WBT density: 10K, 50K, 100K, 500K, 2M, and 10M, where the "K" and "M" letters stand for 414 thousand (Kilo-) and million (Mega-), respectively. The rest of the tracking parameters were set as default values of tckgen 415 function from MRtrix3 documents [55]. 416 The atlas transformation module applied the linear and non-linear transformation matrix and images to atlases that were 417 sampled in the standard MNI space. We used Schaefer atlas with 100-area parcellation [46] and Harvard-Oxford atlas with 96 418 cortical regions [47]. After the transformation, the labeled voxels on the gray matter mask were selected for a seed and a 419 target region. Consequently, the tck2connectome function of MRtrix3 reconstructed SC and PL (count and path-length 420 matrices in Fig 1a).

421
For the empirical functional connectivity (FC), the BOLD signals were extracted from the resting-state fMRI data 422 processed by ICA-FIX as provided by HCP repository [56]. The Schaefer atlas and the Harvard-Oxford atlas were applied for 423 the parcellation of the processed fMRI into brain regions within the standard MNI 2 mm space (6th-generation in FSL).

424
Empirical FC was calculated using Pearson's correlation coefficient across BOLD signals extracted as mean signals of the 425 parceled brain regions. There were four resting-state fMRI sessions (1200 volumes, TR = 720 ms) which consist of two 426 different phase-encoding directions (left and right) scanned in different days. In addition, a concatenated BOLD signal was 427 generated by using all four z-scored BOLD signals from the above four fMRI sessions, which results in five empirical FCs 428 calculated for BOLD signals from the four fMRI sessions and the concatenated BOLD signals for each subject. Finally, 12 429 simulation conditions (6 WBTs × 2 atlases) were tested by simulation of the mathematical whole-brain model, where the 430 model parameters were optimized for the best fit between simulated and empirical data.

431
Mathematical whole-brain model 432 We simulated a whole-brain dynamical model of N coupled phase oscillators [19,32,38] The number of oscillators N corresponds to the number of brain regions parceled as defined by a given brain atlas, where 434 the phase ϕ i (t) models by sin (ϕ i (t)) the mean BOLD signal of the corresponding region. C is a global coupling which scales 435 the level of couplings of the whole-brain network. η i is an independent noise perturbing of oscillator i, which is sampled from 436 a random uniform distribution from the interval [-0.3,0.3]. The natural frequencies f i were estimated from the empirical data 437 as frequencies of the maximal spectral peaks (restricted to the frequency range from 0.01 Hz to 0.1 Hz) of the empirical BOLD signals of the corresponding brain regions. k ij stands for the coupling strength between i th and j th oscillators, and τ ij 439 approximates the time delay of the signal propagation between i th and j th oscillators. They were calculated from the 440 empirical SC and PL and determined by the following equations: where w ij is the number of streamlines between i th and j th parceled regions and W is an averaged number of 442 streamlines over all connections except self-connections, and where τ is a global delay (unit: s/m) which is a reciprocal of an average speed of signal propagation V through the 444 whole-brain network. The time step of the numerical integration of Eq 1 was fixed to 0.04 s, and the simulated signals were 445 generated for 3500 seconds after skipping 500 seconds of the transient. The simulated BOLD signals and the corresponding 446 simulated FCs were calculated from the phases downsampled to TR = 0.72 s, which is the repetition time of HCP fMRI.

447
The considered mathematical model Ep 1 has two main free parameters: the global coupling C and the global time delay 448 τ . The global coupling ranged from 0 to 0.504 in evenly discretely distributed 64 values, and the global delay was from 0 to 449 423 s/m in evenly discretely distributed 48 values. Therefore, 3072 (64 × 48) simulations were performed for each subject to 450 calculate the simulated FCs that were compared with empirical functional and structural data for each simulation condition. 451 A total of 12,939,264 (64 × 48 × 12 × 351) simulations of model (Eq 1) were performed in this study for 351 subjects with 12 452 conditions (6 WBTs × 2 atlases). We explored the 2-dimensional (64 × 48) model parameter space and found the optimal 453 parameter values for the best correspondence between simulated and empirical data. The correspondence was calculated by 454 Pearson's correlation coefficient between simulated FC (sFC) and empirical FC (eFC) and SC (eSC) depending on the model 455 fitting modality. For each subject and simulation condition, 5 parameter planes of the functional model fitting modality 456 (correlation between sFC and eFC as a goodness-of-fit of the model) were obtained corresponding to 5 eFCs. In addition, one 457 parameter plane of the structure-functional model fitting modality (correlation between sFC and eSC) was also calculated.

458
From each parameter plane, we selected the optimal (C, τ )-parameter point, where the maximal correlation between the 459 simulated and the empirical data was reached.

460
Effects of different WBT conditions 461 We revealed the effects of the varying WBT density on the modeling results by evaluating its impact on 1) the 462 graph-theoretical network properties of empirical structural connectome, 2) patterns of the optimal model parameters in the 463 model parameter space, and 3) model performance as given by the quality of the model fitting over simulation conditions.

464
Based on the results from the three approaches, we introduced three criteria (see below) for differentiation of the influence of 465 the WBT density on the modeling results for individual subjects. To do this, we stratified the entire subject population by 466 splitting it into several subgroups according to the mentioned criteria based on (i) the relationships between the network  To investigate the impact of the varying WBT density on the architecture of structural networks, we calculated 473 graph-theoretical network properties from SC and PL for each subject, WBT condition, and atlas. The considered 10 network 474 properties (8 local properties and 2 global properties) included the weighted node degree, clustering coefficient, betweenness 475 centrality, local efficiency, global efficiency, and modularity, which were calculated by the brain connectivity toolbox in subgroups with significantly higher model performance for 10M, 2M, 500K, or 100K WBT streamlines. Then the subjects in 486 an intersection of the selected subgroups constituted a group referred to as pattern 1, and the rest of them were referred to as 487 pattern 2.

488
Impact of time delay on the model fitting 489 For the second stratification criterion, the optimal model parameters of the maximal correspondence between sFC and eSC 490 were divided into two clusters as suggested by the their bimodal distribution splitting small and large values of the optimal 491 time delay (Fig 6). Since subjects can move between the parameter clusters when the total number of the WBT streamlines 492 varies from 10M to 10K, we separated the subjects into five classes: Always staying in cluster 1 (From 1 to 1) or in cluster 2 493 (From 2 to 2), only once moving either from cluster 1 to cluster 2 (From 1 to 2) or in opposite direction (From 2 to 1), and 494 performing multiple changing between the two clusters (Multiple). This approach based on the distribution of the optimal 495 model parameters was used as the second criterion for the stratification of subjects.

496
Variation of the model performance

497
The last stratification criterion is based on the behavior of the optimal goodness-of-fit values when the number of WBT 498 streamlines varies. To quantify it, we calculated the increment of the maximal similarity between sFC and eFCs matrices of 499 the concatenated session for every individual subject when the number of the WBT streamlines increases from 10K to 10M. 500 Then, all subjects were divided into two subgroups exhibiting either positive or negative slopes (increments) of the 501 goodness-of-fit behavior versus the number of WBT streamlines (Fig 7). According to this criterion, the subject were 502 stratified into two subgroups demonstrating the best functional model fitting for either maximal or minimal number of the 503 WBT streamlines considered. Consequently, we used all three criteria for the three-step stratification analysis (Fig 8). (a, b) p-values of the two-sample one-tail t-test for the differences of the model fitting between the two subject subgroups split based on the positive or negative values of the correlation coefficients whose distributions are illustrated in Fig 5 a1 and b1. The corresponding network properties and the number of WBT streamlines are indicated on the horizontal and vertical axes, respectively. The black squares in the tables indicate significant results with p < 0.05. Red cells mean the subgroup of positive correlations showed higher goodness-of-fit of the model than the subgroup of negative correlations. Blue cells mean the subgroup of negative correlations showed higher goodness-of-fit of the model than the subgroup of positive correlations. Abbreviations of the network property names are as in Fig 5. The plus (+) or minus (-) sign after the property name in the black boxes, e.g., S.D. WD+, indicates a group of subjects with positive, respectively, negative correlation coefficients between the corresponding network property and the model goodness-of-fit values.
S6 Fig (a1 and b1) Agreements between eFC and eSC for two subgroups by the second criterion of the stratification. (a2 and b2) Agreements between sFC and eSC for two subgroups by the second criterion of the stratification. The subgroup 1 means the cluster 1 and the subgroup 2 means the cluster 2 in Fig 6.