ABSTRACT
The white matter structures of the human brain can be represented via diffusion tractography. Unfortunately, tractography is prone to finding false-positive streamlines, which severely reduces its specificity and limits its feasibility in accurate structural brain connectivity analyses. Filtering algorithms have been proposed to reduce the number of invalid streamlines, but the currently available filtering algorithms are not suitable for processing data that contain motion artefacts, which are typical in clinical research. We augmented the Convex Optimization Modelling for Microstructure Informed Tractography (COMMIT) filtering algorithm to adjust for signal drop-out artefacts that subject motion causes in diffusion-weighted images. We demonstrate with comprehensive Monte-Carlo whole brain simulations and in vivo infant data that our robust algorithm is capable of properly filtering tractography reconstructions despite these artefacts. We evaluated the results using parametric and nonparametric statistics, and our results demonstrate that, if not accounted for, motion artefacts can have severe adverse effects on human brain structural connectivity analyses as well as on microstructural property mappings. In conclusion, the use of robust filtering methods to mitigate motion-related errors in tractogram filtering is highly beneficial, especially in clinical studies with uncooperative patient groups such as infants. With our presented robust augmentation and open-source implementation, robust tractogram filtering is readily available.
Highlights
We present a novel augmentation to a tractogram filtering method that accounts for subject motion related signal dropout artefacts in diffusion-weighted images.
Our method is validated with realistic Monte-Carlo whole brain simulations and evaluated with in vivo infant data.
We show that even if the data has 10% of motion corrupted slices, our method is capable of mitigating their effect on structural brain connectivity analyses and microstructural mapping.
1. Introduction
Diffusion-weighted magnetic resonance imaging (dMRI) of the human brain (Basser et al., 1994) has various applications ranging from early clinical stroke diagnostics (Horsfield and Jones, 2002) to investigations of the microstructural properties of the tissue (Alexander et al., 2019; Novikov et al., 2019) and structural brain connectivity mapping (Griffa et al., 2013; Delettre et al., 2019; Zhang et al., 2021). The latter two are gaining popularity in clinical research (Kamiya et al., 2020) to investigate various brain diseases and neurological conditions of adults (Fieremans et al., 2013; Benitez et al., 2014) and the development of the growing brain in children and adolescents (Genc et al., 2017; Huber et al., 2019). Furthermore, with the latest advances in automatic brain segmentation with tools like Infant FreeSurfer (Zöllei et al., 2020), it is likely that the number of brain connectivity studies of infants (Kunz et al., 2014; Pannek et al., 2018; Pecheva et al., 2019) will grow in the near future too.
Clinical dMRI research comes with its own puzzles to solve, one of the most difficult being patient motion. Subject motion can be unavoidable when imaging infants or patients in discomfort or pain, resulting in complex missing data problems (Andersson et al., 2016; Sairanen et al., 2017, 2018). Therefore, the processing of motion corrupted images requires specialized algorithms and robust methods to minimize motion induced bias in the results. While robust modeling has been considered in the contexts of diffusion and kurtosis tensor estimations (Chang et al., 2005, 2012; Tax et al., 2015) as well as in higher order models (Pannek et al., 2012) that could be used for tractography purposes, it has not been investigated thoroughly in the context of brain structural connectivity analyses.
Structural brain connectivity analyses are based on the rapidly developing dMRI tractography (Basser et al., 2000) algorithms that represent the brain white matter structures with streamlines. These streamlines can be used to investigate which gray matter regions might have a structural link. In general, tractography algorithms are sensitive but lack specificity, and they find a great number of false streamline connections (Thomas et al., 2014; Maier-Hein et al., 2017). This means that two gray matter regions could be linked by tractography streamlines even though the brain tissue does not form a true structural link. This is a known issue in structural connectivity analyses (Drakesmith et al., 2015; Zalesky et al., 2016; Yeh et al., 2020), for which tractogram filtering has been proposed as one solution. Tractogram filtering can be achieved with different approaches, one being the Convex Optimization Modelling for Microstructure Informed Tractography (COMMIT) (Daducci et al., 2015), which we use in this study to demonstrate the possible effects of subject motion on tractogram filtering and microstructural mapping as well as how they can be accounted and corrected for.
There are three alternative post-scan approaches to address the outliers caused by the subject motion. The first approach is to find outliers in dMRI data manually or automatically with statistical methods or deep-learning and simply exclude the artefactual dMRI data or even the whole subject from the analysis (Oguz et al., 2014; Samani et al., 2019). The second approach is to use a model to predict what the measurements should look like, locate the outliers based on differences to model predictions and replace them with these predictions if differences are deemed large enough (Lauzon et al., 2013; Andersson et al., 2016). The third approach is to detect the outliers, but instead of replacing or completely excluding them, their weight is reduced in all subsequent model estimation steps (Sairanen et al., 2018).
Manual outlier detection can be laborious, and excluding whole subjects from clinical studies with a relatively small number of participants might not be the optimal choice. The outlier replacement approach relies on the quality and robustness of the model and method chosen to represent the measured dMRI signal. If multiple dMRI measurements are corrupted by motion artefacts, this initial modeling and prediction step can fail altogether (Sairanen et al., 2018). Even in the best case, the replaced data points are simply interpolations based on the chosen model and the data points used in the modeling. Replacement therefore cannot increase the available information, but it does increase error propagation through the subsequent model fittings. The third approach, in contrast, enables quantifying the amount of motion corrupted data and leaves the subsequent modeling and analysis options open, which makes it optimal for our purposes.
While weighted and robust modeling has been implemented before, it has mostly been used outside the scope of tractogram filtering. For example, in diffusion tensor modeling, weighted linear least squares is typically the fastest and most robust estimator (Veraart et al., 2013; Tax et al., 2015; Sairanen et al., 2018). Robust modeling has been proposed for higher order models as well (Pannek et al., 2012). In the context of tractogram filtering, weighted cost functions have been introduced earlier in e.g. SIFT (Smith et al., 2013, 2015), but they have only been evaluated with voxels affected by partial voluming. The SIFT authors state that their ‘processing mask’ is ‘the square of the estimated white matter partial volume fraction’, which indeed should be beneficial in the case of partial voluming. However, the approach in SIFT does not account for outliers that occur randomly in the measurements, as our newly proposed augmentation to COMMIT does.
In this work, we propose a robust augmentation to the COMMIT filtering algorithm (Daducci et al., 2015) that accounts for the unreliability of the original measurements. We detail the theoretical changes to the algorithm and provide open-source code of its implementation. To evaluate the method, we use data from the Human Connectome Project (HCP) (Van Essen et al., 2013) as a base for thorough Monte-Carlo simulations that emulate various motion induced artefacts in synthetic whole brain data with Rician noise. Synthetic data provide the necessary baseline for highlighting the bias that subject motion causes in structural connectivity analyses and for showing how well this bias can be amended using our robust augmentation.
In the context of this study, the measurement unreliability is associated with outliers due to subject motion. However, the augmentation can readily be utilized to correct for measurements that are affected by partial voluming, as our preliminary results have demonstrated earlier (Sairanen et al., 2021). The proposed augmentation provides the update that tractogram filtering methods need to be practically usable in clinical research.
2. Materials and Methods
2.1. Implementation
We augmented the original cost function of the COMMIT algorithm (Daducci et al., 2015) with a voxelwise weighting factor W that can be used to downweight data points that have decreased reliability due to subject motion or any other reason. The original COMMIT is based on minimizing the difference between the original measurements and a forward model prediction. The forward model prediction is calculated by fitting a chosen microstructural model for each streamline in every voxel. COMMIT assigns a weight to each streamline that tells how much that streamline contributes to the predicted signal. These streamline contribution weights are iteratively updated until the difference between the measurements and this prediction converges to a minimum. Any streamline with a contribution of zero is then removed as an implausible streamline (i.e. not compatible with the measured signal). If the original measurement is artefactual due to subject motion or any other reason, the algorithm could converge to an incorrect solution. To decrease the weight of these artefactual data points, we propose the robust cost function shown in eq. 1. Our proposed idea is further illustrated in Fig. 1 with a simple toy example.
The robust cost function in eq. 1 is intended to be used together with outlier detection tools such as SOLID (Sairanen et al., 2018). SOLID detects slicewise outliers based on robust statistical analysis of the original dMRI data and can be used either to exclude outliers or to downweight them depending on how strong the outliers are. This downweighting scheme is likely a better option than the outlier replacement proposed in earlier studies (Lauzon et al., 2013; Andersson et al., 2016). If an outlier is replaced with a prediction from a tensor or a Gaussian model, then COMMIT would try to minimize the difference between those model predictions and its own model prediction. Since these models can be different and therefore capture different details of the dMRI signal, it is more straightforward to use robust modeling with the proposed weighted cost function.
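The effect of such a weighted cost function can be illustrated with a minimal sketch that solves a toy non-negative least squares problem with and without reliability weights. This is not the COMMIT implementation: the function name `weighted_nnls`, the projected-gradient solver, and the 6 × 3 toy dictionary are assumptions made for this example, and the weights are taken as given rather than computed by SOLID.

```python
import numpy as np

def weighted_nnls(A, y, w, n_iter=2000):
    """Minimise sum_i w_i * (A x - y)_i**2 subject to x >= 0.

    Illustrative projected-gradient solver (an assumption of this sketch,
    not the solver used by COMMIT).
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    AtWA = A.T @ (w[:, None] * A)
    AtWy = A.T @ (w * y)
    # Step size 1/L, where L is the largest eigenvalue of A^T W A.
    step = 1.0 / np.linalg.eigvalsh(AtWA).max()
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # Gradient step followed by projection onto x >= 0.
        x = np.maximum(0.0, x - step * (AtWA @ x - AtWy))
    return x

# Hypothetical toy dictionary: rows are measurements, columns are streamlines.
A = np.array([[1, 0, 1],
              [1, 0, 0],
              [1, 1, 0],
              [0, 1, 0],
              [0, 1, 1],
              [0, 0, 1]], dtype=float)
x_true = np.array([1.0, 1.0, 0.0])  # the third streamline is a false positive
y = A @ x_true
y[1] = 0.2                          # simulated signal drop-out in one measurement

w_plain = np.ones(len(y))           # ordinary (unweighted) cost function
w_robust = w_plain.copy()
w_robust[1] = 0.0                   # the outlier is fully downweighted

x_plain = weighted_nnls(A, y, w_plain)
x_robust = weighted_nnls(A, y, w_robust)
print("unweighted:", np.round(x_plain, 3))
print("weighted:  ", np.round(x_robust, 3))
```

In this toy case the unweighted fit assigns a non-zero contribution to the false-positive streamline and biases the contributions of the true ones, whereas the weighted fit recovers the original contributions.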
2.2. Simulations
To investigate the outlier effect on tractogram filtering, we developed a comprehensive Monte-Carlo simulation pipeline delineated in Fig. 2. Simulations are based on T1-weighted and dMRI data from the HCP subject 100308, which were processed with current state-of-the-art methods (Van Essen et al., 2013). We do not expect or imply that the ground-truth connectivity matrix depicted in Fig. 3 represents the true structural connections in a human brain. It simply provides the ground-truth connectivity that we need to evaluate the noise and outlier effects in the Monte-Carlo simulations with a more realistic picture of the whole brain than typical fiber phantoms.
Ground-truth data
We segmented the T1-weighted HCP data with FreeSurfer (Fischl, 2012) to obtain 85 regions-of-interest (ROIs) based on the Desikan-Killiany atlas (Desikan et al., 2006). Instead of the full brainstem, we used only its inferior part of medulla as the last ROI. We used these brain segments to compute the ground-truth connectivity matrix as well as to ensure that we used only the connecting streamlines in our analyses.
To calculate a whole brain tractogram from the HCP dMRI data, we used the anatomically constrained probabilistic tractography (iFOD2) (Tournier et al., 2010; Smith et al., 2012) implemented in the MRtrix3 software (Tournier et al., 2019). We used the white matter mask as a seed region for three million streamlines. The tracking parameters were left at their default values, and we removed all non-connecting streamlines based on the 85 ROI segmentation of the T1-image.
For tractogram filtering, we used the original COMMIT (Daducci et al., 2015) as the data did not contain any slicewise outliers. The forward model used was the stick-zeppelin-ball (SZB) model with 1.7 · 10⁻³ mm²/s for the parallel stick and zeppelin diffusivities, 0.61 · 10⁻³ mm²/s for the perpendicular zeppelin diffusivity, and 1.7 · 10⁻³ mm²/s and 3.0 · 10⁻³ mm²/s for the isotropic (ball) diffusivities as model parameters (Panagiotaki et al., 2012). The filtered tractogram was used to form the ground-truth connectivity matrix with the information from the T1-segmentation (Fig. 3). Moreover, we combined this information with the final streamline contributions to form the synthetic whole brain prediction of dMRI data using the HCP’s three-shell gradient scheme. This produced 270 noise-free diffusion-weighted whole brain images that we used as the ground-truth for our Monte-Carlo simulations.
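The three compartments of such a forward model can be sketched with their textbook signal equations, using the diffusivities listed above. The function names are hypothetical and this is not how COMMIT computes its dictionary internally; the sketch only shows what each compartment predicts for a given b-value (in s/mm²), gradient direction, and fiber direction.

```python
import numpy as np

D_PAR = 1.7e-3    # mm^2/s, parallel diffusivity of stick and zeppelin
D_PERP = 0.61e-3  # mm^2/s, perpendicular diffusivity of the zeppelin

def stick_signal(bval, bvec, fiber_dir, d_par=D_PAR):
    """Normalised signal of a stick: diffusion only along the fiber direction."""
    cos2 = (np.asarray(bvec, dtype=float) @ np.asarray(fiber_dir, dtype=float)) ** 2
    return np.exp(-bval * d_par * cos2)

def zeppelin_signal(bval, bvec, fiber_dir, d_par=D_PAR, d_perp=D_PERP):
    """Normalised signal of an axially symmetric tensor (zeppelin)."""
    cos2 = (np.asarray(bvec, dtype=float) @ np.asarray(fiber_dir, dtype=float)) ** 2
    return np.exp(-bval * (d_perp + (d_par - d_perp) * cos2))

def ball_signal(bval, d_iso):
    """Normalised signal of an isotropic compartment (ball)."""
    return np.exp(-bval * d_iso)

# Example: fiber along x, gradients along x and y, b = 1000 s/mm^2.
fiber = np.array([1.0, 0.0, 0.0])
gradients = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])
print(stick_signal(1000.0, gradients, fiber))
print(zeppelin_signal(1000.0, gradients, fiber))
print(ball_signal(1000.0, 3.0e-3))
```

The stick attenuates only when the gradient has a component along the fiber, which is why an outlier that affects a subset of gradient directions cannot be absorbed by the isotropic ball compartment alone.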
Monte-Carlo design
Our Monte-Carlo simulations were based on the ground-truth synthetic whole brain dMRI data obtained from the HCP subject. We split the simulations into two groups: Baseline and Test. The Baseline group provides the means to evaluate the pure noise effects on the connectome, whereas the Test group provides the means to evaluate the outlier effects.
In the Baseline group, random Rician noise was added before repeating the normal COMMIT filtering with the original non-filtered but connecting streamlines. The Rician noise had a signal-to-noise ratio of 20 based on the non-diffusion-weighted signal, which is roughly similar to signal-to-noise ratios in clinical research. We used the same filtering parameters that were used to form the ground-truth data. This process was repeated to obtain 100 whole brain baseline images and connectomes.
In the Test group, outliers were introduced to the data before adding the same Rician noise that was used for the Baseline group. The Test group was filtered with both the normal COMMIT and the proposed robust COMMIT using the same streamlines and parameters that were used for the Baseline group. This process was repeated to obtain 100 whole brain test images with outliers and the corresponding connectomes from the normal and robust filtering methods.
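A minimal sketch of how Rician noise at a given signal-to-noise ratio could be added to noise-free signals is given here, assuming that the noise standard deviation is defined as the non-diffusion-weighted signal divided by the SNR. The function name `add_rician_noise` and the scalar `b0_signal` are assumptions for illustration and not part of the simulation pipeline itself.

```python
import numpy as np

def add_rician_noise(signal, b0_signal, snr=20.0, rng=None):
    """Return the magnitude of the signal after adding complex Gaussian noise.

    sigma = b0_signal / snr is an assumed SNR definition for this sketch.
    """
    rng = np.random.default_rng(rng)
    signal = np.asarray(signal, dtype=float)
    sigma = b0_signal / snr
    real = signal + rng.normal(0.0, sigma, size=signal.shape)
    imag = rng.normal(0.0, sigma, size=signal.shape)
    return np.sqrt(real ** 2 + imag ** 2)

# Even a zero signal gets a positive mean after Rician noise (noise floor).
clean = np.zeros(100_000)
noisy = add_rician_noise(clean, b0_signal=1.0, snr=20.0, rng=0)
print("mean of noisy zero signal:", noisy.mean())
```

Because the magnitude operation makes the noise strictly non-negative, the noisy signal has a positive average even where the true signal is zero, which is the source of the noise bias between the Baseline group and the ground-truth.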
The outliers for the Test group were created by replacing axial slices with signal decrease outliers in an interleaved manner in 9 (10%) of the diffusion-weighted images per shell. The chosen number is not completely arbitrary, as it is reported to be approximately the maximum amount of corrupted data that robust pre-processing tools can tolerate (Andersson et al., 2016; Sairanen et al., 2018). To maximize the missing data problem (Sairanen et al., 2017), the sampling of outliers in q-space was done by picking the first gradient direction randomly and selecting the rest based on the smallest angular distances from the first one, as illustrated in Fig. 4. This was performed for each shell separately.
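The q-space sampling described above can be sketched as follows. The function name `pick_clustered_directions`, the random stand-in gradient table, and the treatment of antipodal directions as identical are assumptions of this example.

```python
import numpy as np

def pick_clustered_directions(bvecs, n_outliers, rng=None):
    """Pick one direction at random and its nearest angular neighbours.

    Returns the indices of n_outliers gradient directions within one shell.
    Antipodal directions are treated as identical (assumption of this sketch).
    """
    rng = np.random.default_rng(rng)
    bvecs = np.asarray(bvecs, dtype=float)
    unit = bvecs / np.linalg.norm(bvecs, axis=1, keepdims=True)
    first = rng.integers(len(unit))
    cosines = np.abs(unit @ unit[first])
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    # The randomly picked direction has angle zero and is therefore included.
    return np.argsort(angles)[:n_outliers]

# Stand-in for one shell with 90 directions (random, not the HCP scheme).
shell_bvecs = np.random.default_rng(0).normal(size=(90, 3))
outlier_idx = pick_clustered_directions(shell_bvecs, n_outliers=9, rng=1)
print(sorted(outlier_idx.tolist()))
```

Clustering the corrupted directions in one region of q-space leaves a gap in the angular sampling, which is harder to compensate for than the same number of outliers spread evenly over the sphere.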
Statistical analysis
We investigated global brain connectivity as well as individual connections using analysis of variance (ANOVA) accompanied by Tukey’s honestly significant difference (HSD) test and the non-parametric Friedman’s test accompanied by two-sample Kolmogorov-Smirnov tests. The reason for having these different test statistics is that outliers can lead to skewed and long-tailed distributions that might not be correctly investigated solely by parametric tests. For example, the two-sample Kolmogorov-Smirnov test is necessary as the outlier effect can only be studied by comparing Test groups to the Baseline group. This is because Rician noise has a non-zero positive average and therefore likely causes a bias that makes even the Baseline group deviate from the ground-truth values.
While we report p-values from these tests, we argue that the effect sizes are more interesting as they describe how different the tested groups are. The effect sizes are measured using Cohen’s d for parametric tests and the Kolmogorov-Smirnov statistic for non-parametric tests. The test statistics we employ are widely used, and they provide information about average differences and differences in the shapes of the Monte-Carlo simulated distributions. For details about these tests, we recommend any textbook that covers parametric and non-parametric statistics, such as Sheskin’s handbook (Sheskin, 2004).
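The two effect sizes can be written down in a few lines. The sketch below assumes the common pooled-standard-deviation form of Cohen’s d and computes the two-sample Kolmogorov-Smirnov statistic as the largest absolute difference between the empirical distribution functions; the function names are chosen for this example only.

```python
import numpy as np

def cohens_d(a, b):
    """Difference of group means divided by the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) \
        / (a.size + b.size - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def ks_statistic(a, b):
    """Largest absolute difference between the two empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    pooled = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, pooled, side="right") / a.size
    cdf_b = np.searchsorted(b, pooled, side="right") / b.size
    return np.abs(cdf_a - cdf_b).max()

group_1 = [1.0, 2.0, 3.0]
group_2 = [2.0, 3.0, 4.0]
print(cohens_d(group_1, group_2))      # mean difference in pooled SD units
print(ks_statistic(group_1, group_2))  # between 0 (identical) and 1 (disjoint)
```

Cohen’s d is sensitive to shifts in the group averages, whereas the Kolmogorov-Smirnov statistic also responds to changes in the shape or width of the distributions.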
2.3. in vivo measurements
Infant data
We obtained preliminary data from an on-going infant study to evaluate our method with in vivo measurements. A T1-weighted image and dMRI data were obtained with a 3T Siemens Skyra MRI system (Erlangen, Germany) with a 32-channel head coil. The dMRI acquisition consisted of 13 non-diffusion-weighted images that were interspersed between 60 diffusion-weighted images with a b-value of 750 s/mm² and 74 diffusion-weighted images with a b-value of 1800 s/mm², each with uniquely oriented gradients. A bipolar gradient scheme was used to minimize geometrical distortions due to eddy currents. The image resolution was 2 mm isotropic with an 80 × 80 × 44 imaging matrix. The in-plane acceleration factor was 2 (SENSE) and the multi-band acceleration factor was 2. Only anterior-posterior phase encoded images were acquired, as the reverse phase encoding required manual adjustment during the scan, which was deemed infeasible in the clinical scan environment used. The use of infant data in this work was approved by the relevant Ethics Committee of the Helsinki University Hospital.
Infant analyses
We used ExploreDTI (Leemans et al., 2009) with the SOLID plugin (Sairanen et al., 2018) to simultaneously detect slicewise outliers and correct for subject motion and eddy currents, and we registered the data to the anatomical T1-image to correct for geometrical distortions. Additionally, we used Gibbs ringing correction (Perrone et al., 2015). We did not correct for signal drift (Vos et al., 2017) as it was not observed in the measurements.
Processing of this data was limited to specific computers in the hospital network, which prevented memory-demanding tasks such as segmentation with Infant FreeSurfer (Zöllei et al., 2020). Therefore, we opted for a simpler analysis that consisted of a comparison between the outputs of the normal and robust filtering methods instead of full connectivity analyses. The T1-image contrast of this subject was not suitable for white and gray matter segmentation. We obtained a white matter mask from multi-shell multi-tissue constrained spherical deconvolution (Jeurissen et al., 2014) implemented in MRtrix3 (Tournier et al., 2019) and used that as a seed mask for probabilistic whole-brain tractography (iFOD2) (Tournier et al., 2010) to generate three million streamlines.
We filtered the generated streamlines with normal COMMIT (Daducci et al., 2015) and the proposed robust COMMIT_r to evaluate the improvements in the overall fit from root mean squared error (RMSE) maps as well as to see the impact of outliers on the intracellular and isotropic signal fractions. We used the stick-ball model for both filtering methods with the following parameters: 1.7 · 10⁻³ mm²/s for the parallel signal diffusivity, and 1.7 · 10⁻³ mm²/s and 3.0 · 10⁻³ mm²/s for the isotropic signal diffusivities.
3. Results
3.1. Simulations
We investigated the effects of noise on structural brain connectivity by comparing the Baseline group to the ground-truth. Test groups could not be directly compared to the ground-truth due to noise bias. Therefore, outlier effects were investigated by comparing Test groups to the Baseline group instead. We evaluated these differences in both the global connectivity matrix score and individual connection strengths.
Global Connectivity
The global connectivity difference was defined as the sum of the root mean squared differences between the upper triangle of the connectivity matrices from Monte-Carlo groups and the corresponding ground-truth values. The results of this comparison calculated over all 100 Monte-Carlo simulations are shown in Fig. 5 with violin plots. The noise effect on the global connectivity (Baseline) is shown with the first violin from the left, the outlier effect is shown in the middle, and the outlier effect with robust filtering is shown on the right.
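One possible reading of this definition is sketched below: the root mean squared difference over the upper-triangle elements of a simulated connectivity matrix and the ground-truth matrix. The function name, the exclusion of the diagonal, and the 3 × 3 example matrices are assumptions of this sketch and are not taken from the analysis code.

```python
import numpy as np

def global_connectivity_difference(conn, conn_ground_truth):
    """Root mean squared difference over the upper triangle (diagonal excluded)."""
    conn = np.asarray(conn, dtype=float)
    gt = np.asarray(conn_ground_truth, dtype=float)
    rows, cols = np.triu_indices_from(gt, k=1)
    diff = conn[rows, cols] - gt[rows, cols]
    return np.sqrt(np.mean(diff ** 2))

# Hypothetical 3-ROI example.
ground_truth = np.zeros((3, 3))
simulated = np.array([[0.0, 3.0, 4.0],
                      [3.0, 0.0, 0.0],
                      [4.0, 0.0, 0.0]])
score = global_connectivity_difference(simulated, ground_truth)
print(score)
```

Using only the upper triangle avoids counting each connection twice, since the connectivity matrix is symmetric.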
Both Baseline and robust COMMIT_r produced similar global results with differences ranging from 0.2 to 0.24. This demonstrates that, on average, the proposed robust filtering method is capable of mitigating the outlier effect. In contrast, the results from normal COMMIT ranged from 0.2 to 0.3, demonstrating that outliers can have a much stronger effect than noise on the global connectivity values.
Parametric statistical analysis
The ANOVA of the global connectivity differences showed that the group averages were statistically different with a p-value less than 0.01. Tukey’s HSD test showed that normal COMMIT had a significantly different mean from both the Baseline and COMMIT_r results with p-values less than 0.01. We evaluated the effect size with Cohen’s d. The effect size between Baseline and normal COMMIT was 2.0 and that between robust COMMIT_r and normal COMMIT was 2.2. The difference between Baseline and COMMIT_r was not statistically significant with a p-value of 0.56 and had a smaller effect size of 0.4. These effect sizes indicate that in our highly realistic scheme with 10% of outliers, the average bias caused by outliers quickly increases and compromises the connectivity analyses if the data is not processed robustly.
Non-parametric statistical analysis
Friedman’s test also reported a p-value less than 0.01, therefore providing additional support for the graphical analysis and ANOVA results. We applied the two-sample Kolmogorov-Smirnov test to detect which of the distributions were different. The comparisons between Baseline and normal COMMIT as well as between COMMIT_r and COMMIT resulted in statistically significant differences between distributions with p-values less than 0.01. The respective effect sizes measured with the Kolmogorov-Smirnov statistic were 0.84 and 0.85. The difference between Baseline and robust COMMIT_r was not statistically significant with a p-value slightly above 0.05, and its effect size of 0.19 was nearly 80% smaller than that of the non-robust counterpart.
Individual Connectivity
We investigated the element-wise differences between Monte-Carlo connectivity matrices with parametric and non-parametric statistics as complementary information to the global results. The three violin plots in Fig. 6 depict the connectivity values from medulla to the right precentral gyrus. These streamlines are visualised in Fig. 7 and are likely a part of the corticospinal tract and therefore a known true connection.
The noise effect results in a systematic overestimation of connectivity strength, as depicted by Baseline in Fig. 6. However, outliers have a more random effect depending on the affected dMRI measurements. They can either decrease or increase the connectivity strength and can counteract the noise effect. Therefore, group comparisons against Baseline were more meaningful than comparisons against the known ground-truth value. For example, in this case the normal COMMIT produces an average connectivity strength that is closer to the ground-truth than Baseline, even though its distribution is wider. Therefore, a simple observation of the root mean squared difference between the Monte-Carlo group and the ground-truth could be misleading.
Parametric statistical analysis
The connectivity-wise differences between Baseline and normal COMMIT as well as between Baseline and robust COMMIT_r are shown in Fig. 8. The color map indicates the effect size measured with Cohen’s d. Only elements that were deemed significantly different (p-value less than 0.05) based on ANOVA and Tukey’s HSD were drawn. The comparison between Baseline and normal COMMIT resulted in more elements with significant differences than the comparison between Baseline and COMMIT_r. The effect sizes between Baseline and normal COMMIT ranged from 0 up to 5, indicating that outliers can have strong adverse effects on specific connectivity matrix elements. The overall smaller effect sizes between Baseline and robust COMMIT_r highlight that our augmentation is capable of mitigating the outlier effects even at the level of individual connections.
Non-parametric statistical analysis
The connectivity-wise distributional differences between Baseline and normal COMMIT as well as between Baseline and robust COMMIT_r are shown in Fig. 9. The color map indicates the effect size measured with the Kolmogorov-Smirnov statistic. Only elements that were deemed statistically significantly different (p-value less than 0.05) based on Kolmogorov-Smirnov tests were drawn. Similar to the parametric counterpart, the differences between Baseline and normal COMMIT were again more frequent than the differences between Baseline and robust COMMIT_r. The effect sizes between Baseline and normal COMMIT also ranged from 0 to nearly 1, which is the maximum of the statistic used. This indicates that outliers can lead to very large distributional differences. The differences between Baseline and robust COMMIT_r remained relatively small with effect sizes ranging from 0 to 0.2.
3.2. in vivo measurements
Besides tractogram filtering, we calculated the intracellular and isotropic signal fractions using the COMMIT framework (Daducci et al., 2015) and the proposed robust COMMIT_r. Fig. 10 shows the results for outlier detection, RMSE, and signal fraction maps obtained from the infant data. On average, the amount of missing data, i.e. how much the confidence in fitting was decreased, ranged from 5% to 19% per slice position.
The RMSE map of normal COMMIT was clearly affected by the outliers, resulting in visible stripes in the image. In contrast, the COMMIT_r RMSE map that describes the robust cost function does not have such stripes, which indicates that the fitting is not affected by outliers. The difference RMSE map visualises the stripy pattern more prominently and ranges from 0 to 30%. The outlier effect on the intracellular and isotropic signal fractions was less prominent in visual analysis, i.e. there were fewer or no stripes. However, the difference between normal COMMIT and robust COMMIT_r shows that, for the intracellular signal fraction, the differences ranged from −10% to +10% even in regions that were less affected by outliers. For the isotropic signal fraction, the differences ranged from −7% to +7%.
4. Discussion
We demonstrated that tractogram filtering is severely affected by subject motion artefacts and that these effects can be mitigated with our proposed robust augmentation. In clinical research with uncooperative patients such as infants, it is highly likely that some degree of motion occurs during the scan. This leads to corrupted measurements, which should not affect any modeling methods applied to the data. To the best of our knowledge, this is the first time that motion related outliers are considered in the context of tractogram filtering; this update is therefore crucial to enable tractogram filtering in clinical research.
The reason why we evaluated the proposed augmented cost function with simulated brains instead of real brain data was simply to ensure that nothing else in the relatively long dMRI processing pipeline might affect the results. For example, it is currently unknown how outliers affect constrained spherical deconvolution based probabilistic tractography. While there have been proposals for robust higher order model estimators (Pannek et al., 2012), such estimators are not widely available. Furthermore, developing and evaluating the robustness of currently available constrained spherical deconvolution tractography algorithms is beyond the main scope of this study.
Comparison to other filtering methods
While a weighted cost function similar to eq. 1 has been proposed before in the SIFT filtering algorithm (Smith et al., 2013), it was designed and tested to account for partial voluming related artefacts, not subject motion. The main difference between these artefact types is that partial voluming affects all dMRI data, whereas subject motion affects only part of the dMRI data randomly. Therefore, adjusting for partial voluming requires one three-dimensional reliability image, whereas adjusting for subject motion requires a four-dimensional reliability image, as the measurement reliability must be accounted for in each dMRI volume separately. This difference in the implementations of the algorithms also places an accurate comparison of them outside the scope of this study. While outlier replacement (Lauzon et al., 2013; Andersson et al., 2016) could seem beneficial for correcting subject motion, it cannot increase the available information and could lead to difficulties in the interpretation and comparison of the obtained results (Sairanen et al., 2018).
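The distinction between the two reliability images can be made concrete with a short sketch. It assumes that slicewise weights are available as an array with one value per axial slice and volume, and that slices lie along the third image axis; the function name and array sizes are hypothetical.

```python
import numpy as np

def slicewise_to_voxelwise_weights(slice_weights, nx, ny):
    """Expand (n_slices, n_volumes) weights to (nx, ny, n_slices, n_volumes)."""
    w = np.asarray(slice_weights, dtype=float)
    return np.broadcast_to(w[np.newaxis, np.newaxis], (nx, ny) + w.shape).copy()

# Hypothetical example: 4 slices, 3 volumes, one corrupted slice in volume 1.
slice_weights = np.ones((4, 3))
slice_weights[2, 1] = 0.1
weights_4d = slicewise_to_voxelwise_weights(slice_weights, nx=5, ny=5)

# A partial voluming style reliability image has one weight per voxel
# that is shared by all volumes, i.e. it is three-dimensional.
weights_3d = np.full((5, 5, 4), 0.8)

print(weights_4d.shape, weights_3d.shape)
```

Only the four-dimensional array can express that a given slice is unreliable in one diffusion-weighted volume while remaining fully reliable in the others.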
Correcting for artefacts
Our proposed algorithm (Fig. 1) can also be used to adjust for partial voluming, but the necessity of that depends on the forward model used in COMMIT. For example, with the ball and sticks model, voxels containing cerebrospinal fluid or gray matter can be described with an increased contribution from a ball compartment; therefore, the contribution of a stick compartment could be correct even without additional reliability weighting. If reliability weights are used, the estimate for the ball compartment would likely be improved, but that should still not affect the filtered tractogram.
With motion induced artefacts, the outliers cause anisotropic signal deviations (Sairanen et al., 2017) that affect only part of the dMRI data. Therefore, COMMIT cannot adjust for those deviations simply by increasing the contribution of the ball compartment, as the deviations are not isotropic over the dMRI measurements. This is demonstrated in Fig. 10, where normal COMMIT obtains incorrect estimates for the isotropic signal fraction maps, i.e. the ball compartments. The issue propagates and also causes incorrect estimates for the intracellular signal fraction maps, i.e. the stick contributions. Therefore, a local motion artefact can have a global adverse effect on tractogram filtering if not accounted for.
Computational speed
We evaluated our algorithm with the CPU version of COMMIT using a PC with ten 3.6 GHz cores. The 300 whole brain tractogram filterings with HCP like data in our Monte-Carlo simulations required approximately a week to run. Thus, one whole brain tractogram filtering took approximately 30 minutes.
Statistical analysis
The global connectivity difference (Fig. 5) between Baseline and robust COMMIT_r assessed with the two-sample Kolmogorov-Smirnov test was nearly statistically significant with a p-value slightly above 0.05. It is possible that the amount of simulated outliers (10%) was already reaching the limit after which the missing data problem becomes too severe even for robust modeling methods. This could also be related to the sample size being so large that the Kolmogorov-Smirnov test finds any differences statistically significant despite relatively small effect sizes.
A more in-depth analysis of the connection from the medulla to the right precentral gyrus (Figs. 6 and 7) revealed that ANOVA failed to find statistically significant differences between the groups, with a p-value of 0.4. In contrast, the non-parametric Friedman’s test indicated that differences existed between the groups with a p-value less than 0.01. The effect size measured using the Kolmogorov-Smirnov statistic was 0.54 between Baseline and normal COMMIT and 0.33 between Baseline and robust COMMIT_r. Both differences were found statistically significant by the Kolmogorov-Smirnov test with p-values less than 0.01.
In summary, it remains unsolved which test statistic would be the most suitable for analysing data that is affected by outliers in an anisotropic manner. We used two alternative approaches to evaluate the differences in group averages (ANOVA) and group distributions (Kolmogorov-Smirnov). Average based analyses are likely inefficient in locating all differences arising from outliers in the data, whereas non-parametric tests can be too sensitive and label even small differences as significant. Therefore, instead of statistical significance, the obtained effect sizes are likely the more meaningful results.
Where to go from here?
We considered only post-scan motion corrections in this study because during-scan corrections should be able to produce data that does not need these correction algorithms. The problem with during-scan corrections is their limited availability due to external hardware requirements or still experimental software. Because advancing MRI technology into clinical use can take tens of years, it is unlikely that these during-scan correction methods will become so widely available in clinical research centers that post-scan corrections such as our proposal are rendered obsolete any time soon. While post-scan corrections are more a remedy for the symptom than a cure for the cause, novel studies on clinical patients and even infants are increasingly proposed and carried out. The need for robust tools is therefore current and cannot wait decades for hardware based solutions.
5. Conclusion
We proposed an augmentation to the tractogram filtering algorithm COMMIT that renders it robust towards outliers caused by subject motion in the measurements. This addition is necessary for conducting tractogram filtering in clinical research, where subject motion is often unavoidable. While robust data processing has been implemented before in the context of diffusion tensor and higher order model estimations, it has not been previously implemented for tractogram filtering. We used highly realistic whole brain Monte-Carlo simulations and successfully demonstrated that our augmentation is capable of accurately mapping the structural brain connectivity in the presence of such outliers in the data. We also demonstrated that if this correction is not done, the structural connectivity estimates can become extremely biased. With this update, any clinical study investigating the structural connectomics of children or uncooperative patient populations can robustly perform its analyses without the need to exclude subjects with outliers.
Acknowledgements
V.S. was supported by the Brain Research Foundation Verona. M.O-P., C.G., S.S. and A.D. have no relevant financial or non-financial interests to disclose.