Abstract
Background Alzheimer’s disease (AD) is a neurodegenerative disease that leads to anatomical atrophy, as evidenced by magnetic resonance imaging (MRI). Automated segmentation methods are developed to help with the segmentation of different brain areas. However, their reliability has yet to be fully investigated. To have a more comprehensive understanding of the distribution of changes in AD, as well as investigating the reliability of different segmentation methods, in this study we compared volumes of cortical and subcortical brain segments, using automated segmentation methods in more than 60 areas between AD and healthy controls (HC).
Methods A total of 44 MRI images (22 AD and 22 HC, 50% females) were taken from the minimal interval resonance imaging in Alzheimer’s disease (MIRIAD) dataset. HIPS, volBrain, CAT and BrainSuite segmentation methods were used for the subfields of hippocampus, and the rest of the brain.
Results While HIPS, volBrain and CAT showed strong conformity with the past literature, BrainSuite misclassified several brain areas. Additionally, the volume of the brain areas that successfully discriminated between AD and HC showed a correlation with mini mental state examination (MMSE) scores. The two methods of volBrain and CAT showed a very strong correlation. These two methods, however, did not correlate with BrainSuite.
Conclusion Our results showed that automated segmentation methods HIPS, volBrain and CAT can be used in the classification of AD and HC. This is an indication that such methods can be used to inform researchers and clinicians of underlying mechanisms and progression of AD.
1 Introduction
Alzheimer’s disease (AD) is a devasting neurodegenerative disease, contributing to 60-70% of dementia cases 1. Currently there are around 50 million people with dementia worldwide. In 2015, the total global societal cost of dementia was estimated to be US$ 818 billion 2,3 Mainly due to increased life-expectancy, the total number of people with dementia is projected to reach 82 million (64% increase) in 2030, with between 49 and 57 million of these cases being AD 4 Whilst several drugs are available to mitigate the symptoms in some cases, no treatments are available that prevent progression from the relatively late stage at which the disease is diagnosed 5.
AD is characterised by two main pathological hallmarks: extracellular amyloid deposits, composed of insoluble amyloid beta (Aβ) protein, and intra-neuronal neurofibrillary tangles (NFTs), containing hyperphosphorylated tau protein 6. AD is also characterised by a significant loss of neurons and synapses, resulting in brain shrinkage and atrophy 7,8. Structural changes have been shown to be one of the earliest biomarkers that can be used in the diagnosis of AD 9–11. Much effort has been devoted to find patterns of changes in the structure of different brain areas that can be reliably used for diagnosis of AD 12
Earlier investigations relied mostly on manual segmentation of brain areas requiring a great deal of expertise and time 13–16. Therefore, the majority of the focus has been devoted to changes in the hippocampus due to its distinct structure 17 It has been shown that a loss in hippocampal volume can be an indication of AD 18,19 Further investigations have looked at subfields of the hippocampus, showing a nonuniform rate of neuroplasticity due to their specialisation 20–22. For example, it has been shown that NFT begin in the medial temporal region and exhibit a characteristic distribution pattern across subfields, starting in the CA1 and later spreading to subiculum, CA2, CA3 and CA4/Dentate Gyrus 23–27.
With the development of semi- and fully-automated segmentation methods, however, it has now become easier and faster to segment not only the hippocampal area, but also other brain areas 28–32. HIPpocampus subfield Segmentation (HIPS) 33, volBrain 34, Computational Anatomy Toolbox (CAT) 35,36, BrainSuite 37,38 and FreeSurfer 39 are some of the commonly used semi- and fully-automated methods. These methods, however, are yet under development 40,41. For example, CA1 segmentation in the FreeSurfer v5.3 was partially included in the subiculum 18 potentially explaining why the CA1 field was reported to be insensitive to AD pathology in some 42,43 but not all 44,45. Similar findings have recently raised questions and concerns regarding the accuracy and consistency of these methods 46–49 Therefore, it is important to investigate the accuracy of these methods further 50.
Benefiting from the computational power of automated methods, analysis of a large number of brain images has become more feasible. Large datasets of brain scans such as Minimal Interval Resonance Imaging in Alzheimer’s Disease (MIRIAD), a public database of Alzheimer’s magnetic resonance imaging (MRI) 51, offer a great opportunity to have a more comprehensive approach to the underlying mechanism and progression of AD 52–56. It also facilitates multisite studies to form a more accurate understanding of the disease 57,58.
Mini mental state examination (MMSE) is one of the commonly accepted measurements of cognitive ability, in particular in clinical settings 59–61. This measure has been widely used in classification of AD. For example, MIRIAD classifies participants with score between 12 and 26/30 as AD and those higher than 26/30 as healthy control. There is huge body of literature showing correlation between MMSE score and brain atrophy 62–64
The aim of this study was to investigate the reliability of four automated segmentation methods of volBrain, CAT and BrainSuite for segmentation of the whole brain, and HIPS for segmentation of subfields of hippocampus, which belongs to the same analysis tool as volBrain. We used images belonging to MIRIAD. Correlation of the volume of each brain area with MMSE scores are also investigated. To investigate the reliability of the three methods volBrain, CAT and BrainSuite, the correlation of their common brain areas is also reported.
2 Material and Methods
2.1 Subjects
We used 44 images (22 AD and 22 HC) taken from the Minimal Interval Resonance Imaging in Alzheimer’s Disease (MIRIAD) dataset 51. Table 1 shows a summary of the descriptives of the participants.
2.2 Magnetic Resonance Imaging (MRI)
Data was extracted from MIRIAD database. All subjects underwent MRI scanning on a 1.5 T Signa scanner (GE Medical Systems, Milwaukee, WI, USA). T1-weighted volumetric images were obtained using an inversion recovery prepared fast spoiled gradient echo sequence with acquisition parameters time to repetition = 15 ms, time to echo = 5.4 ms, flip angle = 15°, TI = 650 ms, a 24-cm field of view and a 256 × 256 matrix, to provide 124 contiguous 1.5-mm thick slices in the coronal plane (voxels 0.9735 × 0.9735 × 1.5 mm3) 51.
2.3 Methods
HIPS and volBrain
The volumes of Cerebrospinal fluid (CSF), white matter (WM), grey matter (GM), brain hemispheres, cerebellum and brainstem were obtained using volBrain pipeline 34. This method is based on an advanced pipeline providing automatic segmentation of different brain structures from T1 weighted MRI, Figure 1. The preprocessing is based on the following procedure: (1) a denoising step with an adaptive non-local mean filter, (2) an affine registration in the Montreal Neurological Institute (MNI) space, (3) a correction of the image inhomogeneities, and (4) an intensity normalisation. (5) Afterwards, MRI images are segmented in the MNI space using non-local patch-based multi-atlas method. Images were corrected for intensity inhomogeneity using the N4 algorithm 65, and the images were segmented into brain/non-brain using a semi-automated technique (MIDAS). The Non-Local Means filter 66 was applied to each pixel of the image by computing a weighted average of surrounding pixels using a robust similarity measure that takes into account the neighbouring pixels surrounding the pixel being compared. This segmentation method is based on the idea of non-local patch-based label fusion technique, where patches of the brain image to be segmented are compared with those of the training library, looking for similar patterns within a defined search volume to assign the proper label 67,68. HIPS and volBrain are used for segmentation of the hippocampus subfields and the rest of the brain, respectively 33.
CAT
Computational Anatomy Toolbox (CAT) is a powerful package for brain T1-MRI data segmentation, Figure 2. It is a voxel base estimation method 69. The CAT preprocessing steps are as follows: (1) spatial registration to a template, (2) tissue segmentation into grey, white matter and CSF, and (3) bias correction of intensity non-uniformities. (4) Finally, segments are extracted by scaling the amount of volume changes based on spatial registration, so that the total volume of grey matter in the modulated image remains the same as the original image. For correction of the orientation and size of the brain, non-linear registration methods are applied to the image 70. Projection-based thickness (PBT) method is used for calculation of the cortical thickness and central surface 70,71. Spatial-adaptive Non-Local Means (SANLM) and classical Markov Random Field (MRF) were used for image Denoising 72. Adaptive Maximum a Posterior (AMAP) method was used for segmentation 69
BrainSuite
BrainSuite is an open source software tool that enables largely automated cortical surface extraction from MRI of the brain, Figure 3. BrainSuite includes automatic cortical surface extraction, bias field correlation, cerebrum labelling, and surface generation features. Also, this toolbox is used in tractography and connectivity matrix calculation in diffusion imaging data 37.
2.4 Statistical Analysis
Independent-sample t-tests are run to compare the volume of different brain areas between the AD and HC groups for volBrain, CAT and BrainSuite for the whole brain, and HIPS for the hippocampus subfields. Bivariate-correlation analyses are also run to investigate the relationship between volume and MMSE scores for all four segmentation methods. Correlational analyses are run between the common brain areas in volBrain, CAT and BrainSuite to investigate the relationship between the three methods. Bonferroni correction is applied to account for multiple comparison.
3 Results
Using three automatic segmentation methods CAT, volBrain and BrainSuite, we segmented the whole brain, and using HIPS we segmented the hippocampus. Using independent-sample t-tests we compared the volumetric data for AD and HC for each segment. Figures 3–6 show sample output images for one AD patient and one HC participant. Furthermore, we investigated the correlation of volumetric data with MMSE scores in both groups.
CAT segmentation method returned data for 63 distinct brain areas. This method highlighted many brain areas that are significantly different between the two groups, Table 2. In particular fusiform gyrus, parahippocampal gyrus, hippocampus, entorhinal cortex, amygdala, temporal gyri, thalamus, nucleus accumbens, insula, caudate and precuneus were significantly different. Importantly, the size of all these brain areas showed a strong correlation with MMSE scores. For further details see supplementary figures 1-3.
volBrain segmentation method returned data for eight distinct brain areas. In particular the amygdala, hippocampus, nucleus accumbens, thalamus and caudate were significantly different between the two groups, Table 3. Again, the size of all these brain areas showed a strong correlation with MMSE scores. For further details see supplementary figures 4-6.
BrainSuite segmentation method returned data for 50 distinct brain areas. In contrast to CAT and volBrain, this method highlighted only six brain areas that are significantly different between the two groups, Table 4. These brain areas included temporal gyri, third ventricle, supramarginal gyrus and angular gyrus. Similar to previous segmentation methods, all these brain areas showed strong correlation with MMSE scores. For further details see supplementary figures 7-9.
HIPS segmentation method returned data for the whole hippocampus and five of its subfields: CA1, CA2-CA3, CA4/Dentate Gyrus, Subiculum and strata radiatum/lacunosum/moleculare (SR-SL-SM). All these areas showed a significant difference between the two groups, Table 5. The size of hippocampus and all its subfields showed strong correlation with MMSE scores. For further details see supplementary figures 10-12.
To investigate the relationship between the three whole-brain segmentation methods CAT, volBrain and BrainSuite, we ran correlational analysis, Table 6. Seven brain areas were common between these methods: nucleus accumbens, amygdala, caudate, globus pallidus, hippocampus, putamen and thalamus. CAT and volBrain showed strong correlation for nucleus accumbens, amygdala, caudate, hippocampus and thalamus. Two brain areas globus pallidus and putamen were not significantly correlated. These brain areas did not show significant difference between the two groups either. BrainSuite, however, showed no significant correlation with either of the other two segmentation methods.
4 Discussion
We used HIPS automated method to segment the subfields of hippocampus, and CAT, volBrain and BrainSuite automated methods to segment the whole brain using T1 weighted MRI data. Our results showed that all subfields of hippocampus in the Alzheimer’s Disease (AD) group were significantly smaller than those of the healthy control (HC) group. The atrophy of all subcomponents of hippocampus were correlated with the MMSE measure. Quite a large portion of cortical and subcortical areas in the brain were also smaller in the AD group as compared to the control group, as evident from CAT and volBrain segmentation results. The shrinkage in these brain areas mostly showed a strong correlation with MMSE measure. BrainSuite failed to discriminate between the two groups. While CAT and volBrain shows a strong correlation, BrainSuite did not show any significant correlation with CAT and volBrain.
With the advancement of computational methods, fine-grain analysis of the brain areas is more feasible 73–75. Earlier methods relied heavily on manual segmentation of the brain areas, which was extremely time demanding and also required a great level of expertise. Therefore, the majority of the analysis was limited to brain areas with more distinct structure, such as the hippocampus. Many semi- and fully-automated segmentation methods have been developed. While these methods have been used more commonly in recent years, the reliability and accuracy of these methods was yet to be fully studied. We used four pipelines of HIPS 33, volBrain 34, CAT 35,36 and BrainSuite 37 In this study we evaluated their reliability by looking at their ability to discriminate between AD and HC groups, whether a correlation existed between them, their correlation with MMSE scores, and comparing their results with past literature. Our results showed strong reliability of HIPS, volBrain and CAT. These methods have been successfully applied to brain images from those with AD 76–79
BrainSuite, however, underperformed greatly. For example, it failed to accurately segment the hippocampus, thalamus and amygdala to show a significant difference between the two groups. While this automatic segmentation method has been used frequently in past research 80, its application has been mostly limited to the processing of brains with no atrophy 81,82, as well as detection of gross segments such as tumours 83. This highlights the importance of validation studies such as ours to gain a greater understanding of the applications and limitations of different methods 84–87
The volume of the hippocampus is considered as an important biomarker for AD and has been included in recently proposed research diagnostic criteria 88,89. It has been shown that the hippocampal atrophy estimated on anatomical T1 weighted MRI can help in classifying the different stages of AD 90–92 Confirming past literature, our results showed that the hippocampus volume significantly differed between AD and the control group.
Histological studies have shown that lesions are not uniformly distributed within the hippocampus 91,93. Neuronal loss results in a reduction of the thickness of the layers richer in neuronal bodies, while the loss of synapses results in the reduction of the layers poorer in neuronal bodies 94–97 and these changes are stage-dependent 98,99 Our results, however, failed to differentiate the contribution of these subfields in AD; they all showed significant reduction in size, compared to the control group. This effect could be because our AD group consisted of those with later stages of AD. The contribution of different subfields of the hippocampus is more visible in those with mild cognitive impairment 100–102 Therefore, in future studies it would be informative to include participants with different stages of AD to investigate the contribution of different subfields of the hippocampus in AD.
While the contribution of atrophy in the hippocampus has been widely studied, the role of atrophy in the rest of the brain in AD is less clear 40. An important contributing factor is that the boundaries of the hippocampus are easier for human operators or automated algorithms to recognise than other brain areas such as the amygdala, entorhinal cortex or thalamus 40. Due to methodological advances, however, it is now possible to measure atrophy across the entire cortex with good precision 103. Our results from CAT and volBrain methods showed strongly significant differences between many brain areas such as the amygdala, thalamus, nucleus accumbens, insula and caudate. These findings are in-line with past literature showing similar differences in these brain areas 40,104–106.
There is a growing body of literature showing a correlation between cognitive decline and brain atrophy 62,107,108. For example, it has been shown that basal forebrain changes are correlated with cognitive decline in MCI and AD patients, as measured with recall task and MMSE 109,110, as well as healthy participants that later progressed to AD 111. Atrophy of other brain areas such as lateral and medial parietal cortex, as well as lateral temporal cortex have also been shown to have a correlation with cognitive decline 112 Our results showed a strong correlation between brain atrophy and cognitive decline as measured by MMSE. All brain areas that were significantly different between the AD and the control group showed a significant correlation with MMSE, except for the caudate (CAT p = 0.001155, volBrain p = 0.005091, Bonferroni corrected statistic not significant). While the effect of shrinkage of the caudate in AD is not very clear 113, there is some evidence that caudate volume has a correlation with MMSE measures, although not as strongly as other brain areas such as the thalamus 63,114 An important consideration is that atrophy in the left caudate has a stronger role in AD, as compared to the right caudate 113,115. Our analysis combined both the left and right caudate, which may have led to this inconsistency between our results and previous literature.
Although AD commonly presents as an amnestic syndrome, there is significant heterogeneity across individuals 116, which is accompanied by different atrophy patterns 64,117–119 For example, while those with more language difficulties might exhibit greater atrophy in temporal or parietal regions 120,121, those with more visual difficulties might have greater atrophy in posterior cortical regions 122,123. Availability of the automated systems offers many opportunities, such as the ability to analyse a large number of brain images with reasonable time and expertise. This is in particular very appealing, considering the increased number of large datasets such as MIRIAD. Automated systems can go through the collection and aggregate data from a wide range of participants, healthy and patients to gain a greater understanding of AD. This is important considering the heterogeneity of the disease and its progression 124.
Another application of automated systems is in clinical settings. By the time of diagnosis, rapid ongoing atrophy is already far advanced 125,126. Early diagnosis of AD can help with deceleration of the progression of the disease 21,127,128. This is particularly important as there are modifiable factors that can help with brain health129–131. Therefore, a massive effort has been devoted to the development of diagnostic methods to enable researchers and clinicians to detect AD and cases with potential progression to AD, as early as possible 132,133. For the development of preventive strategies, it is important to predict future brain atrophy, as this may aid in identifying which individuals with normal cognition are more susceptible of progressing to later stages of AD 134–136. Automated systems provide additional information to clinicians, enabling them to have a greater understanding of the progression of the atrophy 12,137 Some of these methods have already received approval from different licensing bodies such as CE (European conformity) and FDA (food and drug administration, USA) approval 138. These methods, however, come with some limitations such as speed of processing, expensive licences, or requirement of other specialised software. This study is another step to evaluate freely available analytical tools to achieve an ideal analysis pipeline, suitable for researchers and clinicians.
Availability of the reliable automated segmentation methods enables researchers and clinicians to have a greater understanding of the underlying mechanisms and the progression of the AD. This will allow them to attempt to prevent or decelerate the progression of the disease more effectively. Future research can look at the rate of atrophy to predict the progression of disease 139–142 This rate can be helpful to have a more informed understanding whether an individual with MCI will later progress to AD or not 105,143,144 The output of automated segmentation methods can also be used in training of intelligent classification methods such as those using artificial neural networks and support vector machines, which has shown promising results 145–154
The purpose of this article was not to identify the superiority of any particular automatic segmentation method over another, but to solely highlight possible limitations and applications of four commonly used segmentation methods. We proposed that CAT, volBrain and HIPS are methods that can robustly operate on brain images with significant atrophy and can be used in research and clinical settings. BrainSuite, however, should be used with caution for brain images with atrophy.
Authors’ Contribution
JZ and AHJ analysed the data. JZ and AHJ wrote the manuscript. JZ, AHJ and AS revised the manuscript.
Footnotes
References
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.↵
- 8.↵
- 9.↵
- 10.
- 11.↵
- 12.↵
- 13.↵
- 14.
- 15.
- 16.↵
- 17.↵
- 18.↵
- 19.↵
- 20.↵
- 21.↵
- 22.↵
- 23.↵
- 24.
- 25.
- 26.
- 27.↵
- 28.↵
- 29.
- 30.
- 31.
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- 36.↵
- 37.↵
- 38.↵
- 39.↵
- 40.↵
- 41.↵
- 42.↵
- 43.↵
- 44.↵
- 45.↵
- 46.↵
- 47.
- 48.
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.
- 54.
- 55.
- 56.↵
- 57.↵
- 58.↵
- 59.↵
- 60.
- 61.↵
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.
- 75.↵
- 76.↵
- 77.
- 78.
- 79.↵
- 80.↵
- 81.↵
- 82.↵
- 83.↵
- 84.↵
- 85.
- 86.
- 87.↵
- 88.↵
- 89.↵
- 90.↵
- 91.↵
- 92.↵
- 93.↵
- 94.↵
- 95.
- 96.
- 97.↵
- 98.↵
- 99.↵
- 100.↵
- 101.
- 102.↵
- 103.↵
- 104.↵
- 105.↵
- 106.↵
- 107.↵
- 108.↵
- 109.↵
- 110.↵
- 111.↵
- 112.↵
- 113.↵
- 114.↵
- 115.↵
- 116.↵
- 117.↵
- 118.
- 119.↵
- 120.↵
- 121.↵
- 122.↵
- 123.↵
- 124.↵
- 125.↵
- 126.↵
- 127.↵
- 128.↵
- 129.↵
- 130.
- 131.↵
- 132.↵
- 133.↵
- 134.↵
- 135.
- 136.↵
- 137.↵
- 138.↵
- 139.↵
- 140.
- 141.
- 142.↵
- 143.↵
- 144.↵
- 145.↵
- 146.
- 147.
- 148.
- 149.
- 150.
- 151.
- 152.
- 153.
- 154.↵