Automatic Spatial Estimation of White Matter Hyperintensities Evolution in Brain MRI using Disease Evolution Predictor Deep Neural Networks

Muhammad Febrian Rachmadi; Maria del C. Valdés-Hernández; Stephen Makin; Joanna Wardlaw; Taku Komura

doi:10.1101/738641

Abstract

Previous studies have indicated that white matter hyperintensities (WMH) may evolve, i.e., shrink, grow, or stay stable, over a period of time. However, predicting the evolution of WMH is challenging because the rate and direction of WMH evolution vary greatly across studies. Evolution of WMH also has a non-deterministic nature because some clinical factors that possibly influence it are still not known. In this study, we attempt to predict the evolution of WMH from a baseline image while addressing the non-deterministic nature of this process by proposing an end-to-end training model that uses deep learning and an auxiliary input module. We name this proposed model “Disease Evolution Predictor” (DEP). The DEP model receives a baseline image as input and produces a map called “Disease Evolution Map” (DEM), which represents the evolution of WMH from baseline to follow-up. Two models of DEP are proposed, i.e., DEP-UResNet and DEP-GAN, which represent supervised and unsupervised deep learning algorithms respectively. DEP-UResNet uses baseline T2-FLAIR as main input while DEP-GAN uses either baseline irregularity map (IM) or probability map (PM) instead. The generated DEM can be compared to the real DEM, which is obtained by a simple subtraction between baseline and follow-up assessments. In this study, labels of WMH manually produced by an expert are used as reference standard for evaluation. To simulate the non-deterministic and unknown parameters involved in WMH evolution, we propose modulating the DEP model with a Gaussian noise array as auxiliary input. This forces the DEP model to imitate a wider spectrum of alternatives in the results. Other types of auxiliary input, such as baseline WMH and stroke lesion loads, were also tested. Note that baseline WMH load is the strongest known risk factor of WMH evolution.
Based on our experiments, DEP-GAN using PM and Gaussian noise as auxiliary input yielded one of the best results in almost all evaluations, including clinical plausibility. The DEP-UResNet regularly performed better than the DEP-GAN using PM and Gaussian noise in some evaluations, but eventually it did not show promise in the clinical evaluation. Moreover, supervised DEP-UResNet also requires manual WMH labels on two MRI scans for training, which, in many scenarios, is not applicable. To the best of our knowledge, this is the first extensive study on modelling WMH evolution using deep learning algorithms and dealing with the non-deterministic nature of WMH evolution.

1. Introduction

White matter hyperintensities (WMH) are neuroradiological features seen in T2-weighted and fluid attenuated inversion recovery (T2-FLAIR) brain magnetic resonance images (MRI). Clinically, WMH have been commonly associated with stroke, ageing, and dementia progression (Wardlaw et al., 2013; Prins and Scheltens, 2015). Furthermore, recent studies have shown that WMH may decrease (i.e., shrink/regress), stay unchanged (i.e., stable), or increase (i.e., grow/progress) over a period of time (Ramirez et al., 2016; Chappell et al., 2017; Wardlaw et al., 2017). In this study, we refer to these changes as “evolution of WMH”.

In early studies, evolution of WMH and its severity were presumed to be linearly progressing over time with age (Veldink et al., 1998; Schmidt et al., 2003) due to lack of data from more than one follow-up assessment (van Leijsen et al., 2017b). Moreover, in the early longitudinal studies of WMH, reduction (i.e., regression) of WMH load was only observed in a small number of patients (Schmidt et al., 2003, 2005; Sachdev et al., 2007; Gouw et al., 2008b; Maillard et al., 2009; Prins et al., 2004; Rovira Cañellas et al., 2007). Hence, most earlier studies regarded the regression of WMH as either measurement error (Sachdev et al., 2007; Maillard et al., 2009; Schmidt et al., 2003, 2005) or “no progression” (Prins et al., 2004; van Dijk et al., 2008; Gouw et al., 2008b). In addition, bias in manual delineation of WMH towards progression cannot be overlooked when the raters were aware of the scans’ time sequence (Schmidt et al., 1999, 2005).

With increasing longitudinal data over the years, recent studies have suggested that evolution of WMH may be a non-linear process over time (Wardlaw et al., 2017; van Leijsen et al., 2017b) and have different dynamics in each patient (Ramirez et al., 2016). For example, WMH volume of one subject may grow in the first follow-up assessment, and shrink in the second follow-up assessment (or vice versa). Also, in an individual patient, different WMH clusters may simultaneously either grow, shrink, or remain stable (see bottom-right figure in Figure 1 as an example).

Figure 1:

“Disease evolution map” (DEM) (right) is produced by subtracting baseline images (middle) from follow-up image (left). In DEM produced by irregularity map (IM) (first row) and probability map (PM) (second row), bright yellow pixels represent positive values (i.e., progression) while dark blue pixels represent negative values (i.e., regression). On the other hand, DEM produced by binary WMH label (LBL) (third row) has three foreground labels which represent progression or “Grow” (green), regression or “Shrink” (red), and “Stable” (blue). We named this special DEM as three-class DEM label (LBL-DEM).

Predicting the evolution of WMH is challenging because the rate and direction of WMH evolution vary considerably across studies (Schmidt et al., 2016; van Leijsen et al., 2017a,b) and several risk factors, some commonly known and others not fully known, could be involved in their progression (Wardlaw et al., 2017). For example, some risk factors and predictors that have been commonly associated with WMH progression are baseline WMH volume (Schmidt et al., 1999, 2002a,b, 2003; Sachdev et al., 2007; van Dijk et al., 2008; Wardlaw et al., 2017; Chappell et al., 2017), blood pressure or hypertension (Veldink et al., 1998; Schmidt et al., 1999, 2002b; van Dijk et al., 2008; Godin et al., 2011; Verhaaren et al., 2013), age (van Dijk et al., 2008), current smoking status (Power C et al., 2015), previous stroke and diabetes (Gouw et al., 2008a; Wardlaw et al., 2017), and genetic properties (Schmidt et al., 2002a, 2011; Godin et al., 2009; Luo et al., 2017). Surrounding regions of WMH that may appear like normal appearing white matter (NAWM) with less structural integrity, usually called the “penumbra of WMH” (Maillard et al., 2011), have also been reported as having a high risk of becoming WMH over time (Maillard et al., 2014; Pasi et al., 2016). On the other hand, regression of WMH volume has been reported in several radiological observations on MRI, such as after cerebral infarction (Moriya et al., 2009), strokes (Durand-Birchenall et al., 2012; Cho et al., 2015; Wardlaw et al., 2017), improved hepatic encephalopathy (Mínguez et al., 2007), lower blood pressure (Wardlaw et al., 2017), liver transplantation (Rovira Cañellas et al., 2007), and carotid artery stenting (Yamada et al., 2010). While a recent study suggested that areas of shrinking WMH were actually still damaged (Jiaerken et al., 2018), a more recent study showed that WMH regression did not accompany brain atrophy and suggested that WMH regression follows a relatively benign clinical course (van Leijsen et al., 2019).

In this study, we propose an end-to-end training model for automatically predicting and spatially estimating the dynamic evolution of WMH from baseline to the following time point using deep neural networks called “Disease Evolution Predictor” (DEP) model (discussed in Section 3). The DEP model produces a map named “Disease Evolution Map” (DEM) which characterises each voxel of WMH or brain tissues as progressing, regressing, or stable WMH (discussed in Section 2). For this study we have chosen deep neural networks due to their exceptional performance on WMH segmentation (Rachmadi et al., 2017), which reportedly have produced better results than the conventional machine learning algorithms. We use a Generative Adversarial Network (GAN) (Goodfellow et al., 2014) and the U-Residual Network (UResNet) (Guerrero et al., 2018) as base architectures for the DEP model. These architectures represent the state-of-the-art unsupervised and supervised deep neural network models, respectively.

This study differs from previous studies on predictive modelling in that we are interested in predicting the evolution of specific neuroradiological MRI features (i.e., WMH in T2-FLAIR), not the progression of a disease as a whole and/or its effect. For example, previous studies have proposed methods for predicting the progression from mild cognitive impairment to Alzheimer’s disease (Spasov et al., 2019) and progression of cognitive decline in Alzheimer’s disease patients (Choi et al., 2018). Instead, our proposed DEP model generates three outcomes: 1) prediction of WMH volumetric changes (i.e., either progressing or regressing), 2) estimation of WMH spatial changes, and 3) spatial distribution of white matter evolution at voxel-level precision. Thus, using the DEP model, clinicians can estimate the size, extent, and location of WMH in time to study their progression/regression in relation to clinical health and disease indicators, to ultimately design more effective therapeutic interventions (Rachmadi et al., 2019b). Results and evaluations can be seen in Section 8.

This study is an extension of our previous work (Rachmadi et al., 2019b). The main contributions of this study that have not been done in our previous work are as follows.

We propose a standard training scheme to predict the evolution pattern of WMH between two time points of assessment. The proposed scheme consists of two parts: 1) generation of the spatial WMH representation and 2) generation of the “Disease Evolution Map” (DEM) using deep neural networks, namely “Disease Evolution Predictor” (DEP).
We propose and evaluate the use of three different modalities to produce the DEM: irregularity map (IM) (Rachmadi et al., 2019b), probability map (PM), and binary WMH label (LBL). The generation of the DEM per se, subtracting the IM from two time points, was proposed in the previous work (Rachmadi et al., 2019b), and is explained further in Section 2.
We propose three different end-to-end DEP learning approaches: 1) unsupervised learning, 2) indirectly supervised learning, and 3) supervised learning. Unsupervised and indirectly supervised learning approaches are based on GAN (i.e., DEP-GAN (Rachmadi et al., 2019b)) whereas the supervised learning approach is based on UResNet (i.e., DEP-UResNet). DEP-GAN and DEP-UResNet are discussed in Sections 3.1 and 3.2 respectively.
We performed an ablation study of four different auxiliary inputs to the DEP model: 1) no auxiliary input, 2) Gaussian noise (Rachmadi et al., 2019b), 3) baseline WMH load, and 4) baseline WMH and stroke lesions (SL) loads. Further explanation can be read in Section 4 while the results can be seen in Section 8.2.
We performed analysis of plausibility of the WMH volumetric changes predicted by the DEP models and risk factors of WMH evolution using analysis of covariance (ANCOVA). The results can be seen in Section 8.4.

2. Disease Evolution Map (DEM)

To produce a standard representation of WMH evolution, a simple subtraction operation between two irregularity maps from two time points (i.e., baseline assessment from follow-up assessment) named “Disease Evolution Map” (DEM) was proposed in our previous work (Rachmadi et al., 2019b). In this study, we evaluate the use of three different modalities in the subtraction operation: irregularity map (i.e. as per (Rachmadi et al., 2019b)), probability map, and binary WMH label.

Irregularity map (IM) is a map/image that describes the “irregularity” level of each voxel with respect to the normal brain tissue using real values between 0 and 1 (Rachmadi et al., 2018b). The IM is advantageous as it retains some of the original MRI textures (e.g., from the T2-FLAIR image intensities), including gradients of WMH. IM is also independent from a human rater or training data, as it is produced using an unsupervised method (i.e., LOTS-IM) (Rachmadi et al., 2019a). DEM resulted from the subtraction of two IMs has values ranging from −1 to 1 (first row of Figure 1). Note how both regression and progression (i.e. dark blue from negative values and bright yellow pixels from positive values) are well represented at the voxel level precision on the DEM obtained from IMs.

Probability map (PM) in this study refers to the WMH segmentation output from a supervised machine learning method. Similar to IM, PM has real values between 0 and 1 which describe the probability for each voxel of being WMH. However, PM differs from IM in the fact that PM only has WMH gradients on the borders of WMH (note that the centres of (big) WMH clusters mostly have probability of 1). Thus, the DEM produced from the subtraction of two PMs also has values ranging from −1 to 1 representing regression and progression respectively, but these are usually located on the WMH clusters’ borders and/or representing small WMH. On the other hand, the rest of DEM’s regions (i.e., the centers of big WMH and non-WMH regions) have probability value of 0 (see the second row of Figure 1). Another caveat is that the quality (i.e., accuracy and meaning) of DEM from PM depends on the performance of the automatic WMH segmentation method that generated the PM.

Lastly, binary WMH label (LBL) refers to the WMH label produced by an expert’s manual segmentation, which is often considered as gold standard (Valdés Hernández et al., 2015). DEM from LBL can be produced by subtracting the baseline LBL from the follow-up LBL, and each WMH voxel of the resulting image is then labelled as either “Shrink” if it has a value below zero, “Grow” if it has a value above zero, or “Stable” if it has a value of zero (voxels that are not WMH at either time point form the background). We refer to this DEM as the three-class DEM label (LBL-DEM), and its depiction can be seen in the bottom-right of Figure 1.
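The three DEM variants described in this section can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `dem_from_maps` and `lbl_dem` and the integer class codes are hypothetical, and "Stable" is interpreted as a voxel labelled WMH at both time points so that it can be told apart from the background.

```python
import numpy as np

# Integer class codes are an assumption of this sketch, not taken from the paper.
BACKGROUND, SHRINK, GROW, STABLE = 0, 1, 2, 3

def dem_from_maps(baseline, follow_up):
    """DEM from two IMs or two PMs with values in [0, 1]; the result lies in [-1, 1]."""
    return np.asarray(follow_up, dtype=float) - np.asarray(baseline, dtype=float)

def lbl_dem(baseline_lbl, follow_up_lbl):
    """Three-class DEM label (LBL-DEM) from two binary WMH labels."""
    b = np.asarray(baseline_lbl).astype(int)
    f = np.asarray(follow_up_lbl).astype(int)
    diff = f - b
    out = np.full(diff.shape, BACKGROUND, dtype=int)
    out[diff < 0] = SHRINK                 # WMH at baseline only
    out[diff > 0] = GROW                   # WMH at follow-up only
    out[(diff == 0) & (b == 1)] = STABLE   # WMH at both time points
    return out
```

For IM and PM the subtraction keeps real-valued gradients, whereas the label-based variant collapses each voxel to one of four classes.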

3. Disease Evolution Predictor (DEP) Model using Deep Neural Networks

In this study, two different Disease Evolution Predictor (DEP) models are proposed and evaluated: 1) non-supervised DEP model based on generative adversarial networks (DEP-GAN) (Rachmadi et al., 2019b) and 2) supervised DEP model based on UResNet (DEP-UResNet). Each DEP model’s workflow consists of two parts: 1) construction of the WMH spatial representation and 2) generation of the predicted DEM. The general flow of the DEP model is depicted in Figure 2.

Figure 2:

Flow diagram of Disease Evolution Predictor (DEP) models grouped by its learning approach. Each DEP model’s workflow is divided into two, which are input modality construction and Disease Evolution Map (DEM) generation. See Section 3 for explanation of DEP models.

DEP-GAN uses either IM or PM to represent the WMH while DEP-UResNet uses T2-FLAIR and three-class label DEM (LBL-DEM). DEP-GAN using IM is categorised as unsupervised learning as the input modality of IM is produced by an unsupervised method: LOTS-IM. DEP-GAN using PM is categorised as indirectly supervised learning because the PM is produced by a supervised deep learning algorithm (i.e., UResNet). Finally, DEP-UResNet is categorised as supervised learning as it simply learns DEM labels from LBL-DEM.

3.1. DEP Generative Adversarial Network (DEP-GAN)

DEP Generative Adversarial Network (DEP-GAN) (Rachmadi et al., 2019b) is based on a GAN, a well established unsupervised deep neural network model commonly used to generate fake natural images (Goodfellow et al., 2014). Thus, DEP-GAN is mainly proposed to predict the evolution of WMH when there are no longitudinal WMH labels available. This model (i.e. DEP-GAN) is based on a visual attribution GAN (VA-GAN) originally proposed to detect atrophy in T2-weighted MRI of Alzheimer’s disease (Baumgartner et al., 2017). DEP-GAN consists of a generator based on U-Net (Ronneberger et al., 2015), a commonly used deep neural network model in medical imaging, and two separate convolutional networks used as discriminators (hereinafter will be referred as critics). The schematic of DEP-GAN can be seen in Figure 3.

Figure 3:

Schematic of the proposed DEP-GAN with 2 discriminators (critics). DEP-GAN can take either irregularity map (IM) or probability map (PM) as input. DEP-GAN also has an auxiliary input to deal with the non-deterministic factors in WMH evolution (see Section 4 for full explanation).

Let x₀ be the baseline (year-0) image and x₁ be the follow-up (year-1) image. Then, the “real” DEM (y) can be produced by a simple subtraction (y = x₁ − x₀). To generate the “fake” DEM (y′), i.e. without x₁, a generator function (M(x)) is used: y′ = M(x₀). Thus, a “fake” follow-up image (x₁′) can be produced by x₁′ = x₀ + y′ = x₀ + M(x₀). Once M(x) is fully trained, the “fake” follow-up (x₁′) and the “real” follow-up (x₁) should be indistinguishable by a critic function D(x), while the “fake” DEM (y′) and the “real” DEM (y) should also be indistinguishable by another critic function C(x). The full schematic of DEP-GAN’s architecture (i.e., its generator and critics) can be seen in Figure 4.

Figure 4:

Architecture of DEP-GAN, which consists of one generator (upper side, “A”) and two critics (lower side, “C” and “D”). Note how the proposed auxiliary input is feed-forwarded to convolutional layers (yellow, “B”) and then modulated to the generator using FiLM layer (green) inside residual block (ResBlock) (light blue, “E”). Please see Section 4 for full explanation about auxiliary input. On the other hand, DEP-UResNet (upper right side, “F”) is based on DEP-GAN’s generator, including its auxiliary input, with modification of the last non-linear activation function (i.e., from tanh to softmax).

The DEP-GAN’s U-Net-based generator (M(x)) has two parts, an encoder which encodes the input image information to a latent representation and a decoder which decodes back image information from the latent representation. The baseline IM/PM (x₀) is feed-forwarded to this generator to generate a “fake” DEM (y′). There is also an auxiliary input modulated into the generator using a FiLM layer (Perez et al., 2018) inside the residual block (ResBlock) to deal with non-deterministic factors of WMH evolution. This auxiliary input and its modulation will be fully discussed in Section 4. The architecture of the DEP-GAN’s generator is depicted in the upper side of Figure 4 (with “A”, “B”, and “E” annotations for U-Net-based generator of M(x), auxiliary input, and residual block (ResBlock) respectively).

Unlike VA-GAN that uses only one critic (i.e., only D(x)) (Baumgartner et al., 2017), DEP-GAN uses two critics (i.e., D(x) and C(x)) to enforce both anatomically realistic modifications to the follow-up images (Baumgartner et al., 2017) and encode realistic plausibility in the modifier (i.e., DEM) (Rachmadi et al., 2019b). Anatomically realistic modifications to the follow-up images can be achieved by optimising the critic D(x) and the anatomically realistic plausibility of the modifier can be achieved by optimising the critic C(x). In other words, we argue that an anatomically realistic DEM is also important and essential to produce anatomically realistic (fake) follow-up images. The architecture of the DEP-GAN’s critics and their connection to the generator are depicted in the lower side of Figure 4 (with “C” and “D” annotations for critic C(x) and D(x) respectively).

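The DEP-GAN’s optimisation process is the same as that of VA-GAN, which uses the optimisation process of Wasserstein GAN with gradient penalty (WGAN-GP) with a gradient penalty factor of 10 (Gulrajani et al., 2017). The optimisation of M(x) is given by the following functions:

$$M^{*} = \arg\min_{M} \left[ \max_{D \in \mathcal{D}} \mathcal{L}_{D}(M, D) + \max_{C \in \mathcal{C}} \mathcal{L}_{C}(M, C) + \mathcal{L}_{\mathrm{reg}}(M) \right] \tag{1}$$

$$\mathcal{L}_{D}(M, D) = \mathbb{E}_{x_1 \sim \mathbb{P}_1}\left[ D(x_1) \right] - \mathbb{E}_{x_0 \sim \mathbb{P}_0}\left[ D(x_0 + M(x_0)) \right] \tag{2}$$

$$\mathcal{L}_{C}(M, C) = \mathbb{E}_{x_0 \sim \mathbb{P}_0,\, x_1 \sim \mathbb{P}_1}\left[ C(x_1 - x_0) \right] - \mathbb{E}_{x_0 \sim \mathbb{P}_0}\left[ C(M(x_0)) \right] \tag{3}$$

$$\mathcal{L}_{\mathrm{reg}}(M) = \lambda_1 \left\| x_1 - x_1' \right\|_1 + \lambda_2 \left( 1 - \mathrm{DSC}(x_1', x_1) \right) + \lambda_3 \left\| \mathrm{vol}(x_1') - \mathrm{vol}(x_1) \right\|_2 \tag{4}$$

where x₀ is the baseline image that has an underlying distribution ℙ₀, x₁ is the follow-up image that has an underlying distribution ℙ₁, M(x₀) represents the “fake” DEM, x₁′ = x₀ + M(x₀) is the “fake” follow-up image, D ∈ 𝒟 and C ∈ 𝒞 are the critics (i.e. 𝒟 and 𝒞 are sets of 1-Lipschitz functions (Baumgartner et al., 2017; Gulrajani et al., 2017)), and ∥·∥₁ and ∥·∥₂ are the L1 and L2 norms respectively. The optimisation was performed by updating the parameters of the generator and critics alternately, where each critic is updated 5 times per generator update. Also, during the first 25 iterations and at every 100th iteration, the critics were updated 100 times per generator update (Baumgartner et al., 2017; Gulrajani et al., 2017).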
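The alternating update schedule described above can be written as a small helper. This is a sketch only: the function name `critic_updates` is hypothetical, and it assumes 0-based generator iterations and reads "every 100 iterations" as iterations that are multiples of 100.

```python
def critic_updates(generator_iteration):
    """Number of critic updates performed before the given generator update.

    Assumes 0-based iteration counting (an assumption of this sketch).
    """
    if generator_iteration < 25 or generator_iteration % 100 == 0:
        return 100  # warm-up phase and periodic intensive critic training
    return 5        # regular schedule: 5 critic updates per generator update
```

In a training loop, each of the two critics would be updated this many times before the generator takes one step.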

In summary, to optimise the generator M(x), we need to optimise Equation 1, which optimises both critics (i.e., D(x) and C(x)) using Equations 2 and 3 respectively based on WGAN-GP’s optimisation process (Gulrajani et al., 2017), and use the regularisation function described in Equation 4. Each term in the regularisation function (Equation 4) simply says:

Intensities of “fake” follow-up images (x₀ + M(x₀)) have to be similar to the “real” follow-up images (x₁) based on L1 norm.
The WMH segmentation estimated from the “fake” follow-up image has to be spatially similar to the WMH segmentation estimated from x₁ based on the Dice similarity coefficient (DSC) (see Equation 6).
The WMH volume (in ml) estimated from the “fake” follow-up image has to be similar to the WMH volume estimated from x₁ based on L2 norm.

The WMH segmentation of the “fake” follow-up image (x₀ + M(x₀)) and of x₁ is estimated by thresholding either the IM values (i.e., irregularity values) to be above 0.178 (Rachmadi et al., 2019a) or the PM values (i.e., probability values) to be above 0.5. Furthermore, the three terms in Equation 4 are weighted by λ₁, λ₂, and λ₃, which are set to 100 (Baumgartner et al., 2017), 1 and 100 respectively.
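To make the three regularisation terms concrete, the following NumPy sketch evaluates them for a single image. It is an illustration under several assumptions rather than the authors' code: the names `dice` and `regularisation` are hypothetical, the L1 term is averaged over voxels, the spatial term is written as 1 − DSC, and the volume is computed as the voxel count times a voxel volume `voxel_ml` supplied by the caller. Hard thresholding is not differentiable, so an actual training implementation would need a differentiable surrogate.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    # Two empty masks are treated as perfect agreement (an assumption of this sketch).
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom

def regularisation(x0, dem_fake, x1, threshold, voxel_ml, lambdas=(100.0, 1.0, 100.0)):
    """Weighted sum of the intensity, spatial (DSC) and volumetric terms."""
    x0, dem_fake, x1 = (np.asarray(v, dtype=float) for v in (x0, dem_fake, x1))
    fake_x1 = x0 + dem_fake                            # "fake" follow-up image
    seg_fake, seg_real = fake_x1 > threshold, x1 > threshold
    l1_term = np.abs(fake_x1 - x1).mean()              # intensity similarity
    dsc_term = 1.0 - dice(seg_fake, seg_real)          # spatial similarity
    # For a scalar volume difference the L2 norm reduces to the absolute value.
    vol_term = abs(float(seg_fake.sum()) - float(seg_real.sum())) * voxel_ml
    lam1, lam2, lam3 = lambdas
    return float(lam1 * l1_term + lam2 * dsc_term + lam3 * vol_term)
```

The threshold would be 0.178 for IM and 0.5 for PM, as stated above; the default weights follow the reported values of 100, 1 and 100.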

3.2. DEP U-Residual Network (DEP-UResNet)

In case WMH binary labels (LBL) for the two time points (i.e., longitudinal data set) are available, a simple supervised deep neural network method can automatically estimate WMH evolution. As previously described in Section 2, DEM produced from LBL (i.e., three-class DEM label (LBL-DEM)) consists of 3 foreground labels (i.e., “Grow” (green), “Shrink” (red), and “Stable” (blue)) and 1 background label (black). An example of LBL-DEM can be seen in the bottom-right figure of Figure 1.

In this case, the DEP-GAN’s generator is detached from the critics and modified into DEP U-Residual Network (DEP-UResNet) by changing the last non-linear activation layer of tanh (i.e., for regression) to softmax (i.e., for multi-label segmentation). The DEP-UResNet’s schematic can be seen in the upper right side of Figure 4 (with “F” annotation). DEP-UResNet uses T2-FLAIR as input and LBL-DEM as target output. Note that the auxiliary input proposed in this study can be also applied to the DEP-UResNet. See Section 4 for full explanation about auxiliary input in DEP model.

4. Auxiliary Input in DEP Model

The biggest challenge in modelling the evolution of WMH is mainly their non-deterministic nature and the number of factors involved in WMH evolution. In our previous work, we proposed an auxiliary input module which modulates random noise from a normal (Gaussian) distribution to every layer of the DEP-GAN’s generator to simulate the non-deterministic property of WMH evolution and the unknown/missing factors (i.e., non-image features) involved in WMH evolution (Rachmadi et al., 2019b). To modulate the auxiliary input to every layer of the DEP-GAN’s generator, a Feature-wise Linear Modulation (FiLM) layer (Perez et al., 2018) is used. The FiLM layer is depicted as the green block inside the residual block (ResBlock) in Figure 4 (annotated as “E”). In the FiLM layer, γ_m and β_m modulate feature maps F_m, where subscript m refers to the m-th feature map, via the following affine transformation:

$$\mathrm{FiLM}(F_m \mid \gamma_m, \beta_m) = \gamma_m F_m + \beta_m \tag{5}$$

where γ_m and β_m for each ResBlock in each layer are automatically determined by convolutional layers (depicted as yellow blocks in Figure 4 with “B” annotation). Please note that the proposed auxiliary input module can be easily applied to any deep neural network model. Thus, we applied the auxiliary input module to both DEP models proposed in this study: DEP-GAN and DEP-UResNet.
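The feature-wise affine modulation performed by a FiLM layer can be sketched as follows. This is a minimal NumPy illustration, not the network code: the function name `film` and the (M, H, W) array layout are assumptions, and γ and β are passed in directly, whereas in the DEP model they are produced from the auxiliary input by convolutional layers.

```python
import numpy as np

def film(features, gamma, beta):
    """Apply gamma_m * F_m + beta_m to each of the M feature maps.

    features: array of shape (M, H, W); gamma, beta: arrays of shape (M,).
    """
    features = np.asarray(features, dtype=float)
    gamma = np.asarray(gamma, dtype=float)[:, None, None]  # broadcast over H and W
    beta = np.asarray(beta, dtype=float)[:, None, None]
    return gamma * features + beta
```

Because each feature map receives its own scale and shift, a different auxiliary input changes the generator's intermediate features without changing its weights.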

In this study, we performed an ablation study of auxiliary input modalities for the DEP model by using: 1) no auxiliary input, 2) Gaussian noise, 3) baseline WMH volume, and 4) both baseline WMH and SL volumes. Firstly, we tested DEP models without any auxiliary input. Secondly, we added as input an array of 32 random noise values which follow a Gaussian distribution, as per our previous work (Rachmadi et al., 2019b). Hereinafter, Gaussian noise in this study refers to this array. Thirdly, instead of the random noise, we used some risk factors that have been commonly associated with WMH evolution. Note that while not all factors which influence WMH evolution are known, baseline WMH load (i.e., cited as the most common and strongest predictor) (Schmidt et al., 2003; Sachdev et al., 2007; van Dijk et al., 2008; Wardlaw et al., 2017; Chappell et al., 2017) and baseline stroke lesions (SL) load (Gouw et al., 2008a; Wardlaw et al., 2017) have been found to be strongly associated with WMH evolution over time. The WMH and SL volumes were obtained from WMH and SL labels/masks. Please see Section 5 for a full explanation of how WMH and SL masks were produced. It is worth mentioning that changing the auxiliary input modality from Gaussian noise to WMH and SL loads changes the nature of the DEP model from non-deterministic to deterministic, influenced by factors of WMH evolution (i.e., baseline WMH and SL loads).

5. Subjects and Data

We used MRI data from stroke patients (n = 152) enrolled in a study of stroke mechanisms, whose full recruitment and assessment procedures have been published (Wardlaw et al., 2017). Written informed consent was obtained from all patients on protocols approved by the Lothian Ethics of Medical Research Committee (REC 09/81101/54) and NHS Lothian R+D Office (2009/W/NEU/14), on the 29th of October 2009. In the clinical study that provided the data, patients were imaged at three time points (i.e., first time (baseline) 1-4 weeks after presenting to the clinic with stroke symptoms, at approximately 3 months, and a year after (follow-up)). All images were acquired on a GE 1.5T MRI scanner following the same imaging protocol (Valdés Hernández et al., 2015). Ground truth segmentations were performed using a multispectral semi-automatic method (Valdés Hernández et al., 2015) only from baseline and 1-year follow-up scan visits in the image space of the T1-weighted scan of the second visit, in n = 152 (out of 264) patients. T2-weighted, FLAIR, gradient echo, and T1-weighted structural images at baseline and 1-year scan visits were rigidly and linearly aligned using FSL-FLIRT (Jenkinson et al., 2002). The resulting images have 256 × 256 × 42 voxels with a voxel size of 0.9375 × 0.9375 × 4 mm. We used data from all patients who had the three scan visits and ground truth generated as per above. Hence, our sample consists of MRI data (i.e., s = n × 2 = 304 MRI scans) for baseline and 1-year follow-up data. Out of all patients, 70 (46%) have the lacunar stroke subtype, with a median small vessel disease (SVD) score of 1. Other demographics and clinical characteristics of the patients that provided data for this study can be seen in Table 1.

Table 1:

Demographics and clinical characteristics of the samples used in this study (n = 152). SVD and PV stand for small vessel disease and periventricular respectively.

The primary study that provided the data used a semi-automatic multispectral method to produce several brain masks including intracranial volume (ICV), cerebrospinal fluid (CSF), stroke lesions (SL), and WMH, all of which were visually checked and manually edited by an expert (Valdés Hernández et al., 2015). The image processing protocol followed to generate these masks is fully explained in Valdés Hernández et al. (2015). Extracranial tissues, SL, and skull were removed from the baseline and follow-up T2-FLAIR images using the SL and ICV binary masks from previous analyses (Chappell et al., 2017; Wardlaw et al., 2017).

In this study, binary WMH labels produced for the primary study that provided the data (Valdés Hernández et al., 2015) were used as the gold standard (i.e. ground truth) for evaluating the DEP models. As per these labels, 98 and 54 out of the 152 subjects have increasing (i.e., progression) and decreasing (i.e., regression) volume of WMH respectively. However, as the DEP models depend on the input, we also generated WMH binary labels from IM (i.e., thresholding IM values to be above 0.178 (Rachmadi et al., 2019a)) and PM (i.e., thresholding PM values to be above 0.5) and used them as a second reference. Hereinafter, these WMH labels will be referred to as “WMH label from original input modality” (IM/PM-LBL-DEM). Note that IM/PM-LBL-DEM will be used only for evaluating DEP-GAN using IM and PM respectively.

As previously explained, IM and PM are needed for DEP-GAN (i.e., the non-supervised learning approach of DEP model). We used LOTS-IM with 128 target patches (Rachmadi et al., 2019a) to generate IM from each MRI data. To generate PM, we trained a 2D UResNet (Guerrero et al., 2018) with gold standard WMH and SL masks for WMH and SL segmentation. For this training, we used all subjects in our data set and a 4-fold cross validation training scheme. Thus, out of 304 MRI data (152 subjects × 2 scans), 228 MRI data (114 subjects × 2 scans) were used for training and 76 MRI data (38 subjects × 2 scans) were used for testing in each fold. Note that this UResNet is different from the DEP-UResNet, which is newly proposed in this study: we affix the “DEP” keyword to the name of any model used for prediction and delineation of WMH evolution, whereas the UResNet was previously proposed for WMH and stroke lesions segmentation (Guerrero et al., 2018).

6. Experiment Setup

In this study, we opted to use 2D architectures for all our networks rather than 3D ones. This includes the DEP models (i.e., DEP-GAN and DEP-UResNet) for estimating WMH evolution and UResNet for WMH and stroke lesions segmentation. The main reason for this decision was the limited data available (i.e. only 152 subjects) in this study. VA-GAN (i.e., the GAN scheme used as basis for DEP-GAN) used roughly 4,000 subjects for training its 3D network architecture, yet there was still evidence of over-fitting (Baumgartner et al., 2017). The 2D version of VA-GAN has been previously tested on synthetic data (Baumgartner et al., 2017) and is available from the project’s GitHub page¹.

To train DEP models (i.e., DEP-GAN and DEP-UResNet), 4-fold cross validation was performed. Note that cross validation was not used in the previous study that introduced DEP-GAN (Rachmadi et al., 2019b). In each fold, out of 304 MRI data (152 subjects × 2 scans), 228 MRI data (114 subjects × 2 scans) were used for training and 76 MRI data (38 subjects × 2 scans) were used for testing. Note that DEP models are subject-specific models, so pairwise MRI scans (i.e., baseline and follow-up) are needed and necessary for both training and testing. Out of all slices from the training set in each fold (i.e., 114 pairwise MRI scans), 20% of them were randomly selected for validation. Thus, around 4,000 slices were used in the training process in each fold. Values of IM and PM did not need to be normalised as these are between 0 and 1. Finally, each DEP model was trained for 200 epochs (i.e., 200 generator updates for DEP-GAN).
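The subject-level split described above can be sketched with the standard library. This is an illustrative assumption about how such a split might be produced, not the authors' procedure: the function name `subject_folds`, the shuffling and the seed are hypothetical. Splitting by subject keeps each baseline and follow-up pair in the same partition.

```python
import random

def subject_folds(subject_ids, k=4, seed=0):
    """Return k (train, test) pairs of subject IDs for k-fold cross validation."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)          # fixed seed for reproducibility
    parts = [ids[i::k] for i in range(k)]     # k disjoint, near-equal parts
    splits = []
    for i in range(k):
        test = parts[i]
        train = [s for j, part in enumerate(parts) if j != i for s in part]
        splits.append((train, test))
    return splits
```

With 152 subjects and k = 4, every fold has 114 training and 38 testing subjects, which matches the counts reported in the text.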

In this study, we evaluated and compared the performances of DEP-UResNet, DEP-GAN using IM, and DEP-GAN using PM. Furthermore, we also performed an ablation study to see the effect of an auxiliary input on each DEP model. The procedure of using auxiliary input depends on the input modality and training/testing process. If SL and WMH volumes were used as auxiliary input, these (i.e., not the volumes per slice, but the volumes per subject) were feed-forwarded together with one MRI slice. Thus, all slices from one subject used the same WMH and stroke lesion volumes. Note that WMH and SL loads for the whole data set (i.e., all subjects) were first normalised to zero mean and unit variance before their use in training/testing. If Gaussian noise was used as auxiliary input, an array of Gaussian noise was feed-forwarded together with an MRI slice in the training process as follows: 10 different sets of Gaussian noise were first generated and only the “best” set (i.e., the set that yielded the lowest M* loss (Equation 1)) was used to update the DEP model’s parameters. Note that this approach is similar to and inspired by Min-of-N loss in 3D object reconstruction (Fan et al., 2017) and variety loss in Social GAN (Gupta et al., 2018). In the testing process, 10 different sets of Gaussian noise were also generated first and only the “best” prediction of WMH evolution based on Dice similarity coefficient (DSC) was used in evaluation.
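The two auxiliary-input procedures described above (normalising the lesion loads, and keeping the best of 10 Gaussian noise sets) can be sketched as follows. This is an illustration under stated assumptions: the names `standardise` and `best_of_n` are hypothetical, `predict` and `loss` stand in for the DEP model and its loss, and the noise is drawn from a standard normal distribution, which is an assumption of this sketch.

```python
import numpy as np

def standardise(values):
    """Normalise per-subject loads to zero mean and unit variance over the data set."""
    v = np.asarray(values, dtype=float)
    return (v - v.mean()) / v.std()

def best_of_n(predict, loss, x0, n_sets=10, noise_len=32, rng=None):
    """Draw n_sets noise arrays and keep the prediction with the lowest loss."""
    if rng is None:
        rng = np.random.default_rng()
    best = None
    for _ in range(n_sets):
        z = rng.standard_normal(noise_len)   # one set of 32 Gaussian noise values
        pred = predict(x0, z)
        value = loss(pred)
        if best is None or value < best[0]:
            best = (value, pred, z)
    return best                              # (loss value, prediction, noise)
```

During training the loss would be the generator loss, so only the best noise set contributes to the parameter update; at test time the same helper can be used with the negative DSC as `loss`, so that the prediction with the highest DSC is kept.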

7. Evaluation Metrics

In this study, we used four tests to assess the performance of DEP models:

Prediction error of WMH volumetric change (i.e., whether WMH volume in a subject will increase or decrease).
Volumetric agreement between ground truth and predicted WMH volumes of the follow-up assessment using Bland-Altman plot (Bland and Altman, 1986).
Spatial agreement of the automatic map of WMH evolution in a patient (i.e. after binarisation) using Dice similarity coefficient (DSC) (Dice, 1945).
Clinical plausibility test of the outcome of DEP models in relation to baseline WMH load and clinical risk factors of WMH evolution suggested in clinical studies.

Prediction error is a simple metric to assess how well a DEP model can predict the direction of WMH evolution at the follow-up assessment (i.e., increasing or decreasing). Volumetric agreement using the Bland-Altman plot, on the other hand, presents the mean volumetric difference and the upper/lower limits of agreement (i.e., mean ± 1.96 × standard deviation) between ground truth and predicted WMH volumes of the follow-up assessment. Finally, the Dice similarity coefficient (DSC) measures the spatial agreement between ground truth and automatic delineation results, where a higher DSC means better performance. The DSC can be computed as follows:

$$\mathrm{DSC} = \frac{2 \times TP}{FP + 2 \times TP + FN}$$

where TP is true positive, FP is false positive and FN is false negative.
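As a minimal sketch, the DSC and the Bland-Altman quantities described above could be computed as below. The function names are hypothetical. Two conventions are assumptions rather than details from the study: the difference is taken as ground truth minus prediction, and two empty masks score a DSC of 1.

```python
import numpy as np

def dice(pred_mask, truth_mask):
    """Dice similarity coefficient from voxel-wise TP, FP and FN counts."""
    pred = np.asarray(pred_mask, dtype=bool)
    truth = np.asarray(truth_mask, dtype=bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks agree perfectly.
    return float(2 * tp / denom) if denom > 0 else 1.0

def bland_altman(gt_volumes, pred_volumes):
    """Return (bias, lower LoA, upper LoA) with LoA = bias -/+ 1.96 * SD."""
    diff = np.asarray(gt_volumes, dtype=float) - np.asarray(pred_volumes, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# One TP, one FP, one FN: DSC = 2 / (2 + 1 + 1) = 0.5
example_dsc = dice([1, 1, 0, 0], [1, 0, 1, 0])

# Made-up follow-up volumes in ml for three subjects.
bias, lower, upper = bland_altman([10.0, 12.0, 14.0], [9.0, 12.0, 16.0])
```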

In addition, we performed a clinical plausibility test, which evaluates the outcome of DEP models in relation to the baseline WMH load and clinical risk factors of WMH change and evolution suggested in clinical studies. For this, analyses of covariance (ANCOVA) were performed as follows:

The WMH volume at follow-up, predicted by each of the schemes evaluated, was used as the outcome variable.
The baseline WMH volume was the independent variable or predictor.
After running Belsley collinearity diagnostic tests, the covariates in the models were: 1) type of stroke (i.e. lacunar or cortical), 2) basal ganglia perivascular spaces (BG PVS) score, 3) presence/absence of diabetes, 4) presence/absence of hypertension, 5) recent or current smoker status (yes/no), 6) volume of the index stroke lesion (abbreviated as “index SL”), and 7) volume of old stroke lesions (abbreviated as “Old SL”).

The outcome from an ANCOVA model using the baseline and follow-up WMH volumes of the gold-standard expert-delineated binary masks was used as reference to compare the outcome of the ANCOVA models that used the volumes generated by thresholding the input and output of the DEP models. All volumetric measurements involved in the ANCOVA models were previously adjusted by patient’s head size. Therefore, all ANCOVA models used the percentage of these volumetric measurements in ICV rather than the raw volumes.
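The structure of such a model can be illustrated with an ordinary least-squares fit, which yields the coefficient estimates B but not the p-values that a statistics package would report. Everything in this sketch is an assumption: the data are synthetic, only two of the seven covariates are included, and the 0/1 coding of the categorical variables is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 152  # number of subjects in the study; all values below are synthetic

icv_ml = rng.normal(1450.0, 120.0, n)        # intracranial volume
baseline_ml = rng.gamma(2.0, 8.0, n)         # baseline WMH volume
followup_ml = 1.05 * baseline_ml + rng.normal(0.0, 2.0, n)
stroke_type = rng.integers(0, 2, n)          # assumed coding: 0 lacunar, 1 cortical
hypertension = rng.integers(0, 2, n)         # assumed coding: 0 absent, 1 present

def percent_of_icv(volume_ml, icv):
    """Express a volume as a percentage of intracranial volume (head-size adjustment)."""
    return 100.0 * volume_ml / icv

# Outcome: follow-up WMH volume (% ICV). Predictor: baseline WMH volume (% ICV).
y = percent_of_icv(followup_ml, icv_ml)
X = np.column_stack([
    np.ones(n),                              # intercept
    percent_of_icv(baseline_ml, icv_ml),     # predictor
    stroke_type,                             # covariate
    hypertension,                            # covariate
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept", "baseline_wmh", "stroke_type", "hypertension"], coef)))
```

Fitting the same design once with reference volumes and once with DEP-predicted volumes as the outcome would allow the two sets of coefficients to be compared side by side, in the spirit of the comparison described above.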

8. Results and Discussion

8.1. Comparison against the Gold Standard WMH Labels

Table 2 shows the results of predicting and estimating WMH volumetric changes in the sample against the WMH labels generated from expert-delineated WMH binary masks (i.e., considered the gold standard reference). Table 3 shows the results of evaluating spatial coincidence using Dice similarity coefficient (DSC), including regions of shrinkage, growth, and stability, achieved by the automatic boundary delineation of WMH evolution using different DEP models.

Table 2:

Prediction error of WMH change and volumetric agreement of WMH volume compared to the gold standard expert-delineated WMH masks (i.e., three-class DEM labels). “Vol.” stands for volumetric, “LoA” stands for limit of agreement, “gr” and “sh” stand for the number of subjects that have increasing and decreasing WMH volume (i.e., 98 and 54 respectively), and “G” and “S” stand for the percentage of subjects correctly predicted as having growing and shrinking WMH by DEP models. Thus, G = p_gr/gr and S = p_sh/sh, where “p_gr” and “p_sh” stand for the number of subjects correctly predicted as having growing and shrinking WMH. The best value for each machine learning approach and evaluation metric is written in bold. Furthermore, the best value across all machine learning approaches for each evaluation metric is underlined and written in bold.

Table 3:

Results (Dice similarity coefficient (DSC)) of the automatic estimation of the spatial changes of WMH clusters produced by DEP models compared to the gold standard expert-delineated WMH masks (i.e., three-class DEM label). All values are the mean DSC, calculated from all subjects. The best value for each machine learning approach and category is written in bold. Furthermore, the best value across all machine learning approaches for each category is underlined and written in bold.

As Table 2 shows, DEP-UResNet with WMH and stroke lesion (SL) volumes as auxiliary inputs had the best average performance for prediction of WMH volumetric changes in the whole sample (i.e., average rate 77.76%). However, the patients that experienced overall growth were better predicted by DEP-UResNet with Gaussian noise as auxiliary input (81.63%), and the patients that experienced shrinkage were better predicted by DEP-GAN using probability map (PM) and Gaussian noise as inputs (88.89%).
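The G and S rates from the caption of Table 2 can be sketched as below. The function name and the sign convention (follow-up minus baseline volume) are assumptions. The counts of 80 and 48 correctly predicted subjects are back-calculated here from the reported 81.63% of 98 and 88.89% of 54; they are not figures quoted in the text, and the example combines the best G and best S from two different models purely to check the arithmetic.

```python
import numpy as np

def growth_shrink_rates(true_delta, pred_delta):
    """Return (G, S) in percent: the share of truly growing (shrinking) subjects
    whose predicted volume change has the same sign."""
    true_delta = np.asarray(true_delta, dtype=float)
    pred_delta = np.asarray(pred_delta, dtype=float)
    grow = true_delta > 0
    shrink = true_delta < 0
    g = 100.0 * np.mean(pred_delta[grow] > 0) if grow.any() else float("nan")
    s = 100.0 * np.mean(pred_delta[shrink] < 0) if shrink.any() else float("nan")
    return float(g), float(s)

# 98 growing and 54 shrinking subjects, as in the study sample.
true_delta = [1.0] * 98 + [-1.0] * 54
# Illustrative predictions: 80 of 98 growers and 48 of 54 shrinkers are correct.
pred_delta = [1.0] * 80 + [-1.0] * 18 + [-1.0] * 48 + [1.0] * 6

g_rate, s_rate = growth_shrink_rates(true_delta, pred_delta)
print(round(g_rate, 2), round(s_rate, 2))  # 81.63 88.89
```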

With respect to the volumetric agreement, DEP-GAN using irregularity map (IM) and Gaussian noise as inputs produced the best estimation of WMH volumetric changes with 0.4420 ml mean difference with respect to the gold standard, followed by DEP-UResNet with Gaussian noise (i.e., −0.7926 ml) and DEP-UResNet with WMH and SL loads (i.e., 0.8114 ml). Interestingly, DEP-GAN using PM, which seemingly had some of the worst agreements of volumetric change estimation, had better lower and upper limits of agreement (LoA) than the other models. Furthermore, DEP-GAN using PM and Gaussian noise as inputs was the only scheme that showed plausible results with respect to the clinical parameters evaluated (see Table 6 and Section 8.4 for further explanation). From the Bland-Altman plots (Figure 5), the volumetric agreement of DEP-GAN using PM is more similar to that of DEP-UResNet than to that of DEP-GAN using IM. Observe that the black dashed lines depicting the LoA in the plots of DEP-GAN using PM are closer to 0 and more similar to those of the DEP-UResNet schemes.

Figure 5:

Volumetric agreement analysis (in ml) between ground truth (GT) and predicted volume of WMH (Pred) using Bland-Altman plot. Black figures correspond to results where GT is the gold standard ground truth (i.e., binary WMH label (LBL)) presented in Table 2, whereas red figures correspond to results where GT is the WMH segmentation from the original input modality (i.e., irregularity map (IM) or probability map (PM)) presented in Table 5. Solid lines correspond to “Vol. Bias” in Tables 2 and 5, and dashed lines correspond to either “Lower LoA” or “Upper LoA” of the same tables. “LoA” stands for limit of agreement.

On the automatic delineation of WMH change’s boundaries in the follow-up year, DEP-UResNet using Gaussian noise and DEP-GAN using PM and Gaussian noise produced the best mean DSC performances for the entire WMH: 0.6162 and 0.6155 respectively (see the second column of Table 3). DEP-UResNet with Gaussian noise also outperformed the rest of the models on average (i.e., average DSC in delineation of shrinking, growing, and stable WMH clusters) with 0.3200 (i.e., right-hand side column of Table 3). In general, DEP-GAN, especially DEP-GAN using IM (unsupervised learning), performed worse than DEP-UResNet (Table 3). Only the (indirectly supervised) DEP-GAN using PM could compete with the (supervised) DEP-UResNet scheme, especially when Gaussian noise was used as auxiliary input. Thus, we conducted Wilcoxon and Kruskal-Wallis tests to evaluate whether the medians and distributions of DSC scores produced by both methods (i.e., DEP-UResNet using Gaussian noise and DEP-GAN using PM and Gaussian noise) were significantly different from each other. The results are listed in Table 4. From Table 4 we can see that the performances of DEP-UResNet with Gaussian noise and DEP-GAN using PM with Gaussian noise did not differ from each other in the estimation of the whole volume of WMH at follow-up, in the delineation of the stable and shrinking WMH clusters, and in the average spatial estimation of shrinking, growing, and stable WMH clusters (p-value > 0.05). However, they differed in estimating the WMH clusters that experienced growth, which is influential in the overall estimation of change. It is worth noting that the regions that grew and shrank are considerably smaller than those unchanged, and that when stroke lesions coalesce with WMH it is very difficult to discern the borders between them. As previously explained, stroke lesions were removed from the analysis. Nevertheless, inaccuracies in the ground truth while determining the borders between coalescent WMH and stroke lesions, and the small size of the volume changes in each WMH cluster (Rachmadi et al., 2018a), might have contributed to the low DSC values obtained in the regions that experienced change, as seen in Table 3.
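A comparison of this kind could be run with SciPy as sketched below. The per-subject DSC values are synthetic, and the choice of the paired signed-rank form of the Wilcoxon test is an assumption, since the text does not state whether the signed-rank or rank-sum variant was used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic per-subject DSC scores for two models evaluated on the same 152 subjects.
dsc_model_a = rng.uniform(0.3, 0.9, 152)
dsc_model_b = dsc_model_a + rng.normal(0.0, 0.02, 152)

# Paired comparison of medians (assumed variant: Wilcoxon signed-rank).
wilcoxon_result = stats.wilcoxon(dsc_model_a, dsc_model_b)

# Comparison of the two score distributions (Kruskal-Wallis H-test).
kruskal_result = stats.kruskal(dsc_model_a, dsc_model_b)

print("Wilcoxon p-value:", wilcoxon_result.pvalue)
print("Kruskal-Wallis p-value:", kruskal_result.pvalue)
```

A p-value above 0.05 in both tests would correspond to the "did not differ" reading applied to Table 4 above.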

Table 4:

Results of Wilcoxon and Kruskal-Wallis tests between DEP-GAN using PM with Gaussian noise and DEP-UResNet with Gaussian noise based on the Dice similarity coefficient (DSC) listed in Table 5. Note that we only tested the results of automatic delineation of WMH dynamic changes. SS, MS, and df refer to the sum of squares, mean squares, and degrees of freedom respectively.

8.2. Ablation Study of Auxiliary Input in DEP Models

From the ablation study of auxiliary inputs, we can see that auxiliary inputs improved the performances of DEP models, especially Gaussian noise in DEP-GAN (see Tables 2 and 3). DEP-GAN with Gaussian noise generally produced better performances than the other auxiliary input configurations (i.e., no auxiliary input, WMH load, and both WMH and stroke lesion loads) with either input modality (i.e., IM or PM) in almost all evaluations (i.e., prediction and estimation of WMH volumetric changes and delineation of dynamic WMH evolution). Moreover, DEP-UResNet with Gaussian noise performed the best on automatic estimation of WMH volumetric changes and delineation of dynamic WMH evolution. However, Gaussian noise failed to improve the performance of DEP-UResNet on the prediction of WMH volumetric changes (columns 3 and 4 of Table 2).

Furthermore, we can also see that DEP models with auxiliary input, either Gaussian noise or known risk factors of WMH evolution (i.e., WMH and SL loads), produced better performances than the DEP models without any auxiliary input. These results show the importance of auxiliary input, especially Gaussian noise, which simulates the non-deterministic nature of WMH evolution.

8.3. Comparison to WMH Label from Original Input Modality (PM or IM)

From Table 3, we can see that DEP-GAN using IM did not perform well on automatic spatial delineation of the WMH clusters for the estimated follow-up visit. We believe this happened because DEP-GAN had to regress IM values for the whole brain tissue. In contrast, both DEP-UResNet and DEP-GAN using PM performed delineation and regression only around regions of WMH based on three-class DEM labels and PM’s values respectively.

To corroborate this hypothesis, we performed another experiment where we compared the performances of DEP models to the WMH volume and labels produced from the original input modality (i.e., IM/PM-LBL-DEM). In other words, we wanted to see how well DEP-GAN learns the available information (i.e., IM/PM values and their changes) based on the original modality of either IM or PM. The results are listed in Table 5, and the volumetric agreement can be seen in the Bland-Altman plots depicted in red in Figure 5. Note that only results from DEP-GAN that uses IM or PM were reassessed in this experiment.

Table 5:

Results (mean Dice similarity coefficient (DSC)) on the automatic estimation of WMH volume (in ml) and delineation of dynamic WMH evolution produced by DEP-GAN compared to the WMH segmentation obtained from the input modality (i.e., IM or PM). Thus, unsupervised DEP-GAN using irregularity map (IM) is compared to automatic WMH segmentation from IM by cutting off its value at 0.178, and indirectly supervised DEP-GAN using probability map (PM) is compared to automatic WMH segmentation from PM by cutting off its value at 0.5. The best value for each machine learning approach and category is written in bold.
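The thresholding step in the caption above, followed by the derivation of shrinking, growing and stable regions from the two binary masks, can be sketched as follows. The cut-off values come from the caption; the integer label encoding, the function name and the use of `>=` at the cut-off are assumptions made for illustration.

```python
import numpy as np

IM_THRESHOLD = 0.178  # cut-off for irregularity maps (from the caption)
PM_THRESHOLD = 0.5    # cut-off for probability maps (from the caption)

# Hypothetical label encoding; 0 is kept for non-WMH voxels.
SHRINK, GROW, STABLE = 1, 2, 3

def three_class_dem(baseline_map, followup_map, threshold):
    """Binarise baseline and follow-up maps, then label each voxel as
    shrinking (WMH at baseline only), growing (WMH at follow-up only)
    or stable (WMH at both time points)."""
    base = np.asarray(baseline_map) >= threshold
    follow = np.asarray(followup_map) >= threshold
    dem = np.zeros(base.shape, dtype=np.uint8)
    dem[base & ~follow] = SHRINK
    dem[~base & follow] = GROW
    dem[base & follow] = STABLE
    return dem

# Tiny made-up probability maps (2 x 2 voxels).
baseline_pm = np.array([[0.9, 0.7], [0.1, 0.2]])
followup_pm = np.array([[0.2, 0.8], [0.6, 0.1]])

dem = three_class_dem(baseline_pm, followup_pm, PM_THRESHOLD)
print(dem.tolist())  # [[1, 3], [2, 0]]
```

Per-class DSC values such as those reported for shrinking, growing and stable clusters can then be computed by comparing `dem == SHRINK`, `dem == GROW` and `dem == STABLE` against the corresponding reference masks.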

From the analysis of volumetric agreement depicted in the Bland-Altman plots of Figure 5, we can see that the estimation of WMH volumetric changes using DEP-GAN and IM/PM agreed better with the WMH volume produced from the original modality of IM/PM (depicted by red dots and lines) than with the WMH volume produced from the gold standard WMH masks (depicted by black dots and lines). In other words, Figure 5 shows that the mean difference of WMH volume (depicted as solid lines) between DEP-GAN’s results and the ones produced from the original modality (in red) is closer to 0 than the mean difference between DEP-GAN’s results and the gold standard masks (in black). Similarly, the lower and upper limits of agreement (LoA) between DEP-GAN’s results and WMH volumes produced from the original modality (depicted as red dashed lines) are closer to 0 than the ones between DEP-GAN’s results and the gold standard masks (depicted as black dashed lines). Note that the exact values of the “volumetric bias” and lower/upper LoA for all DEP models depicted in Figure 5 are listed in Tables 2 and 5.

From this experiment, we can conclude that DEP-GAN learns reasonably well from either IM or PM. However, the information contained in IM is more complex than that in PM, which makes it more challenging to estimate the evolution of WMH in an unsupervised manner. Furthermore, from Table 5, we can see that the auxiliary input of Gaussian noise improved the performance of DEP-GAN. This once again emphasises the importance of Gaussian noise as auxiliary input in the DEP model.

8.4. Clinical Plausibility of DEP Models’ Results

From Table 6, we can see that the expert-delineated binary WMH masks and the WMH maps obtained from thresholding IM or PM (see the second to the fourth rows) all produced the same ANCOVA model results: none of the covariates of the model had an effect on the 1-year WMH volume change, and the numerical results were almost identical in the first two decimal places. Therefore, the use of LOTS-IM and UResNet, generators of the IM and PM respectively, for producing WMH maps in clinical studies of mild to moderate stroke seems plausible.

Table 6:

Results from the ANCOVA models that investigate the effect of several clinical variables (i.e. stroke subtype, stroke-related imaging markers and vascular risk factors) in the WMH volume change from baseline to one year after. The first column at the left hand side refers to the models/methods used to obtain the follow-up WMH volume used in the ANCOVA models as outcome variable. The rest of the columns show the coefficient estimates B and the significance level given by the p-value (i.e. B(p)), for each covariate included in the models.

As discussed in Section 1, baseline WMH volume has been recognised as the main predictor of WMH change over time (Chappell et al., 2017; Wardlaw et al., 2017), although the existence of previous stroke lesions (SL) and hypertension have been acknowledged as contributing factors. However, from the results of the ANCOVA models (Table 6), none of the DEP models that used these (i.e., WMH and/or SL volumes) as auxiliary inputs showed similar performance (i.e., in terms of strength and significance of the effect of all the covariates on the WMH change) to the reference WMH maps. The only DEP model that shows promise in reflecting the effect of the clinical factors selected as covariates in WMH progression was the DEP-GAN that used the PM of baseline WMH and Gaussian noise as input (i.e., underlined in the left-hand side column of Table 6).

Some factors might have adversely influenced the performance of these predictive models. First, all deep-learning schemes require a very large amount of balanced data (e.g., in terms of the appearance, frequency and location of the feature of interest, i.e., WMH in this case), which is generally not available. The lack of available data imposed the use of 2D model configurations, which generated imbalance in the training. For example, not all axial slices have the same probability of WMH occurrence. Also, WMH are known to be less frequent in temporal lobes, and temporal poles are a common site of artefacts affecting the IM and PM, an error that might propagate or even be accentuated when these modalities are used as inputs. Second, the combination of hypertension, age and the extent, type, lapse of time since occurrence and location of the stroke might be influential on the WMH evolution. Therefore, rather than a single value, the incorporation of a model that combines these factors would be beneficial. However, such a model is still to be developed, also due to the lack of available data. Third, the tissue properties have not been considered. A model to reflect the brain tissue properties in combination with vascular and inflammatory risk factors is still to be developed. Lastly, the deep-learning models as we know them, although promising, are reproductive, not creative. The development of more advanced inference systems is paramount before these schemes can be used in clinical practice.

9. Conclusion and Future Work

In this study, we proposed a training scheme to predict the evolution of WMH using deep learning algorithms called Disease Evolution Predictor (DEP) model. To the best of our knowledge, this is the first extensive study on modelling WMH evolution using deep learning algorithms. Furthermore, we evaluated different configurations of DEP models: unsupervised, indirectly supervised, and supervised (i.e., DEP-GAN using irregularity map (IM), DEP-GAN using probability map (PM), and DEP-UResNet) with auxiliary input (i.e., Gaussian noise, WMH load, and WMH and stroke lesions (SL) loads). These configurations were designed and evaluated to find the best approach to automatically predict and delineate the evolution of WMH from a baseline measurement to a follow-up visit.

From our experiments, on average, supervised DEP-UResNet yielded the best results in almost every evaluation metric. However, it did not perform well in the clinical plausibility test. The indirectly supervised DEP-GAN yielded an average performance similar to that of the supervised DEP-UResNet and yielded the best results of all schemes in the clinical plausibility test. Moreover, results from DEP-UResNet (using Gaussian noise) and DEP-GAN (using PM and Gaussian noise) were not statistically different from each other on delineating the WMH clusters, those that were unchanged, and those that shrank. Thus, DEP-GAN using PM and Gaussian noise has the biggest potential to be explored for its improvement, which could lead to its further use in clinical settings.

If we consider the results, time, and resources spent in this study, then DEP-GAN using PM showed the biggest and strongest potential of all DEP models. Not only did it perform similarly to the supervised DEP-UResNet but it also did not need manual WMH labels on two MRI scans for training (i.e., baseline and follow-up scans). The PM needed as input for this model can be efficiently produced by any supervised (deep) machine learning model. Moreover, the development of automatic WMH segmentation for producing better PM could be done separately and independently from the development of the DEP model. If a better PM model is available in the future, then the DEP-GAN model can be retrained using the newly produced PM for better performance. Also, DEP-GAN using PM could be used for other (neuro-degenerative) pathologies, as long as a set of PM from these other pathologies could be produced and used to (re-)train the DEP-GAN.

Based on the ablation study, Gaussian noise successfully improved all DEP models in almost all evaluation metrics when used as auxiliary input. This suggests that there are indeed some unknown factors that influence the evolution of WMH. These unknown factors make the problem of predicting/delineating WMH evolution non-deterministic, and Gaussian noise was proposed to simulate this scenario. The intuition behind this approach is that Gaussian noise fills in the missing (unavailable) risk factors or their combination, which could influence the evolution of WMH. Note that it is very challenging to collect and compile all risk factors of WMH evolution in a longitudinal study.

There are several shortcomings anticipated from the results of this study. Firstly, manual WMH labels of two MRI scans (i.e., baseline and follow-up scans) are necessary for training the DEP-UResNet. In many scenarios, this is neither feasible nor efficient in terms of time and resources. Secondly, the unsupervised DEP-GAN using IM is computationally very demanding as it involves regressing IM values across the whole brain tissue. This resulted in low performances of DEP-GAN using IM in almost all evaluation metrics. Thirdly, the schemes’ performances depend on the quality of the input. For example, the PM generated in this study are slightly biased towards overestimating the WMH in the optic radiation and underestimating WMH in the frontal lobe. This could be caused by not correcting the FLAIR images for B1 magnetic field inhomogeneities. However, a previous study on small vessel disease images demonstrated that this procedure might affect the results by underestimating the subtle white matter abnormalities characteristic of this disease, and recommended this procedure for T1- and T2-weighted structural images but not for FLAIR images in WMH segmentation tasks (Hernández et al., 2016). Hence, the biggest challenge of using DEP-GAN with PM is its high dependency on the quality of the initial PM. Fourthly, volumetric agreement analyses suggest that there are still large differences in absolute volume and in change estimates produced by the proposed DEP models. While this study is intended as a “proof-of-principle” study to advance the field of white matter, and ultimately brain-health, prediction, it is worth mentioning that better reliability in the WMH assessment is necessary so that DEP models can be used in clinical practice. Furthermore, a better understanding of what DEP models extract to estimate WMH evolution would be very useful in clinical practice. Lastly, the limitation of using (Gaussian) random noise in DEP models is the fact that we do not really know which set of Gaussian random noise should be used to generate the best result for each subject. Note that, in this study, all DEP models that used Gaussian noise as auxiliary input were tested 10 times to find the set of Gaussian noise that produced the best automatic delineation of WMH evolution overall. In conclusion, DEP models suffer from problems and limitations similar to those of any machine learning based medical image analysis method.

However, the DEP models proposed in this study open up several possible future avenues to further improve their performances. Firstly, multi-channel (e.g., PM and T2-FLAIR) input could be used instead of single-channel input. In this study, we only used a single channel to draw a fair comparison between DEP-UResNet, which uses T2-FLAIR, and DEP-GAN, which uses either IM or PM. Secondly, a 3D architecture of DEP-GAN could be employed when more subjects are accessible in the future. 3D deep neural networks have been reported to have better performances than the 2D ones, but they are more difficult to train (Çiçek et al., 2016; Baumgartner et al., 2017). Thirdly, Gaussian noise and known risk factors (e.g., WMH and SL loads) could be modulated together instead of modulating them separately in different models. By modulating them together, the DEP model would be influenced by both known (available) risk factors and unknown (missing) factors represented by Gaussian noise. Lastly, a different random noise distribution could be used instead of the Gaussian distribution. Note that each risk factor of WMH evolution (e.g., WMH load, age, and blood pressure) could have a different data distribution, not only a Gaussian distribution. If a specific data distribution (i.e., the same as or similar to the real risk factor’s data distribution) could be used for a specific risk factor, then the real data could replace the random noise if available at testing time.

Acknowledgements

The first author (MFR) would like to thank Indonesia Endowment Fund for Education (LPDP) of Ministry of Finance, Republic of Indonesia, for funding his study at School of Informatics, the University of Edinburgh. Funds from Row Fogo Charitable Trust (Grant No. BRO-D.FID3668413) (MCVH) are also gratefully acknowledged. The primary study that provided data for this study was funded by the Wellcome Trust (Ref No. 088134/Z/0). Authors also thank support from the European Union Horizon 2020 (PHC-03-15, project No 666881, ‘SVDs@Target’), Fondation Leducq (CVD 16/05), and the UK Dementia Research Institute at the University of Edinburgh.

References

Baumgartner, C. F., Koch, L. M., Tezcan, K. C., Ang, J. X., Konukoglu, E., 2017. Visual feature attribution using Wasserstein GANs. In: Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit.
Bland, J. M., Altman, D., 1986. Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet 327 (8476), 307–310.
Chappell, F. M., del Carmen Valdés Hernández, M., Makin, S. D., Shuler, K., Sakka, E., Dennis, M. S., Armitage, P. A., Muñoz Maniega, S., Wardlaw, J. M., 2017. Sample size considerations for trials using cerebral white matter hyperintensity progression as an intermediate outcome at 1 year after mild stroke: Results of a prospective cohort study. Trials 18 (1), 1–10.
Cho, A.-H., Kim, H.-R., Kim, W., Yang, D. W., 2015. White Matter Hyperintensity in Ischemic Stroke Patients: It May Regress Over Time. Journal of Stroke 17 (1), 60.
Choi, H., Jin, K. H., Alzheimer’s Disease Neuroimaging Initiative, et al., 2018. Predicting cognitive decline with deep learning of brain metabolism and amyloid imaging. Behavioural Brain Research 344, 103–109.
Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T., Ronneberger, O., 2016. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp. 424–432.
Dice, L. R., 1945. Measures of the amount of ecologic association between species. Ecology 26 (3), 297–302.
Durand-Birchenall, J., Leclercq, C., Daouk, J., Monet, P., Godefroy, O., Bugnicourt, J.-M., 2012. Attenuation of brain white matter lesions after lacunar stroke. International journal of preventive medicine 3 (2), 134–138.
Fan, H., Su, H., Guibas, L. J., 2017. A point set generation network for 3D object reconstruction from a single image. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 605–613.
Godin, O., Tzourio, C., Maillard, P., Alperovitch, A., Mazoyer, B., Dufouil, C., 2009. Apolipoprotein E genotype is related to progression of white matter lesion load. Stroke 40 (10), 3186–3190.
Godin, O., Tzourio, C., Maillard, P., Mazoyer, B., Dufouil, C., 2011. Antihypertensive Treatment and Change in Blood Pressure Are Associated With the Progression of White Matter Lesion Volumes. Circulation 123 (3), 266–273.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. In: Advances in neural information processing systems. pp. 2672–2680.
Gouw, A. A., van der Flier, W. M., Fazekas, F., van Straaten, E. C., Pantoni, L., Poggesi, A., Inzitari, D., Erkinjuntti, T., Wahlund, L. O., Waldemar, G., Schmidt, R., Scheltens, P., Barkhof, F., 2008a. Progression of White Matter Hyperintensities and Incidence of New Lacunes Over a 3-Year Period. Stroke 39 (5), 1414–1420.
Gouw, A. A., Van Der Flier, W. M., Van Straaten, E. C., Pantoni, L., Bastos-Leite, A. J., Inzitari, D., Erkinjuntti, T., Wahlund, L. O., Ryberg, C., Schmidt, R., Fazekas, F., Scheltens, P., Barkhof, F., 2008b. Reliability and sensitivity of visual scales versus volumetry for evaluating white matter hyperintensity progression. Cerebrovascular Diseases 25 (3), 247–253.
Guerrero, R., Qin, C., Oktay, O., Bowles, C., Chen, L., Joules, R., Wolz, R., Valdés-Hernández, M. d. C., Dickie, D., Wardlaw, J., et al., 2018. White matter hyperintensity and stroke lesion segmentation and differentiation using convolutional neural networks. NeuroImage: Clinical 17, 918–934.
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A. C., 2017. Improved training of Wasserstein GANs. In: Advances in Neural Information Processing Systems. pp. 5767–5777.
Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A., 2018. Social GAN: Socially acceptable trajectories with generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2255–2264.
Hernández, M. d. C. V., González-Castro, V., Ghandour, D. T., Wang, X., Doubal, F., Maniega, S. M., Armitage, P. A., Wardlaw, J. M., 2016. On the computational assessment of white matter hyperintensity progression: difficulties in method selection and bias field correction performance on images with significant white matter pathology. Neuroradiology 58 (5), 475–485.
Jenkinson, M., Bannister, P., Brady, M., Smith, S., 2002. Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage 17 (2), 825–841.
↵
Jiaerken, Y., Luo, X., Yu, X., Huang, P., Xu, X., Zhang, M., 2018. Microstructural and metabolic changes in the longitudinal progression of white matter hyperintensities.
↵
Luo, X., Jiaerken, Y., Yu, X., Huang, P., Qiu, T., Jia, Y., Li, K., Xu, X., Shen, Z., Guan, X., Zhou, J., Zhang, M., Adni, F. T. A. D. N. I., 5 2017. Associations between APOE genotype and cerebral small-vessel disease: a longitudinal study. Oncotarget 8 (27), 44477–44489.
OpenUrl
↵
Maillard, P., Crivello, F., Dufouil, C., Tzourio-Mazoyer, N., Tzourio, C., Mazoyer, B., 2009. Longitudinal follow-up of individual white matter hyperintensities in a large cohort of elderly. Neuroradiology 51 (4), 209–220.
OpenUrl CrossRef PubMed
↵
Maillard, P., Fletcher, E., Harvey, D., Carmichael, O., Reed, B., Mungas, D., Decarli, C., 2011. White matter hyperintensity penumbra. Stroke 42 (7), 1917–1922.
OpenUrl Abstract/FREE Full Text
↵
Maillard, P., Fletcher, E., Lockhart, S. N., Roach, A. E., Reed, B., Mungas, D., Decarli, C., Carmichael, O. T., 2014. White matter hyperintensities and their penumbra lie along a continuum of injury in the aging brain. Stroke 45 (6), 1721–1726.
OpenUrl Abstract/FREE Full Text
↵
Mínguez, B., Rovira, A., Alonso, J., Córdoba, J., 2007. Decrease in the volume of white matter lesions with improvement of hepatic encephalopathy. American Journal of Neuroradiology 28 (8), 1499–1500.
OpenUrl Abstract/FREE Full Text
↵
Moriya, Y., Kozaki, K., Nagai, K., Toba, K., 2009. Attenuation of brain white matter hyperintensities after cerebral infarction. American Journal of Neuroradiology 30 (3), 3174.
OpenUrl
↵
Pasi, M., Van Uden, I. W., Tuladhar, A. M., De Leeuw, F. E., Pantoni, L., 2016. White Matter Microstructural Damage on Diffusion Tensor Imaging in Cerebral Small Vessel Disease: Clinical Consequences. Stroke 47 (6), 1679–1684.
Perez, E., Strub, F., De Vries, H., Dumoulin, V., Courville, A., 2018. FiLM: Visual reasoning with a general conditioning layer. In: Thirty-Second AAAI Conference on Artificial Intelligence.
Power, M. C., Deal, J. A., Sharrett, A. R., Jack Jr, C. R., Knopman, D., Mosley, T. H., Gottesman, R. F., 2015. Smoking and white matter hyperintensity progression: The ARIC-MRI Study. Neurology 84 (8), 841–848.
Prins, N. D., Scheltens, P., 2015. White matter hyperintensities, cognitive impairment and dementia: an update. Nature reviews. Neurology 11 (3), 157–65.
Prins, N. D., van Straaten, E. C. W., van Dijk, E. J., Simoni, M., van Schijndel, R. A., Vrooman, H. A., Koudstaal, P. J., Scheltens, P., Breteler, M. M. B., Barkhof, F., 2004. Measuring progression of cerebral white matter lesions on MRI. Neurology 62 (9), 1533–1539. URL http://n.neurology.org/content/62/9/1533.abstract
Rachmadi, M., Valdés-Hernández, M., Agan, M., Di Perri, C., Komura, T., 2018a. Segmentation of white matter hyperintensities using convolutional neural networks with global spatial information in routine clinical brain MRI with none or mild vascular pathology. Computerized Medical Imaging and Graphics 66.
Rachmadi, M., Valdés-Hernández, M., Agan, M., Komura, T., 2017. Deep learning vs. conventional machine learning: pilot study of WMH segmentation in brain MRI with absence or mild vascular pathology. Journal of Imaging 3 (4), 66.
Rachmadi, M. F., Valdés-Hernández, M. d. C., Komura, T., 2018b. Automatic irregular texture detection in brain MRI without human supervision. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 506–513.
Rachmadi, M. F., Valdés-Hernández, M. d. C., Li, H., Guerrero, R., Meijboom, R., Wiseman, S., Waldman, A., Zhang, J., Rueckert, D., Wardlaw, J., et al., 2019a. Limited One-time Sampling Irregularity Map (LOTS-IM) for Automatic Unsupervised Assessment of White Matter Hyperintensities and Multiple Sclerosis Lesions in Structural Brain Magnetic Resonance Images. BioRxiv, 334292.
Rachmadi, M. F., Valdés-Hernández, M. d. C., Makin, S., Wardlaw, J. M., Komura, T., 2019b. Predicting the evolution of white matter hyperintensities in brain MRI using generative adversarial networks and irregularity map. bioRxiv, 662692.
Ramirez, J., McNeely, A. A., Berezuk, C., Gao, F., Black, S. E., 2016. Dynamic progression of white matter hyperintensities in Alzheimer’s disease and normal aging: Results from the Sunnybrook dementia study. Frontiers in Aging Neuroscience 8 (MAR), 1–9.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. Springer, pp. 234–241.
Rovira Cañellas, A., Mínguez, B., Aymerich, F. X., Jacas, C., Huerga, E., Córdoba, J., Alonso, J., 2007. Decreased white matter lesion volume and improved cognitive function after liver transplantation. Hepatology 46 (5), 1485–1490.
Sachdev, P., Wen, W., Chen, X., Brodaty, H., 2007. Progression of white matter hyperintensities in elderly individuals over 3 years. Neurology 68 (3), 214–222.
Schmidt, H., Zeginigg, M., Wiltgen, M., Freudenberger, P., Petrovic, K., Cavalieri, M., Gider, P., Enzinger, C., Fornage, M., Debette, S., Rotter, J. I., Ikram, M. A., Launer, J., Schmidt, R., 2011. Genetic variants of the NOTCH3 gene in the elderly and magnetic resonance imaging correlates of age-related cerebral small vessel disease. Brain 134 (11), 3384–3397.
Schmidt, R., Enzinger, C., Ropele, S., Schmidt, H., Fazekas, F., 2003. Progression of cerebral white matter lesions: 6-Year results of the Austrian Stroke Prevention Study. Lancet 361 (9374), 2046–2048.
Schmidt, R., Fazekas, F., Enzinger, C., Ropele, S., Kapeller, P., Schmidt, H., 2002a. Risk factors and progression of small vessel disease-related cerebral abnormalities. In: Ageing and Dementia Current and Future Concepts. Springer, pp. 47–52.
Schmidt, R., Fazekas, F., Kapeller, P., Schmidt, H., Hartung, H.-P., 1999. MRI white matter hyperintensities: Three-year follow-up of the Austrian Stroke Prevention Study. Neurology 53 (1), 132–132.
Schmidt, R., Ropele, S., Enzinger, C., Petrovic, K., Smith, S., Schmidt, H., Matthews, P. M., Fazekas, F., 2005. White matter lesion progression, brain atrophy, and cognitive decline: The Austrian stroke prevention study. Annals of Neurology 58 (4), 610–616.
Schmidt, R., Schmidt, H., Kapeller, P., Lechner, A., Fazekas, F., 2002b. Evolution of white matter lesions. Cerebrovascular Diseases 13 (SUPPL. 2), 16–20.
Schmidt, R., Seiler, S., Loitfelder, M., 2016. Longitudinal change of small-vessel disease-related brain abnormalities. Journal of Cerebral Blood Flow and Metabolism 36 (1), 26–39.
Spasov, S., Passamonti, L., Duggento, A., Lio, P., Toschi, N., Alzheimer’s Disease Neuroimaging Initiative, et al., 2019. A parameter-efficient deep learning approach to predict conversion from mild cognitive impairment to Alzheimer’s disease. Neuroimage 189, 276–287.
Valdés Hernández, M. d. C., Armitage, P. A., Thrippleton, M. J., Chappell, F., Sandeman, E., Muñoz Maniega, S., Shuler, K., Wardlaw, J. M., 2015. Rationale, design and methodology of the image analysis protocol for studies of patients with cerebral small vessel disease and mild stroke. Brain and behavior 5 (12), e00415.
van Dijk, E. J., Prins, N. D., Vrooman, H. A., Hofman, A., Koudstaal, P. J., Breteler, M. M. B., 2008. Progression of cerebral small vessel disease in relation to risk factors and cognitive consequences: Rotterdam scan study. Stroke 39 (10), 2712–2719.
van Leijsen, E. M., Bergkamp, M. I., van Uden, I. W., Cooijmans, S., Ghafoorian, M., van der Holst, H. M., Norris, D. G., Kessels, R. P., Platel, B., Tuladhar, A. M., de Leeuw, F. E., 2019. Cognitive consequences of regression of cerebral small vessel disease. European Stroke Journal 4 (1), 85–89.
van Leijsen, E. M., de Leeuw, F.-E., Tuladhar, A. M., 2017a. Disease progression and regression in sporadic small vessel disease: insights from neuroimaging. Clinical Science 131 (12), 1191–1206.
van Leijsen, E. M., van Uden, I. W., Ghafoorian, M., Bergkamp, M. I., Lohner, V., Kooijmans, E. C., Van Der Holst, H. M., Tuladhar, A. M., Norris, D. G., van Dijk, E. J., Rutten-Jacobs, L. C., Platel, B., Klijn, C. J., De Leeuw, F. E., 2017b. Nonlinear temporal dynamics of cerebral small vessel disease. Neurology 89 (15), 1569–1577.
Veldink, J. H., Scheltens, P., Jonker, C., Launer, L. J., 1998. Progression of cerebral white matter hyperintensities on MRI is related to diastolic blood pressure. Neurology 51 (1), 319–320.
Verhaaren, B. F. J., Vernooij, M. W., De Boer, R., Hofman, A., Niessen, W. J., Van Der Lugt, A., Ikram, M. A., 2013. High Blood Pressure and Cerebral White Matter Lesion Progression in the General Population. Hypertension 61 (6), 1354–1359.
Wardlaw, J. M., Chappell, F. M., Valdés Hernández, M. D. C., Makin, S. D., Staals, J., Shuler, K., Thrippleton, M. J., Armitage, P. A., Muñoz Maniega, S., Heye, A. K., Sakka, E., Dennis, M. S., 2017. White matter hyperintensity reduction and outcomes after minor stroke. Neurology 89 (10), 1003–1010.
Wardlaw, J. M., Smith, E. E., Biessels, G. J., Cordonnier, C., Fazekas, F., Frayne, R., Lindley, R. I., O’Brien, J. T., Barkhof, F., Benavente, O. R., Black, S. E., Brayne, C., Breteler, M., Chabriat, H., Decarli, C., de Leeuw, F.-E., Doubal, F., Duering, M., Fox, N. C., Greenberg, S., Hachinski, V., Kilimann, I., Mok, V., Oostenbrugge, R. v., Pantoni, L., Speck, O., Stephan, B. C. M., Teipel, S., Viswanathan, A., Werring, D., Chen, C., Smith, C., van Buchem, M., Norrving, B., Gorelick, P. B., Dichgans, M., STandards for ReportIng Vascular changes on nEuroimaging (STRIVE v1), 2013. Neuroimaging standards for research into small vessel disease and its contribution to ageing and neurodegeneration. The Lancet. Neurology 12 (8), 822–38.
Yamada, K., Sakai, K., Owada, K., Mineura, K., Nishimura, T., 2010. Cerebral white matter lesions may be partially reversible in patients with carotid artery stenosis. American Journal of Neuroradiology 31 (7), 1350–1352.

Posted August 19, 2019.

Subject Area

Neuroscience

[1] ↵
Baumgartner, C. F., Koch, L. M., Tezcan, K. C., Ang, J. X., Konukoglu, E., 2017. Visual feature attribution using wasserstein gans. In: Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit.

[2] ↵
Bland, J. M., Altman, D., 1986. Statistical methods for assessing agreement between two methods of clinical measurement. The lancet 327 (8476), 307–310.
OpenUrl

[3] ↵
Chappell, F. M., del Carmen Valdés Hernández, M., Makin, S. D., Shuler, K., Sakka, E., Dennis, M. S., Armitage, P. A., Muñoz Maniega, S., Wardlaw, J. M., 2017. Sample size considerations for trials using cerebral white matter hyperintensity progression as an intermediate outcome at 1 year after mild stroke: Results of a prospective cohort study. Trials 18 (1), 1–10.
OpenUrl CrossRef

[4] ↵
Cho, A.-H., Kim, H.-R., Kim, W., Yang, D. W., 2015. White Matter Hyperintensity in Ischemic Stroke Patients: It May Regress Over Time. Journal of Stroke 17 (1), 60.
OpenUrl

[5] ↵
Choi, H., Jin, K. H., Initiative, A. D. N., et al., 2018. Predicting cognitive decline with deep learning of brain metabolism and amyloid imaging. Behavioural brain research 344, 103–109.
OpenUrl CrossRef

[6] ↵
Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T., Ron-neberger, O., 2016. 3d u-net: learning dense volumetric segmentation from sparse annotation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp. 424–432.

[7] ↵
Dice, L. R., 1945. Measures of the amount of ecologic association between species. Ecology 26 (3), 297–302.
OpenUrl CrossRef Web of Science

[8] ↵
Durand-Birchenall, J., Leclercq, C., Daouk, J., Monet, P., Godefroy, O., Bugnicourt, J.-M., 2 2012. Attenuation of brain white matter lesions after lacunar stroke. International journal of preventive medicine 3 (2), 134–138.
OpenUrl

[9] ↵
Fan, H., Su, H., Guibas, L. J., 2017. A point set generation network for 3d object reconstruction from a single image. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 605–613.

[10] ↵
Godin, O., Tzourio, C., Maillard, P., Alperovitch, A., Mazoyer, B., Dufouil, C., 2009. Apolipoprotein E genotype is related to progression of white matter lesion load. Stroke 40 (10), 3186–3190.
OpenUrl Abstract/FREE Full Text

[11] ↵
Godin, O., Tzourio, C., Maillard, P., Mazoyer, B., Dufouil, C., 2011. Antihypertensive Treatment and Change in Blood Pressure Are Associated With the Progression of White Matter Lesion Volumes. Circulation 123 (3), 266–273.
OpenUrl Abstract/FREE Full Text

[12] ↵
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. In: Advances in neural information processing systems. pp. 2672–2680.

[13] ↵
Gouw, A. A., van der Flier, W. M., Fazekas, F., van Straaten, E. C., Pantoni, L., Poggesi, A., Inzitari, D., Erkinjuntti, T., Wahlund, L. O., Waldemar, G., Schmidt, R., Scheltens, P., Barkhof, F., 2008a. Progression of White Matter Hyperintensities and Incidence of New Lacunes Over a 3-Year Period. Stroke 39 (5), 1414–1420.
OpenUrl Abstract/FREE Full Text

[14] ↵
Gouw, A. A., Van Der Flier, W. M., Van Straaten, E. C., Pantoni, L., Bastos-Leite, A. J., Inzitari, D., Erkinjuntti, T., Wahlund, L. O., Ryberg, C., Schmidt, R., Fazekas, F., Scheltens, P., Barkhof, F., 2008b. Reliability and sensitivity of visual scales versus volumetry for evaluating white matter hyperintensity progression. Cerebrovascular Diseases 25 (3), 247–253.
OpenUrl CrossRef PubMed

[15] ↵
Guerrero, R., Qin, C., Oktay, O., Bowles, C., Chen, L., Joules, R., Wolz, R., Valdés-Hernández, M. d. C., Dickie, D., Wardlaw, J., et al., 2018. White matter hyperintensity and stroke lesion segmentation and differentiation using convolutional neural networks. NeuroImage: Clinical 17, 918–934.
OpenUrl

[16] ↵
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A. C., 2017. Improved training of wasserstein gans. In: Advances in Neural Information Processing Systems. pp. 5767–5777.

[17] ↵
Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A., 2018. Social gan: Socially acceptable trajectories with generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2255–2264.

[18] ↵
Hernández, M. d. C. V., González-Castro, V., Ghandour, D. T., Wang, X., Doubal, F., Maniega, S. M., Armitage, P. A., Wardlaw, J. M., 2016. On the computational assessment of white matter hyperintensity progression: difficulties in method selection and bias field correction performance on images with significant white matter pathology. Neuroradiology 58 (5), 475–485.
OpenUrl

[19] ↵
Jenkinson, M., Bannister, P., Brady, M., Smith, S., 2002. Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage 17 (2), 825–841.
OpenUrl CrossRef PubMed Web of Science

[20] ↵
Jiaerken, Y., Luo, X., Yu, X., Huang, P., Xu, X., Zhang, M., 2018. Microstructural and metabolic changes in the longitudinal progression of white matter hyperintensities.

[21] ↵
Luo, X., Jiaerken, Y., Yu, X., Huang, P., Qiu, T., Jia, Y., Li, K., Xu, X., Shen, Z., Guan, X., Zhou, J., Zhang, M., Adni, F. T. A. D. N. I., 5 2017. Associations between APOE genotype and cerebral small-vessel disease: a longitudinal study. Oncotarget 8 (27), 44477–44489.
OpenUrl

[22] ↵
Maillard, P., Crivello, F., Dufouil, C., Tzourio-Mazoyer, N., Tzourio, C., Mazoyer, B., 2009. Longitudinal follow-up of individual white matter hyperintensities in a large cohort of elderly. Neuroradiology 51 (4), 209–220.
OpenUrl CrossRef PubMed

[23] ↵
Maillard, P., Fletcher, E., Harvey, D., Carmichael, O., Reed, B., Mungas, D., Decarli, C., 2011. White matter hyperintensity penumbra. Stroke 42 (7), 1917–1922.
OpenUrl Abstract/FREE Full Text

[24] ↵
Maillard, P., Fletcher, E., Lockhart, S. N., Roach, A. E., Reed, B., Mungas, D., Decarli, C., Carmichael, O. T., 2014. White matter hyperintensities and their penumbra lie along a continuum of injury in the aging brain. Stroke 45 (6), 1721–1726.
OpenUrl Abstract/FREE Full Text

[25] ↵
Mínguez, B., Rovira, A., Alonso, J., Córdoba, J., 2007. Decrease in the volume of white matter lesions with improvement of hepatic encephalopathy. American Journal of Neuroradiology 28 (8), 1499–1500.
OpenUrl Abstract/FREE Full Text

[26] ↵
Moriya, Y., Kozaki, K., Nagai, K., Toba, K., 2009. Attenuation of brain white matter hyperintensities after cerebral infarction. American Journal of Neuroradiology 30 (3), 3174.
OpenUrl

[27] ↵
Pasi, M., Van Uden, I. W., Tuladhar, A. M., De Leeuw, F. E., Pantoni, L., 2016. White Matter Microstructural Damage on Diffusion Tensor Imaging in Cerebral Small Vessel Disease: Clinical Consequences. Stroke 47 (6), 1679–1684.
OpenUrl FREE Full Text

[28] ↵
Perez, E., Strub, F., De Vries, H., Dumoulin, V., Courville, A., 2018. Film: Visual reasoning with a general conditioning layer. In: Thirty-Second AAAI Conference on Artificial Intelligence.

[29] ↵
Power C, M., Deal A, J., Sharrett Richey, A., Jack Jr, C. R., Knopman, D., Mosley H, T., Gottesman F, R., 2015. Smoking and white matter hyperintensity progression: The ARIC-MRI Study. Neurology 84 (8), 841–848.
OpenUrl CrossRef PubMed

[30] ↵
Prins, N. D., Scheltens, P., 2015. White matter hyperintensities, cognitive impairment and dementia: an update. Nature reviews. Neurology 11 (3), 157–65.
OpenUrl

[31] ↵
Prins, N. D., van Straaten, E. C. W., van Dijk, E. J., Simoni, M., van Schijndel, R. A., Vrooman, H. A., Koudstaal, P. J., Scheltens, P., Breteler, M. M. B., Barkhof, F., 5 2004. Measuring progression of cerebral white matter lesions on MRI. Neurology 62 (9), 1533 LP – 1539. URL http://n.neurology.org/content/62/9/1533.abstract
OpenUrl

[32] ↵
Rachmadi, M., Valdés-Hernández, M., Agan, M., Di Perri, C., Komura, T., 2018a. Segmentation of white matter hyperintensities using convolutional neural networks with global spatial information in routine clinical brain MRI with none or mild vascular pathology. Computerized Medical Imaging and Graphics 66.

[33] ↵
Rachmadi, M., Valdés-Hernández, M., Agan, M., Komura, T., 2017. Deep learning vs. conventional machine learning: pilot study of wmh segmentation in brain mri with absence or mild vascular pathology. Journal of Imaging 3 (4), 66.
OpenUrl

[34] ↵
Rachmadi, M. F., Valdés-Hernández, M. d. C., Komura, T., 2018b. Automatic irregular texture detection in brain mri without human supervision. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 506–513.

[35] ↵
Rachmadi, M. F., Valdés-Hernández, M. d. C., Li, H., Guerrero, R., Meijboom, R., Wiseman, S., Waldman, A., Zhang, J., Rueckert, D., Wardlaw, J., et al., 2019a. Limited One-time Sampling Irregularity Map (LOTS-IM) for Automatic Unsupervised Assessment of White Matter Hyperintensities and Multiple Sclerosis Lesions in Structural Brain Magnetic Resonance Images. BioRxiv, 334292.

[36] ↵
Rachmadi, M. F., Valdés-Hernández, M. d. C., Makin, S., Wardlaw, J. M., Komura, T., 2019b. Predicting the evolution of white matter hyperintensities in brain mri using generative adversarial networks and irregularity map. bioRxiv, 662692.

[37] ↵
Ramirez, J., McNeely, A. A., Berezuk, C., Gao, F., Black, S. E., 2016. Dynamic progression of white matter hyperin-tensities in Alzheimer’s disease and normal aging: Results from the Sunnybrook dementia study. Frontiers in Aging Neuroscience 8 (MAR), 1–9.
OpenUrl

[38] ↵
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. Springer, pp. 234–241.

[39] ↵
Rovira Cañellas, A., Mínguez, B., Aymerich, F. X., Jacas, C., Huerga, E., Cóordoba, J., Alonso, J., 2007. Decreased white matter lesion volume and improved cognitive function after liver transplantation. Hepatology 46 (5), 1485–1490.
OpenUrl CrossRef PubMed

[40] ↵
Sachdev, P., Wen, W., Chen, X., Brodaty, H., 2007. Progression of white matter hyperintensities in elderly individuals over 3 years. Neurology 68 (3), 214–222.
OpenUrl CrossRef PubMed

[41] ↵
Schmidt, H., Zeginigg, M., Wiltgen, M., Freudenberger, P., Petrovic, K., Cavalieri, M., Gider, P., Enzinger, C., For-nage, M., Debette, S., Rotter, J. I., Ikram, M. A., Launer, J., Schmidt, R., 2011. Genetic variants of the NOTCH3 gene in the elderly and magnetic resonance imaging correlates of age-related cerebral small vessel disease. Brain 134 (11), 3384–3397.
OpenUrl CrossRef PubMed Web of Science

[42] ↵
Schmidt, R., Enzinger, C., Ropele, S., Schmidt, H., Fazekas, F., 2003. Progression of cerebral white matter lesions: 6-Year results of the Austrian Stroke Prevention Study. Lancet 361 (9374), 2046–2048.
OpenUrl CrossRef PubMed Web of Science

[43] ↵
Schmidt, R., Fazekas, F., Enzinger, C., Ropele, S., Kapeller, P., Schmidt, H., 2002a. Risk factors and progression of small vessel disease-related cerebral abnormalities. In: Ageing and Dementia Current and Future Concepts. Springer, pp. 47–52.

[44] ↵
Schmidt, R., Fazekas, F., Kapeller, P., Schmidt, H., Hartung, H.-P., 1999. MRI white matter hyperintensities: Three-year follow-up of the Austrian Stroke Prevention Study. Neurology 53 (1), 132–132.
OpenUrl CrossRef PubMed

[45] ↵
Schmidt, R., Ropele, S., Enzinger, C., Petrovic, K., Smith, S., Schmidt, H., Matthews, P. M., Fazekas, F., 2005. White matter lesion progression, brain atrophy, and cognitive decline: The Austrian stroke prevention study. Annals of Neurology 58 (4), 610–616.
OpenUrl CrossRef PubMed Web of Science

[46] ↵
Schmidt, R., Schmidt, H., Kapeller, P., Lechner, A., Fazekas, F., 2002b. Evolution of white matter lesions. Cerebrovascular Diseases 13 (SUPPL. 2), 16–20.
OpenUrl PubMed Web of Science

[47] ↵
Schmidt, R., Seiler, S., Loitfelder, M., 2016. Longitudinal change of small-vessel disease-related brain abnormalities. Journal of Cerebral Blood Flow and Metabolism 36 (1), 26–39.
OpenUrl CrossRef PubMed

[48] ↵
Spasov, S., Passamonti, L., Duggento, A., Lio, P., Toschi, N., Initiative, A. D. N., et al., 2019. A parameter-efficient deep learning approach to predict conversion from mild cognitive impairment to alzheimer’s disease. Neuroimage 189, 276–287.
OpenUrl

[49] ↵
Valdés Hernández, M. d. C., Armitage, P. A., Thrippleton, M. J., Chappell, F., Sandeman, E., Muñoz Maniega, S., Shuler, K., Wardlaw, J. M., 2015. Rationale, design and methodology of the image analysis protocol for studies of patients with cerebral small vessel disease and mild stroke. Brain and behavior 5 (12), e00415.
OpenUrl

[50] ↵
van Dijk, E. J., Prins, N. D., Vrooman, H. A., Hofman, A., Koudstaal, P. J., Breteler, M. M. B., 2008. Progression of cerebral small vessel disease in relation to risk factors and cognitive consequences: Rotterdam scan study. Stroke 39 (10), 2712–2719.
OpenUrl Abstract/FREE Full Text

[51] ↵
van Leijsen, E. M., Bergkamp, M. I., van Uden, I. W., Cooijmans, S., Ghafoorian, M., van der Holst, H. M., Norris, D. G., Kessels, R. P., Platel, B., Tuladhar, A. M., de Leeuw, F. E., 2019. Cognitive consequences of regression of cerebral small vessel disease. European Stroke Journal 4 (1), 85–89.
OpenUrl

[52] ↵
van Leijsen, E. M., de Leeuw, F.-E., Tuladhar, A. M., 2017a. Disease progression and regression in sporadic small vessel diseaseinsights from neuroimaging. Clinical Science 131 (12), 1191–1206.
OpenUrl Abstract/FREE Full Text

[53] ↵
van Leijsen, E. M., van Uden, I. W., Ghafoorian, M., Bergkamp, M. I., Lohner, V., Kooijmans, E. C., Van Der Holst, H. M., Tuladhar, A. M., Norris, D. G., van Dijk, E. J., Rutten-Jacobs, L. C., Platel, B., Klijn, C. J., De Leeuw, F. E., 2017b. Nonlinear temporal dynamics of cerebral small vessel disease. Neurology 89 (15), 1569–1577.
OpenUrl

[54] ↵
Veldink, J. H., Scheltens, P., Jonker, C., Launer, L. J., 1998. Progression of cerebral white matter hyperintensities on MRI is related to diastolic blood pressure. Neurology 51 (1), 319–320.
OpenUrl CrossRef PubMed

[55] ↵
Verhaaren, B. F. J., Vernooij, M. W., De Boer, R., Hofman, A., Niessen, W. J., Van Der Lugt, A., Ikram, M. A., F.J., V. B., W., V. M., Renske, d. B., Albert, H., J., N. W., Aad, v. d. L., Arfan, I. M., 6 2013. High Blood Pressure and Cerebral White Matter Lesion Progression in the General Population. Hypertension 61 (6), 1354–1359.
OpenUrl CrossRef PubMed

[56] ↵
Wardlaw, J. M., Chappell, F. M., Valdés Hernández, M. D. C., Makin, S. D., Staals, J., Shuler, K., Thrippleton, M. J., Armitage, P. A., Munz-Maniega, S., Heye, A. K., Sakka, E., Dennis, M. S., 2017. White matter hyperintensity reduction and outcomes after minor stroke. Neurology 89 (10), 1003–1010.
OpenUrl

[57] ↵
Wardlaw, J. M., Smith, E. E., Biessels, G. J., Cordonnier, C., Fazekas, F., Frayne, R., Lindley, R. I., O’Brien, J. T., Barkhof, F., Benavente, O. R., Black, S. E., Brayne, C., Breteler, M., Chabriat, H., Decarli, C., de Leeuw, F.-E., Doubal, F., Duering, M., Fox, N. C., Greenberg, S., Hachinski, V., Kilimann, I., Mok, V., Oostenbrugge, R. v., Pantoni, L., Speck, O., Stephan, B. C. M., Teipel, S., Viswanathan, A., Werring, D., Chen, C., Smith, C., van Buchem, M., Norrving, B., Gorelick, P. B., Dichgans, M., STandards for ReportIng Vascular changes on nEu-roimaging (STRIVE v1), 2013. Neuroimaging standards for research into small vessel disease and its contribution to ageing and neurodegeneration. The Lancet. Neurology 12 (8), 822–38.
OpenUrl

[58] ↵
Yamada, K., Sakai, K., Owada, K., Mineura, K., Nishimura, T., 2010. Cerebral white matter lesions may be partially reversible in patients with carotid artery stenosis. American Journal of Neuroradiology 31 (7), 1350–1352.
OpenUrl Abstract/FREE Full Text
