Abstract
Electroencephalography (EEG)/Magnetoencephalography (MEG) source imaging aims to estimate the underlying activated brain sources that explain the observed EEG/MEG recordings. Due to the ill-posed nature of the inverse problem, solving the EEG/MEG Source Imaging (ESI) problem requires the design of regularization or prior terms to guarantee a unique solution. Traditionally, the design of regularization terms is based on preliminary assumptions on the spatio-temporal structure of the source space. In this paper, we propose a novel paradigm to solve the ESI problem using an Unrolled Optimization Neural Network (UONN) (1) to improve efficiency compared to traditional iterative algorithms; (2) to establish a data-driven way to model the source solution structure instead of using hand-crafted regularizations; (3) to learn the hyperparameters automatically in a data-driven manner. The proposed framework is based on unfolding an iterative optimization algorithm into neural network modules. The proposed learning framework is the first to use an unrolled optimization neural network to solve the ESI problem. The newly designed framework can effectively learn the source extents pattern and achieves significantly improved performance compared to benchmark algorithms.
1 Introduction
Understanding the complex firing of neurons and the interactions between neural circuits at different brain regions paves an important path to uncovering brain mechanisms and brain dysfunctions [1]. Electroencephalography (EEG) or Magnetoencephalography (MEG) is a measurement of the electric potentials on the scalp generated from the electric current sources in the brain [2]. EEG/MEG directly measures the electric firing patterns in the brain, while fMRI measures the blood-oxygen-level-dependent (BOLD) signal, which is a secondary measurement of the metabolic signal. EEG/MEG has a very high temporal resolution, up to 1 millisecond, compared to a temporal resolution of around 1 second for fMRI. EEG and MEG devices also have the advantages of low cost, easy portability, and versatility. EEG is accepted as a powerful tool to capture instantaneous brain function by measuring the neuronal processes [3]. However, one disadvantage of EEG is its poor spatial resolution, as it measures the electric potential on the scalp instead of the underlying sources in the brain. EEG/MEG source imaging (ESI) bridges the gap between the scalp EEG measurements and the brain source activations, as it infers the brain source activation by solving the inverse problem based on the EEG or MEG measurements [4]. However, given that the dimension of the source signal significantly outnumbers the number of EEG/MEG sensors, ESI is an ill-conditioned inverse problem that requires a sophisticated design of regularizations utilizing spatio-temporal assumptions on the source space [5, 6].
Recently, numerous algorithms have been developed with different assumptions on the source structure. One seminal work is the minimum norm estimate (MNE), where the ℓ2 norm is used as a regularization [7]. Variants of the MNE algorithm include dynamic statistical parametric mapping (dSPM) [8] and standardized low-resolution brain electromagnetic tomography (sLORETA) [9]. The ℓ2-norm based methods tend to render spatially diffuse source estimates. To promote a sparse solution, Uutela et al. [10] introduced the ℓ1 norm, known as the minimum current estimate (MCE). Rao and Kreutz-Delgado proposed an affine scaling method [11] for a sparse ESI solution. Bore et al. proposed to use the ℓp-norm regularization (p < 1) on the source signal and the ℓ1 norm on the data fitting error term [12]. Babadi et al. [13] demonstrated that sparse distributed solutions to event-related stimuli can be found using a greedy subspace-pursuit algorithm. It is worth noting that the sparsity constraint can be applied to the original source signal or to a transformed spatial gradient domain [14, 15]. As the brain is not activated discretely or pointwise, an extended area of source estimation is preferred [16], and it has been used in multiple applications, such as somatosensory cortical mapping [17] and localization of the epileptogenic zone in focal epilepsy patients [18].
Most recently developed algorithms require an iterative procedure to reach the final solution, which can be time consuming. Inspired by the recent advancement of unrolled optimization in solving inverse problems [19–21], we use an unrolled optimization deep learning framework to improve both the accuracy and the efficiency of solving the ESI problem. The advantage of the unrolled optimization framework is that it learns a data-driven regularization instead of using a hand-crafted one such as total variation [22], and it replaces the iterative procedure with neural network modules, thus significantly improving the online reconstruction efficiency.
2 EEG/MEG source imaging problem
EEG data are mostly generated by pyramidal cells in the gray matter with an orientation perpendicular to the cortex. The ESI forward model can be expressed as Y = LS + E, where Y ∈ ℝ^(C×T) is the EEG/MEG measurements, C is the number of EEG/MEG channels, T is the number of time points, L ∈ ℝ^(C×N) is the leadfield matrix which characterizes the mapping from the brain source space to the EEG/MEG channel space, S ∈ ℝ^(N×T) represents the electrical potentials in N source locations for all the T time points, and E is the uncertainty/noise. The ESI inverse problem is to estimate S given the EEG/MEG measurements. Since the channel number C is much smaller than the number of sources N, estimating S is ill-posed and has infinitely many solutions. In order to find a unique solution, different regularizations were introduced based on prior assumptions on the source solution. More specifically, S can be obtained by solving the following minimization problem:

min_S 1/2 ||Y − LS||²F + λR(S), (1)

where || · ||F is the Frobenius norm. The first term of Eq. (1) is the data fitting term, which tries to explain the observed EEG data, and the second term is the regularization term, which is imposed to find a unique solution of Eq. (1) using sparsity or other neurophysiology-inspired regularizations. If R(S) is the ℓ2 norm, the problem is called the minimum norm estimate (MNE) [7]; if R(S) is defined using the ℓ1 norm, the problem becomes the minimum current estimate (MCE) [10].
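For instance, with R(S) the squared ℓ2 norm, Eq. (1) has a closed-form solution. The following is a minimal numpy sketch of that MNE case; the toy dimensions are illustrative, not the actual head model:

```python
import numpy as np

def mne_estimate(Y, L, lam):
    """MNE closed form: argmin_S ||Y - L S||_F^2 + lam ||S||_F^2
    gives S = L^T (L L^T + lam I)^{-1} Y."""
    C = L.shape[0]
    return L.T @ np.linalg.solve(L @ L.T + lam * np.eye(C), Y)

# Toy problem: 4 channels, 10 sources, 5 time points.
rng = np.random.default_rng(0)
L = rng.standard_normal((4, 10))
S_true = np.zeros((10, 5))
S_true[3] = 1.0                  # one active source
Y = L @ S_true                   # noiseless forward model Y = L S
S_hat = mne_estimate(Y, L, 1e-6)
```

As λ → 0 this approaches the minimum-norm solution, which fits the data exactly but spreads energy over many sources, illustrating the spatially diffuse estimates mentioned above.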
ESI model with edge sparse total variation
As the cortex is discretized with 3D meshes, simply employing the ℓ1 norm on S will result in estimated discrete sources scattered across the cortex instead of an extended continuous area. In order to encourage source extents estimation, Ding proposed to use a sparse constraint in a transformed domain by introducing a TV term defined on the irregular 3D mesh [22]. Other researchers used the same TV definition, e.g., [4, 23–25]. The TV was defined as the ℓ1 norm of the first-order spatial gradient using a linear transform matrix V ∈ ℝ^(P×N), whose definition can be found in [22], where N is the number of voxels/sources and P equals the sum of the degrees of all source nodes. Qin et al. [26] used a fractional-order total variation term to promote a "Gaussian" shape of activation. The model with sparsity and total variation constraints is given as follows:

min_S 1/2 ||Y − LS||²F + λ||V S||1, (2)

where || · ||1 represents the entrywise ℓ1 norm of a vector or matrix. The first term is the data fitting term, and the second term is the total variation term. Ideally, the TV regularization promotes source extents estimation. However, it is worth noting that the TV transformation matrix is hand-crafted, whether defined on the first or second spatial derivative or using fractional-order TV, which can limit the flexibility of source configurations. In this paper, we learn the total variation in a data-driven way by using an Unrolled Optimization Neural Network (UONN). The proposed network architecture is given in Fig. 1.
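To make the TV transform concrete, the following sketch builds a first-order gradient matrix V from a toy mesh adjacency, with one row per ordered neighbor pair so that P equals the sum of node degrees as stated above; the triangle mesh here is purely illustrative:

```python
import numpy as np

def build_tv_matrix(edges, n_sources):
    """First-order spatial gradient (TV) matrix V of size P x N, with one
    row per ordered neighbour pair (i, j): +1 at i and -1 at j, so that
    ||V S||_1 sums |s_i - s_j| over all mesh edges (both directions);
    P then equals the sum of the node degrees."""
    rows = []
    for i, j in edges:
        for a, b in ((i, j), (j, i)):
            r = np.zeros(n_sources)
            r[a], r[b] = 1.0, -1.0
            rows.append(r)
    return np.array(rows)

# Toy triangle mesh: 3 sources, 3 undirected edges (every degree is 2, P = 6).
V = build_tv_matrix([(0, 1), (1, 2), (0, 2)], 3)
s = np.array([1.0, 1.0, 0.0])    # sources 0 and 1 active, source 2 silent
tv = np.abs(V @ s).sum()         # only the boundary edges contribute
```

Note how the interior edge (0, 1) of the active patch contributes nothing, so minimizing ||V S||1 favors spatially contiguous activation.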
3 Proposed unrolled optimization neural network
In this study, we build on the recent development of network-based compressive sensing (CS) [19], a framework based on the iterative shrinkage-thresholding algorithm (ISTA) for solving the general ℓ1-norm CS problem [27], which casts the traditional ISTA iterative reconstruction into a deep neural network. To solve Eq. (2), the following update steps are iterated in ISTA:

R(k) = S(k−1) − ρLᵀ(LS(k−1) − Y), (3)

S(k) = argmin_S 1/2 ||S − R(k)||²F + λ||V S||1, (4)

where k denotes the iteration index and ρ is the step size. Eq. (3) is the gradient descent update of S, with the updated S denoted as R(k). Eq. (4) is a special case of the proximal mapping, i.e., prox_λΦ(R(k)) with Φ(S) = ||V S||1. Instead of using the hand-crafted ||V S||1, we use Φ(S) to learn the implicit V S in a data-driven manner. Thus, the above objective function is re-written as:

min_S 1/2 ||Y − LS||²F + λ||Φ(S)||1. (5)
Solving Eq. (5) using ISTA, Eq. (4) becomes

S(k) = argmin_S 1/2 ||S − R(k)||²F + λ||Φ(S)||1. (6)
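For intuition, the two ISTA steps of Eq. (3) and Eq. (4) can be sketched for the plain ℓ1 case (V = I, so the proximal step reduces to soft thresholding); the toy data, step size, and iteration count are illustrative:

```python
import numpy as np

def soft(x, a):
    """Soft-threshold operator: sign(x) * max(|x| - a, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - a, 0.0)

def ista(Y, L, lam, n_iter=500):
    """ISTA for min_S 0.5||Y - L S||_F^2 + lam ||S||_1 (the V = I case):
    a gradient step on the data term followed by the proximal step."""
    rho = 1.0 / np.linalg.norm(L, 2) ** 2    # step size = 1 / Lipschitz constant
    S = np.zeros((L.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        R = S - rho * (L.T @ (L @ S - Y))    # gradient descent update, Eq. (3)
        S = soft(R, rho * lam)               # proximal mapping of lam ||.||_1
    return S

# Toy recovery: 6 channels, 12 sources, one active source.
rng = np.random.default_rng(0)
L = rng.standard_normal((6, 12))
S_true = np.zeros((12, 1))
S_true[4] = 2.0
Y = L @ S_true
S_hat = ista(Y, L, lam=0.01)
```

The UONN replaces the fixed soft-threshold step with learned modules, as described next.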
The two iterative steps of Eq. (3) and Eq. (6) can be cast into two separate modules in the k-th iteration of the UONN: the R(k) module and the S(k) module. Similar to Zhang & Ghanem [19], we allow the step size ρ to vary across iterations, so the output of the R(k) module is given as:

R(k) = S(k−1) − ρ(k)Lᵀ(LS(k−1) − Y). (7)
The S(k) module computes S(k) from the input R(k). We use an approximated optimization function defined as follows:

S(k) = argmin_S 1/2 ||Φ(S) − Φ(R(k))||²F + λ||Φ(S)||1. (8)
The above equation has a closed-form solution for Φ(S(k)), which is

Φ(S(k)) = soft(Φ(R(k)), λ). (9)
As our goal is to find S rather than Φ(S), we introduce a learnable implicit inverse function Φ̃(·), satisfying Φ̃(Φ(S)) = S. As a result, the update on S(k) is written as:

S(k) = Φ̃(soft(Φ(R(k)), λ)). (10)
The soft threshold function soft(x, a) is defined as sign(x) max(|x| − a, 0). The update on S is implemented using a two-layer fully connected network with a sigmoid activation function (to learn Φ(·)), a network of the same structure for Φ̃(·), and a soft threshold operator linking the two. The structure of the proposed network is illustrated in Fig. 1.
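A forward-pass sketch of one phase is given below, with Φ(·) and Φ̃(·) as two-layer fully connected networks joined by the soft threshold; the layer sizes and random initial weights are placeholders for parameters that the paper learns by backpropagation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft(x, a):
    return np.sign(x) * np.maximum(np.abs(x) - a, 0.0)

class UONNPhase:
    """One UONN phase (numpy forward-pass sketch only; the weights, step
    size and threshold are trained in the actual framework)."""

    def __init__(self, n_src, n_hid, n_code, rng):
        self.W1 = 0.1 * rng.standard_normal((n_hid, n_src))    # Phi, layer 1
        self.W2 = 0.1 * rng.standard_normal((n_code, n_hid))   # Phi, layer 2
        self.W3 = 0.1 * rng.standard_normal((n_hid, n_code))   # Phi~, layer 1
        self.W4 = 0.1 * rng.standard_normal((n_src, n_hid))    # Phi~, layer 2
        self.rho, self.lam = 0.1, 0.01                         # learnable scalars

    def forward(self, S, Y, L):
        R = S - self.rho * (L.T @ (L @ S - Y))    # R module, Eq. (7)
        code = self.W2 @ sigmoid(self.W1 @ R)     # Phi(R)
        return self.W4 @ sigmoid(self.W3 @ soft(code, self.lam))  # Eq. (10)

# One phase on toy data: 8 channels, 20 sources, 3 time points.
rng = np.random.default_rng(1)
L = rng.standard_normal((8, 20))
Y = rng.standard_normal((8, 3))
phase = UONNPhase(n_src=20, n_hid=16, n_code=8, rng=rng)
S1 = phase.forward(np.zeros((20, 3)), Y, L)
```

Stacking Np such phases, each with its own parameters, reproduces the unrolled structure of the network.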
With this learnable spatial structure expression for S using Φ(S), we aim to learn a more flexible data-driven expression for the extended source activation. The learnable parameters include {ρ(k), λ(k), Φ(k)(·), Φ̃(k)(·)} for k = 1, …, Np, where Np is the total number of UONN phases; each phase corresponds to one iteration of the ISTA algorithm, and the two modules in each phase correspond to the R update and the S update in Eq. (3) and Eq. (4).
Loss function
With the training data tuples {yᵢ, sᵢ} for i ∈ {1, …, T}, the UONN generates the reconstruction result denoted as ŝᵢ, and the discrepancy loss is defined as L_d = (1/T) Σᵢ ||ŝᵢ − sᵢ||²F. Inspired by the gradual improvement of the reconstructed solution across phases, we introduced another smoothness loss, L_s = Σₖ ||S(k) − S(k−1)||²F, to measure the consistency between two iterative steps and improve the robustness of the algorithm. The final loss function is L = L_d + αL_s, where α is the weight that balances the two loss components.
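The loss can be sketched as follows; the squared Frobenius norms and the per-sample averaging are assumed concrete choices, since the original expressions leave them implicit:

```python
import numpy as np

def uonn_loss(S_phases, S_true, alpha=0.01):
    """Discrepancy loss on the final phase output plus a smoothness loss
    over consecutive phase outputs: L = Ld + alpha * Ls."""
    Ld = np.linalg.norm(S_phases[-1] - S_true) ** 2
    Ls = sum(np.linalg.norm(S_phases[k] - S_phases[k - 1]) ** 2
             for k in range(1, len(S_phases)))
    return Ld + alpha * Ls

# Two phase outputs; the last one matches the ground truth exactly,
# so only the smoothness term contributes.
loss = uonn_loss([np.zeros((2, 2)), np.ones((2, 2))], np.ones((2, 2)))
```

The smoothness term penalizes large jumps between consecutive phase outputs, mimicking the gradual refinement of an iterative solver.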
4 Numerical Experiments
In this section, we conduct numerical experiments to validate the effectiveness of the proposed method under different Signal-to-Noise Ratios (SNR) on synthetic EEG data, and further validate it on real EEG data for epileptogenic zone localization.
Simulation experiments
We first conducted experiments on synthetic EEG data with known activation patterns, as the ground-truth activation pattern is then available for quantitative evaluation.
Forward model
We used a real head model to calculate the leadfield matrix. The head model was computed from the T1-MRI images of a 26-year-old male subject. Brain tissue segmentation and source surface reconstruction were conducted using FreeSurfer [28]. We used a 128-channel BioSemi EEG cap layout and coregistered the EEG channels with the head model using Brainstorm, further validated with the MNE-Python toolbox [29]. The source space contains 1026 sources in each hemisphere, 2052 sources combined, resulting in a leadfield matrix L with a dimension of 128 by 2052.
Experimental settings
To generate EEG data, we activated each of the 2052 locations in the source space as the central source in turn, and used 3 different neighborhood levels (1-, 2-, and 3-level neighborhoods) to represent different sizes of source extents, as illustrated in Fig. 2. We activated the whole "patch" of sources corresponding to the neighbors at each level. The activation strength of the central source is set to 1, while the activation strengths of the 1-, 2-, and 3-level adjacent sources are set to 0.8, 0.6, and 0.4, respectively.
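The patch construction can be sketched as a breadth-first traversal of the source adjacency; the toy chain-graph adjacency below stands in for the cortical mesh:

```python
import numpy as np

def patch_activation(adj, center, strengths=(1.0, 0.8, 0.6, 0.4)):
    """Breadth-first patch growth: the central source gets strength 1.0
    and its 1-, 2-, 3-level neighbours get 0.8, 0.6 and 0.4."""
    s = np.zeros(len(adj))
    s[center] = strengths[0]
    visited = {center}
    frontier = [center]
    for depth in range(1, len(strengths)):
        nxt = []
        for i in frontier:
            for j in adj[i]:
                if j not in visited:
                    visited.add(j)
                    nxt.append(j)
                    s[j] = strengths[depth]
        frontier = nxt
    return s

# Toy chain of 5 sources: 0 - 1 - 2 - 3 - 4, centre at source 0.
adj = [[1], [0, 2], [1, 3], [2, 4], [3]]
s = patch_activation(adj, center=0)
```

On the real mesh, `adj` would list the neighbors of each of the 2052 sources.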
Then we used the forward model to generate scalp EEG data under different SNR settings (SNR = 40 dB, 30 dB, 20 dB, 10 dB and 0 dB). We set the length of the EEG data in each experiment to 0.5 second with a 100 Hz sampling rate. For each setting of SNR and neighborhood level, we divided the simulated data into a training set and a testing set with proportions of 80% and 20%. The number of phases in the UONN is set to 9. Φ(·) and Φ̃(·) are each approximated using a fully connected feedforward neural network with a single hidden layer: the numbers of input, hidden, and output nodes of the module approximating Φ(·) are set to 2052, 500, and 128, respectively, and the numbers of input, hidden, and output nodes of the module approximating Φ̃(·) are set to 128, 500, and 2052. The design of this module architecture resembles an autoencoder network [30].
We chose the training set with the 2-level neighborhood and 40 dB SNR to train the UONN; source reconstruction was then conducted on all test sets across the different source neighborhood levels and SNR settings. We picked 10 random source locations to conduct source reconstruction. The benchmark algorithms MNE [7], dSPM [8], and sLORETA [9] were used for comparison. We quantitatively evaluated the performance of each competing algorithm based on the following metrics:
Localization error (LE): measures the geodesic distance between the true and estimated source locations on the cortex mesh using the Dijkstra shortest-path algorithm.
Area under curve (AUC): particularly useful to characterize the overlap between the estimated and ground-truth extended source activation patterns.
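Both metrics can be sketched as follows; the toy mesh graph and the tie handling in the AUC are simplifications:

```python
import heapq
import numpy as np

def geodesic(adj_w, src, dst):
    """Dijkstra shortest-path distance on the cortical mesh graph;
    adj_w[i] is a list of (neighbour, edge_length) pairs."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, i = heapq.heappop(pq)
        if i == dst:
            return d
        if d > dist.get(i, np.inf):
            continue                      # stale queue entry
        for j, w in adj_w[i]:
            if d + w < dist.get(j, np.inf):
                dist[j] = d + w
                heapq.heappush(pq, (d + w, j))
    return np.inf

def auc_score(true_active, estimate):
    """ROC AUC of |estimate| against the binary ground-truth support,
    via the pairwise-ranking formula (ties ignored for simplicity)."""
    pos = np.abs(estimate[true_active == 1])
    neg = np.abs(estimate[true_active == 0])
    return float(np.mean([p > q for p in pos for q in neg]))

# Toy 3-node mesh 0 -(1.0)- 1 -(2.0)- 2, and a 4-source estimate whose
# two largest magnitudes coincide with the true support.
adj_w = [[(1, 1.0)], [(0, 1.0), (2, 2.0)], [(1, 2.0)]]
le = geodesic(adj_w, 0, 2)
auc = auc_score(np.array([1, 1, 0, 0]), np.array([0.9, 0.8, 0.1, 0.2]))
```

On the real cortex mesh, the edge lengths would be the Euclidean distances between adjacent vertices.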
Better localization performance is expected when LE is close to 0 and AUC is close to 1. The performance comparison between the proposed method and the benchmark algorithms on LE and AUC is summarized in Table 1, and the boxplots for SNR = 40 dB, 20 dB and 10 dB are given in Fig. 3.
From Table 1, we can see that compared to the benchmark algorithms MNE, dSPM, and sLORETA, the proposed UONN method exhibits excellent performance, while the benchmark algorithms only provide satisfactory source reconstruction when the activated area is small and the noise level is low. With the expansion of the source extent and the increase of the noise level, the LE values of the benchmark algorithms deteriorate significantly, as do the AUC values. By contrast, the superiority of the UONN method is demonstrated by improved AUC values and smaller LE values; it shows better performance for larger source extents and higher noise levels than the benchmark algorithms. The corresponding LE values remain low, and the AUC values stay above 0.94, which demonstrates the excellent stability of the proposed method. Comparing different source activation sizes, we can see that larger source extents are more difficult to localize exactly, with a worse AUC than smaller areas of activation, although the localization error of the benchmark algorithms can be lower in that case. Noise significantly impacts the performance of all the benchmark algorithms; however, when the SNR is at or above 10 dB, our algorithm performs very well in recovering the source extents.
The comparison between the reconstructed source distributions with the 3-level neighborhood and 40 dB, 30 dB and 20 dB SNR is shown in Fig. 4.
4.1 Real data experiments
We analyzed a real dataset that is publicly accessible through the Brainstorm tutorial datasets [31]. This dataset was recorded from a patient with focal epilepsy who was seizure-free during a 5-year follow-up period after a left frontal tailored resection. We followed the Brainstorm tutorial to obtain the head model and the leadfield matrix, then computed the averaged interictal spike (shown in Fig. 5) from the 29-channel EEG measurements. We used the peak (0 ms) of the averaged interictal spike for brain source localization, and the comparison between the reconstructed source distributions from the different methods is shown in Fig. 6.
From Fig. 6, we can see that both the MNE method and the proposed UONN method can accurately locate the epileptogenic zone, which was validated by the follow-up after resection of the left frontal region. The source reconstruction results of the proposed method show not only the accurate location of the lesion area but also a clear difference in signal strength between the central region and its adjacent regions. This shows that the UONN method can reconstruct source signals of different intensities corresponding to neighbors at different levels. In contrast, the source distribution areas estimated by sLORETA and dSPM are overly broad: although the left frontal region is covered, part of the right frontal region, which is not related to the epilepsy lesion, is also included. Among these methods, the proposed UONN method provides the cleanest and most accurate estimation of the epileptogenic zone.
5 Conclusion
In this paper, we proposed a new method based on a deep learning unrolled optimization framework for brain source reconstruction. The proposed framework enjoys the great approximation capability and principled parameter training procedure of deep learning. It eliminates the need for the hand-crafted total variation used in traditional methods. We designed a neural network module to learn the spatial structure of source extents, and incorporated a smoothness constraint in the loss function to mimic the iterative changes in the solution. The numerical experiments demonstrated a great boost in performance as measured by AUC and LE.
Footnotes
mjiao{at}stevens.edu, fliu22{at}stevens.edu