Abstract
To infer brain source activation patterns under different cognitive tasks is an integral step to understand how our brain works. Traditional electroencephalogram (EEG) Source Imaging (ESI) methods usually do not distinguish task-related and spurious non-task-related sources that jointly generate EEG signals, which inevitably yield misleading reconstructed activation patterns. In this research, we argue that the task-related source signal intrinsically has a low-rank property, which is exploited to to infer the true task-related EEG sources location. Although the true task-related source signal is sparse and low-rank, the contribution of spurious sources scattering over the source space with intermittent activation patterns makes the actual source space lose the low-rank property. To reconstruct a low-rank true source, we propose a novel ESI model that involves a spatial low-rank representation and a temporal Laplacian graph regularization, the latter of which guarantees the temporal smoothness of the source signal and eliminate the spurious ones. To solve the proposed model, an augmented Lagrangian objective function is formulated and an algorithm in the framework of alternating direction method of multipliers is proposed. Simulation results illustrate the effectiveness of the proposed method in terms of reconstruction accuracy.
I. Introduction
As a direct measurement modality of neural electrical firing patterns, electroencephalogram (EEG) has a higher temporal resolution up to millisecond compared to positron emission topography (PET) and functional magnetic resonance imaging (fMRI) [1]–[3]. However, a limitation of EEG is its low spatial resolution since the measurement is on the scalp rather than inside the brain. Given the recorded scalp EEG data, to reconstruct of the brain sources signal inside the brain is known as EEG inverse problem or EEG source imaging (ESI) [4]. ESI technique has been widely used in the study of language mechanisms, cognition process, sensory function, as well as the localization of epileptic seizure [1], [5].
Since the number of electrodes usually outnumbers that of brain sources, the ESI problem is highly ill-posed or mathematically underdetermined. A variety of methods have been proposed to address this challenging problem with different neurophysiological assumptions,formulated by various regularization techniques [4], [6]. There are two main types of inverse solvers for ESI: dipole fitting and distributed inverse solvers [7]. Dipole fitting empirically solves the MEG/EEG forward and inverse problems by characterizing the neural generators responsible for electrical potential detected on the scalp sensors [4], [8]. The algorithms proposed in recent years to solve the ESI problem within distributed dipoles paradigm can be further summarized into three categories in general, which are the (1) Bayesian framework [9]–[13], (2) state-space based algorithms [14]–[18], (3) models using sparse representation technique [19]–[22], [7], [23].
In many sensory or cognitive studies, the underlying activated cortical responsible for the signal processing are relatively focal and thus sparse, which makes the spatial sparse constrain not only mathematically necessary but also neurologically reasonable [1]. One widely accepted assumption is that a sparse spatial structure is favored than a complicated source configuration to explain the same data [24]. An representative early pioneering work is the ℓ2-norm based minimum norm estimate (MNE) inverse solver [25]. Based on MNE algorith, Pascual-Marqui et al. later proposed standardized low-resolution brain electromagnetic tomography (sLORETA) [26] that enforces spatial smoothness of the neighboring sources and normalizes the solution with respect to the estimated noise level. Some algorithms proposed to use combined multiple solvers, e.g., Weighted minimum norm-LORETA (WMN-LORETA) which combines the LORETA solver and a weighted minimum norm to compensate for deeper sources originate from the subcortical regions [27]. As the above mentioned algorithms are based on ℓ2-norm to different extents, the estimated source area is over-diffuse. By replacing ℓ2-norm by ℓ1-norm, minimum current estimate (MCE) [28] is proposed to overcome overestimation of active area sizes incurred by ℓ2-norm. Recent development of compressive sensing algorithm proved the ℓp (p ≤ 1) regularization on the original source signal usually provides a set of discrete sources distributed across the cortex due to high coherence of lead field matrix, in order to encourage reconstruction of extended source patches, it has been found that by enforcing sparsity in a transformed domain, e.g., total variation (TV) regularization [29], [30], [22], [31], focal source extents can be better estimated.
The aforementioned algorithms estimate source location at each time point independently, leading to discrepancy along the time direction. To encourage temporal smoothness, a number of regularization techniques based on spatiotemporal mixed norms have been developed, including the famous Mixed Norm Estimates (MxNE) which uses ℓ1,2-norm regularization [20], and time-frequency mixed-norm estimate (TF-MxNE) which uses structured sparse priors in time-frequency domain for better estimation of the non-stationary and transient source signal [19]. The disadvantage of MxNE is imposing ℓ2 norm can’t guarantee the temporal smoothness in neighboring time points when the source signal is corrupted, while the limitation of TF-MxNE is the lost of high temporal resolution by using of STFT Gabor matrix as a dictionary for time-frequency coefficients. Due to the existence strong spontaneous background source activations, discriminative source activation pattern corresponding to different cognitive tasks which provide more insights shall be reconstructed, Liu et al proposed to use label information of EEG data to find those discriminative source activation pattern [40], [32], [33].
One common drawback of the existing ESI algorithms is that they only consider noises on the sensor level and ignore the spurious noise from the cortex. If perfectly reconstructed, the estimated source is aggregated by task-related source and spurious noise in the source space. The true task-related sources will be corrupted by spurious sources, which motivates us to develop new algorithms to find true task-related source. There are two commonly accepted assumptions (1) spatially sparse (2) temporally continuous for the task-related source activation pattern, which inevitably leads to the low-rank property of the source space. To better discover the task-related source, we impose the low-rank term in the goal function as we consider it is a more direct constraint for spatial sparse. Also, we use a more direct penalty term to temporal smoothness, which is to penalize dissimilarity of temporally neighboring samples based on manifold graph embedding. It is worth-noting that we used the graph regularization term in our previous paper, however the graph is defined to be fully connected for all the points within one class [40], which inevitably drive all the activate patterns at different time points having the same magnitude, thus making the previously defined graph regularization term rely on a strong assumption and limit its future application for realistic cases.
In this paper, we propose a novel EEG source imaging model based on temporal graph regularized low-rank representation. The model is solved based on the alternating direction method of multipliers (ADMM) [34]. We conducted extensive numerical experiments to verify the effectiveness on discovering task related low-rank sources. The reconstructed solution is temporally smooth and spatially sparse. The contributions of our paper are summarized as follows:
1) We propose to consider the noise not only in the sensor level, but also in the source space.
2) A low-rank representation model (LRR) is proposed for the first time on EEG inverse problem inspired by the low-rank property of true task-related source configurations.
3) We redefined graph embedding regularization based on our previous paper that utilizes temporal vicinity information of samples to promote temporal smoothness.
4) A new algorithm based on ADMM is given which is capable of extracting the low-rank task-related source patterns.
II. Inverse Problem and Temporal Graph Structures
In this section, we briefly review the inverse problem and then discuss the design of temporal graph regularization.
A. The Inverse Problem
The cortex source activations propagate to EEG sensors through a linear mapping matrix called lead field matrix, and it can be described as the following linear model: where is the EEG data measured at a set of Nc electrodes for Nt time points, is the lead field matrix which maps the source signal to sensors on the scalp, each column of L represents the electrical field of a particular source to the EEG electrodes, represents the corresponding driving potential in Nd sources locations for the Nt time instants. Since the number of sources is much larger than electrodes, solving S given X is ill-posed with infinite feasible solutions, which necessitates a regularization term to be imposed. Generally, an estimate of S can be found by minimizing the following cost function, which is composed of a quadratic error and a regularization term: where ‖.‖F is the Frobenius Norm. The penalty term ⊖(S) is to encourage neurophysiologically plausible solutions. The regularization term take the form of ℓ2, ℓ1 or mixed norm. For example, spatially smooth formulation as in LORETA estimation or spatially sparse formulation with Least Absolute Shrinkage and Selection Operator (LASSO) estimate.
B. Temporal Graph Embedding
An important assumption on the source signal is that two temporal adjoint data points should have similar intrinsic activation pattern. In computer vision community, a lot of manifold learning methods have been proposed to find intrinsic similar structure on low-dimensional sub-manifolds embedded in a high dimensional ambient space, such as locally linear embedding [35], Locality Preserving Projection [36], Neighborhood Preserving Embedding [37]. A graph can be viewed as geometric neighborhood relationship between each vertex representing each data sample, the weight between vertex represents similarity between two points [38]. Inspired by the manifold theory [39], we use a regularization term to penalize the difference of two neighboring source signal. In our previous work, we use a graph regularization term to promote intra-class consistency [40], but the assumption is too strong by requiring all the reconstructed sources at different time points has the same location as well as signal magnitude as long as they belong to the same class. Now define a temporal graph regularization as where si is the i-th column of the matrix S, and a binary matrix W is designed as follows
The graph embedding matrix W contains temporal vicinity information. Nk(si) is the set containing k temporally closest points to si. This formulation intends to force neighboring source signal having similar pattern. The benefits are twofold, one is for temporal smoothness, another advantage is to make the reconstructed source denoised for intermittent spurious source activates. By defining D as a diagonal matrix whose entries are row sums of the symmetric matrix W, i.e., Dii =∑jWij, and denoting G = D − W, Rt(S) can be rewritten as: where tr(·) is the trace operator of a matrix, i.e., adding up all diagonal entries of a matrix.
III. Proposed EEG Source Imaging Model
Before we present our low-rank model with temporal graph structures, we comment on the limitations of traditional model and come up with the decomposition of task-related source with low-rank property and spontaneous non-task-related spurious sources that is sparsely distributed spatially and with transient patterns. We come up with a graph regularized low-rank representation model and detailed discussion on the purpose of each term in the goal function.
A. Decomposition of True and Spurious Sources
In general, two types of noises should be considered, one originates from inaccurate measurement of the sensors modeled by Guassian white noise, which is denoted as E in Eq.(1), the other type of noise is called biological noise that comes directly from the spontaneous activations in the source space, which are not task-related and termed as spurious source. The second types of noise (spurious source) also contributes to the EEG signal in the same way as ground truth source. A drawback of traditional models is that they didn’t distinguish the spurious sources from the true sources, and the estimated source can be composed of both task-rated source and spurious sources. To address the above mentioned problem, we propose to use a decomposed source spaces, composed of a low-rank source space and spurious sources originates from spontaneous biological noises. The illustration of decomposition of source space is given in Fig.2, where S1 has a low-rank property and S2 is sparse, and sum of S1 and S2 is no long low-rank, making X lose low-rank structure.
B. Basic Low-Rank Representation (LRR) Model
We argue that during a short period of task-related Evoked Repose Potential (EPR), the number of corresponding activated sources is sparse and remain activated during this period of time, which makes the EPR source matrix low-rank with most rows being zero. However, the brain signal is known to be heavy noisy, spontaneous activations are also active from different source spots. The low-rank representation model for the EEG inverse problem is introduced as follows: where β, λ and γ are positive scalars to balance the rank function, sparsity cost of sources and the reconstruction error. It is pointed out that ℓ1 on the error term is more robust to outliers, we use ℓ1 in the row-rank model [41]. The ‖E‖1,1 is defined as . Due to the discrete nature of the rank function, it is a common practice to use a surrogate nuclear norm ‖·‖* instead. The goal function is given as:
The above formulation is a simple version trying to estimate the task-related source activation pattern by using low-rank constraint. To promote the temporal smoothness, the Laplacian graph structure is included in the next section followed by the optimization algorithm.
C. LRR Model with Graph Regularization
Incorporating the previous temporal graph structures, we introduce our proposed model called low-rank representation with temporal graph structures ESI (LRR-TG-ESI). The model is composed of data fitting term to explain the EEG data, temporal graph embedding regularization term that promote temporal smooth, and a ℓ1 norm for sparsity penalty and nuclear norm for the low-rank structure of ground true source. By combining the low-rank prior and the temporal graph regularization, we propose the following model for ESI: where λ,β,α > 0 are tuning parameters to balance the trade-off of different terms. Our proposed model is able to enforce row-sparse via low-rank prior and temporal smoothness via temporal graph regularization while fitting the EEG data X. Compared to earlier works on the ESI problem, both the low-rank prior and graph regularization is novel, although the graph regularization term has been discussed in our early paper [40], but it is not defined on the temporal manifold, and the previous definition in [40] make the magnitude of source signal to be equal intra-class, which is not realistic in real world. To further consider the spatial smoothness, a total variation term can be imposed as another penalty term, such as first order total variation (TV) regularization in Ref. [30], [42], fractional order TV in [29], [31], and similar algorithm can be derived under the framework of ADMM, however further investigation with constraints of TV is our future work.
IV. Numerical Algorithm
To solve (8), an algorithm in the ADMM framework is developed. The augmented Lagrangian function of (8) is By some simple algebra, (9) can be reformulated as where T1 and T2 are Lagrangian multipliers and μ is a parameter for the augmented Lagragian term. The variables are updates alternately in a Gauss-Seidel manner by minimizing the augmented Lagrangian function, with other variables fixed. For symbolic simplicity, we rewrite Eq.(10) into the following form: where
If the augmented Lagrangian function is difficult to minimize with respect to a variable, a linearized approximate surrogate function can used, hence the algorithm bears the name Linearized Alternating Direction method [43], [38]. Updating S by minimizing (suppose we are at iteration k) is equivalent to minimize the following goal function with the other variables fixed: which is approximated by optimizing its linearizion at plus a quadratic proximal term, given as:
Here η is a constant satisfying where ‖·‖2 is the spectral norm of a matrix, i.e, the largest singular value. As long as (15) is satisfied, (14) is a good approximate to (13). The solution to (14) has a closed form using a singular value thresholding operator (SVT) [44] given as: where Θε(A) = USε(∑)VT is the SVT operator, in which U∑VT is the singular value decomposition of A and Sε(s) is defined as . is calculated as
To update M and E, it is equivalent to solve the following problem:
The general form of (18)–(19) is a ℓ1 norm proximal operator defined as with μ > 0. The above problem (20) has a closed form solution, called soft thresholding, defined by a shrinkage function, where (x)+ is x when x > 0, otherwise 0. The shrinkage function is defined as element-wise operator. Problem (18)–(19) has a close form solution described with the shrinkage function. After updating all the variables, these Lagrange multipliers are updated by
The parameter μ is updated by μ = min(ρμ, μmax). A summarized algorithm is given as Algorithm 1. We initialize the S with the estimate S0 from ℓ1 solver.
It’s worth noting that the data fitting term we use is ℓ1,1 norm of E in the model, and there are other options. Generally, if the Gaussian noise E is small, then the norm ‖E‖F, is an appropriate choice, but for random data corruption, ℓ1,1 should be used, and for sample specific data corruption, ℓ2,1 [45] [46] 47], should be used. It has been shown that ℓ2,1 is more robust to large outliers in some samples. Although the norm used in Algorithm 1 is ℓ1,1, it can be extended to ℓ2,1 norm of E, where ‖E‖2,1 is defined as
In stead of solving (19) to update E, the following goal function (22) needs to be solved to update E.
By substituting , if E* is the optimal solution of
Based on the Lemmas in Ref [48], the solution to (23) is where and ki is the i-th column of matrix E* and K respectively.
Convergence: The convergence of Algorithm 1 can be easily derived from [43]. Even though the update of M and E is separated in Algorithm 1, they can be combined in one step to become a larger block step and simultaneously solving for (M, E) which is the same case described by LADMAP algorithm [43]. The convergence analysis in [43] can be applied to our case, thus the algorithm convergence is guaranteed [38].
V. Numerical Experiments
In this section, we conducted several experiments to illustrate the effectiveness of our proposed method. Since both the nuclear norm and the graph regularization is relative new for the ESI problem, we started from the simple model (7) to help the readers understand the property and impacts of low-rank prior along with data fidelity term and sparsity term for source reconstruction. In the first experiment, we did a comprehensive exploration for different parameter settings by varying the weights between low-rank term, data fitting term and sparsity term. In the second experiment, we illustrate the temporal smoothing functionality of the graph regularization term for uncorrupted smooth source and corrupted source with abrupt signal jumps. In the third experiment, we give comprehensive numerical results by testing our algorithm against the benchmark algorithms to showcase the effectiveness of the proposed method in reconstructing task-related source, where we show that our algorithm can not only find the activated locations, but also reconstruct the time-course of source activation with high precision.
A. Head Model
A realistic head model, referred to as New York Head model [49], is used in our numerical experiment. The New York Head model is based on highly detailed MRI images derived ICBM152 anatomy, which is a nonlinear average of the T1-weighted structural MRI of 152 adults and calculated with state-of-the-art finite element electrical modeling. The New York Head model is considered to be highly accurate since it considers six tissue types when conducting segmentation, which is scalp, skull, CSF, gray matter, white matter, air cavities, with a native MRI resolution of 0.5 mm3. The dimension of the lead field matrix used in the numerical experiment part is 2004 by 108, representing a linear mapping from 2004 sources to 108 electrodes.
B. Experiment 1: Test on Simple Low-rank Model
To understand the property of low-rankness, we start from a simple model to help the readers understand the property of low-rank in the EEG source imaging and the validity of low-rank when recovering the accurate location as well as time course of source signal. Like in [50], eight octants are divided as regions of interest (ROI) are considered, which are Right Anterior Inferior (RAI), Right Anterior Superior (RAS), Right Posterior Inferior (RPI), Right Posterior Superior (RPS), Left Anterior Inferior (LAI), Left Anterior Superior (LAS), Left Posterior Inferior (LPI), Left Posterior Superior (LPI). In the simple experiments, we selected 2 different ROIs and randomly select one activated voxel in each of these ROIs, and a 4th order moving average time series is generated, as illustrated in Fig.3.
At each location, a time series with length of 500 were generated to represent the source activation time-course. At each time point, two randomly picked sources are activated to simulate the non-task related spurious noise with mean of 0 and variance to be 1. The task-related activate pattern has low-rank property, however, the noise corrupted source space is no longer low-rank. We repeated our experiment 50 times for all the combinations of λ and β, where λ = {0.01, 0.02, 0.03, 0.05, 0.1, 0.2, 0.5} and β = {0.005, 0.01, 0.015, 0.02, 0.05, 0.1}. The reconstructed error (RE) metric used here is where represents the reconstructed source. We define the where Ps and Pn are the power of signal and noise respectively. The violin plot is used to visualize the distribution of reconstruction errors, in corresponding to β and λ respectively in Fig.4 and Fig.5. Increasing β will penalize the strength of signal and make the reconstruction error to be large when λ is small. However, When λ is set to be 0.5, the weight of data fidelity is high, thus driving the solution have a better data fidelity, which can balance off the increase of β.
The averaged reconstruction error of 50 experiments for all of β and λ is given in Fig.6a and the averaged rank for the estimated source for all the combination of β and λ is given in Fig.6b. As we increase the value of λ, the rank is also increasing, which underlies the trade-off between explaining the data and finding the latent low-rank structure of the source space. Increasing λ means more weight on the data fitting term, and the spurious source can also be recovered, thus increasing the rank of the reconstructed source.
To empirically understand and explain the reconstruction error in Fig.6a, we visualize the curve fitting performance for truth source time series and the reconstructed source time-course corresponding to different level of reconstruction error. We picked the curve fitting cases when the reconstructed error is 0.2, 0.4, 0.8 and 1 respectively. For error equal to 0.2, we picked one experiment and plot the fitting of time series which demonstrated a very good curve fitting of the ground truth vs the reconstructed one as is shown in part (a) of Fig.7. Also we picked another experiment whose error equal to 0.4 and it is shown in part (b) of Fig.7. The curve fitting is slightly worse than the previous one but it is hard to notice the difference compared to the previous one, however according to our error metric, the error is 0.4. One thing to notice is that in both situations, the reconstructed rank equal to the ground truth rank, which is equal to 2, moreover, the estimated two active source location is exactly the ground true locations. We also examined the case when error is up to 0.8, and the curve fitting plot is given in part (c) of Fig.7. We examined this happened when we set λ = 0.01 and β = 0.1. In this case, the penalty for sparsity is 0.1, which means too much penalty for the sparse term and the signal magnitude is reduced by the shrinkage operation. We also notice than in the experiment, the error can be up to 1 no matter what parameter settings are given, as can be seen in the top region of Fig.4 and Fig.5. The curve fitting plot for the failed case is given in part (d) of Fig.7.
Although the situation is very rare when RE = 1, we want to visualize the activation pattern on the cortex to see how discrepant the reconstructed location compared to the true source location. For comparison, Fig. 8 is given when RE = 0.4 and RE = 1. The reconstructed source locations on two ROIs are exactly the same with the ground true location when RE = 0.4, when RE = 1, one source location is reconstructed perfectly while the other source location is not accurately located, however the neighboring sources are reconstructed.
To check how the rank of S evolve during the iterations, the boxplot of the rank at selected steps are given in Fig.9. Starting from an initial value with high rank, the rank of is decreasing as the iteration proceeds. We set the maximum rank to be 20, during the iteration process, the rank of S is converged to very small number for most of the cases.
C. Experiments 2: Test LRR with Temoral Graph Prior
In this section, we solve the LRR-TG-ESI problem (8) with graph regularization term to test its impact on the reconstructed signal. Under the same setting with Experiment 1, we assign different values {0.01, 0.02, 0.05, 0.1, 0.5} for the graph regularization parameter α. The original source signal was smooth, then it was corrupted with randomly number at some time points. There are also 2 randomly picked activated sources representing spurious sources with mean of 0 and variance to be 1. The “temporal smoothing” impact of the graph regularization is shown in Fig.10, where λ = 0.02 and β = 0.01. In Fig.10, the original signal is corrupted and not smooth at some time points, we set the neighbor size to be 2 (the closest signal before and after the one to be estimated) when calculating the Laplacian matrix. It is evident from the formulation (3) that the graph regularization term will decrease the dissimilarity of temporally neighbored reconstructed source. If α is set to be 0.5, the graph regularization term penalized heavily on the curvature of the reconstructed signal as is illustrated in Fig.10. We can see that with the temporal graph prior, the reconstructed source is more smooth. It is worth noticing that the main purpose of temporal graph prior is not to smooth the time course for the activated locations, the main purpose is to filter out the spurious activations that are short transients with abrupt jumps. Combined with the low-rank prior, the temporal graph prior can filter the spurious activations and reconstruct the task related activated source. The randomly planted spurious sources are filter out by penalizing the graph regularization and nuclear norm, and in most of the cases, the final rank is 2 can be achieved within a wide range of parameters. The time series plots of original EEG signal, corrupted EEG signal, and EEG signal recovered from the reconstructed sources from (8) by setting λ = 0.02 and β = 0.01, as well as the topoplots at 42 ms is given in Fig.11. To illustrate again the impact of the graph regularization, α is chosen from {0.01, 0.02, 0.05, 0.1, 0.5}. The first row is the EEG data generated from the task-related sources (persistent and low-rank), the second row is the EEG data from the task-related sources and the spurious source, the SNR is −0.72 dB, which means the energy of spurious source is slightly larger than the task-related sources. The 3rd row is the EEG data calculated from forward model after the source is reconstructed from our proposed model with α = 0.01. The 4th-7th row is the time series plots when α = 0.02, 0.05, 0.1, 0.5 respectively. The topoplots on the right part of Fig.11 is are sampled from 42 ms of the EEG data on the left.
D. Experiments 3: Comprehensive Comparison with Benchmark Algorithms
The purpose of previous numerical experiments is to validate each term of the goal function and to understand their properties. The trade-off between low-rankness, data fidelity, sparsity, temporal smooth are fully discussed by varying different parameters. In this part, a comprehensive study is conducted to compare the proposed algorithm with the popular ESI algorithms such as MNE [25], sLORETA [26], MCE and well as the state-of-art algorithm mixed-norm estimate (MxNE) [20].
We generated independent sources in different ROIs for easy validation purpose, the number of independent sources varied from 2 to 4 corresponding to different rank of ground-truth source, and the number of spurious source are generated by randomly activating the sources on the cortex with a random scalar whose mean value to be 0, and the variance is 1. Moreover, the noise on sensor level is also added to the EEG data. Two of the MCE algorithms are selected, which are Homotopy and FISTA [51]. For MxNE algorithm, we choose ℓ 2,1 to enforce ℓ2 norm on the temporal time series of each voxel and ℓ1 norm to impose spatial sparsity.
To measure the performance, we introduced 5 metrics, including 1) CPU time in seconds, 2) rank of the calculated source, 3) Sparsity, measuring the number of nonzero elements in the source space at each time point, 5) Reconstruction Error (RE) defined in Eq.(24), 5) Localization Error (LE), which is calculated using the shortest path algorithm over the irregular meshes from the reconstructed source location to the ground truth location. The LE metric is the most important one, since it measures the discrepancy in location, the other metrics give information of the property of the rendered solution. To calculate LE for each ROI with activated sources, we first locate source with the largest activation magnitude in this ROI, and calculate the shortest path distance from the located source to the ground truth location. We conduct the same procedure for all the activated ROI, and calculate the average value of all the distances at each time point. The final LE is the averaged distance value for all the 500 time points for each experiments.
All the algorithms are implemented in Matlab except MxNE which we call as a Python script (MultiTaskLasso.py) under the linear model of scikit-learn library [52] from Matlab. The formulation for MxNE is
We found that parameter tuning process for MxNE algorithm is very time consuming, even though there is only one parameter for the unweighted version. If the parameter γ in Eq.(25) is tuned for one experiment with good performance, and we use the same setting of parameter and the same setting to generate random source activation patterns, the reconstructed source matrix can be a zero matrix. We tuned the parameter γ according to the best LE performance when the rank is 2 and the number of spurious activated source is 2. By conducting experiments for parameter tuning of MxNE, we set γ in Eq.(25). For our proposed algorithm, we set λ = 0.01 and β = 0.01, which were tested to have good performance for the same case when the rank is 2 and the number of spurious activated source is 2, and the graph parameter α is also set to be 0.01. 10 experiments were conducted under the same setting and the performance of all the algorithms are summarized in Table I to Table III. The SNR is calculated after the noise signal is generated and it was averaged from 10 experiments under the same experiment setting. As can be seen from the tables, our algorithm is the most accurate to locate the task related activated source. The CPU time of our algorithm is between Homotopy and FISTA algorithm. The running time for Python version of MxNE is much faster than our proposed algorithm, but there are many cases that MxNE algorithm failed, thus making the overall accuracy drop significantly. Although our algorithm have more parameter, it is a more sophisticated model that allows better controllability to customize the weight of different terms in the goal function. Although MxNE has large RE and LE, it is still worth noting that the MxNE algorithm we used is a simple version with ℓ2,1 norm discussed in Ref. [20], and the algorithm to solve the goal function 25 is coordinate descent, other algorithms can be tested to solve the same problem. MCE model solve using ℓ1 algorithms Homotopy and FISTA can render good LE accuracy, and Homotopy outperforms FISTA in all the experiment with better speed, which confirms the comparison discussed in [40] [48].
VI. Conclusion and Future Work
In this paper, we propose to consider the noise not only in the sensor level, but also in the source space. we come up with an EEG source imaging model based on temporal graph structures and low-rank representation. The model is solved with our proposed algorithm based on ADMM. Numerical experiments are conducted to verify the effectiveness of the proposed work on discovering task related low-rank source extents. We delineate the discussion on the properties and impacts for each term in the cost function to help better understand our proposed graph regularized low rank representation model. Compared the traditional model, our proposed one can find the intrinsic task related activation patterns and suppress the spurious source patterns.