Data-driven parceling and entropic inference in MEG
Introduction
Non-invasive characterization of human brain activity remains one of the most challenging issues in image and signal processing. In particular, inferring the cortical locations and intensities of the sources of a signal acquired outside the head, as in magnetoencephalography (MEG) and electroencephalography (EEG), is an ill-posed inverse problem that still requires further methodological development. Solving this problem is of great importance since MEG and EEG are the only neuroimaging techniques whose data directly reflect the electromagnetic nature of neuronal activity. MEG and EEG provide an instantaneous measure of whole-brain activity, which is not the case for functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). A large number of articles have been published on the physics and the models explaining MEG and EEG signal generation; we refer the reader to Mosher et al. (1999) for a thorough review of the so-called "MEEG forward problem". In the present paper, given a forward solution, we address the MEG inverse problem, which consists in localizing the brain current sources of the magnetic field measured outside the head. Despite an ongoing controversy about the respective localization powers of EEG and MEG (Barkley, 2004), MEG is known to be much less sensitive to the anisotropy of the skull's electric conductivity, hence providing a more focal field topography and more tractable forward solutions (Baillet et al., 2001). Nevertheless, the proposed methodology applies equally well to MEG and EEG data.
During the past two decades, two families of methods have been proposed for solving the MEG/EEG inverse problem. The first approach consists in looking for very few putative active regions (typically fewer than five), each of them modeled by an equivalent current dipole (ECD) (Scherg and von Cramon, 1986). These methods are still very popular because they are easy to use and provide solutions that are easy to compare and interpret. However, they rely heavily upon the user's prior experience for specifying the number of active regions, whereas the more recent distributed methods do not require such a strong prior (Dale and Sereno, 1993). The latter type of approach consists of distributing a large number of well-located (and well-oriented) dipoles all over the cortical surface. Such a model is usually based upon each subject's segmented structural MRI, hence making MEG/EEG much closer to a true imaging technique. A reasonably dense cortical description enables one to locate and to fit the spatial extension of each activated area, down to the limit of the MEG spatial resolution. However, distributed methods make the inverse problem highly degenerate because of the large number of unknown parameters (the dipole moments). External quantitative priors are needed to derive a unique and realistic inverse solution. Most of the existing distributed solutions differ according to the type of constraints they incorporate. Various constraints, such as anatomical, physiological, mathematical and functional prior knowledge, have been considered so far (Hämäläinen and Ilmoniemi, 1994, Gorodnitsky et al., 1995, Baillet and Garnero, 1997, Liu et al., 1998, Dale et al., 2000, Phillips et al., 2002). One of the most popular distributed approaches is the low resolution tomography algorithm (LORETA), also referred to as the smoothest inverse solution (Pascual-Marqui et al., 1994) (see Section 1.2).
In this paper, we introduce a new methodological framework that bridges both types of inverse approach. Through a flexible and realistic distributed representation of the solution space, the present approach combines the advantages of distributed models with the notion of regional activity represented by a common reduced set of parameters, as achieved by ECD models.
Considering a distributed source model made of n dipoles of fixed location and orientation (perpendicular to the cortical surface (Dale and Sereno, 1993)), the generative model for a single time sample of MEG data is given by the linear model

M = GQ + ε,    (1)

where M represents the measurements (a random variable vector on the d sensors), Q the multivariate random variable made of the n dipole amplitudes, and ε an additive measurement noise, usually described by a zero-mean Gaussian independent and identically distributed (i.i.d.) process. The d × n matrix G is known as the lead-field operator, which embodies the physics that links each dipole intensity (Q) with the magnetic fields (M). It is computed by solving Maxwell's equations of electromagnetism using a particular model of the physical and geometrical properties of the head tissues (Mosher and Leahy, 1999). Localizing the brain sources of MEG activity using a distributed model thus consists of estimating Q by inverting Eq. (1). Because of the large number of dipoles compared to the number of sensors (a few thousand versus a few hundred), this system is highly underdetermined. It is moreover unstable due to the noise that corrupts the data. Regularization is therefore needed to derive a unique and stable solution. A regularized solution is the source parameter distribution that satisfies a given trade-off between the goodness of data fit and a prior model (external prior or regularization term).
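A minimal numerical sketch of the generative model of Eq. (1). A real lead-field G is obtained from a head model; here a random matrix stands in for it, and the dimensions, source patch and noise level are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 150, 4000  # sensors, cortical dipoles (typical orders of magnitude)
# Stand-in lead-field; in practice G comes from solving Maxwell's equations
# for a given head model.
G = rng.standard_normal((d, n)) / np.sqrt(n)

q = np.zeros(n)
q[100:120] = 1e-9  # a small patch of active dipoles (amplitudes in A·m)

sigma = 1e-12      # i.i.d. Gaussian sensor noise
m = G @ q + sigma * rng.standard_normal(d)  # Eq. (1): M = GQ + eps
```

With d ≪ n, many different q produce the same m, which is exactly the underdetermination discussed above.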
Goodness of data fit can be expressed as a strict data attachment constraint, so that the ensemble Cm of the putative solutions q is defined by

Cm = {q : Gq = m},    (2)

where m is the given realization of the observation variable. A weaker constraint may be considered through the allowance of a maximum variance δ, so that

Cm,δ = {q : ‖Gq − m‖² ≤ δ}.
Then, in principle, if a solution does exist, one can use the Penrose pseudo-inverse operator G+ of G in order to compute a solution q* = G+m in Cm,δ when δ → 0. G+ verifies G G+ G = G and is given by

G+ = lim_{λ→0} G^T (G G^T + λ I)^{−1},

where I indicates the d × d identity matrix. In practice, such an exact solution is highly unstable and not even computable because of the ill-conditioning of the matrix to be inverted. In addition to the goodness of data fit constraint, a regularization criterion U(q) is then introduced, so that q* is now defined as

q* = arg min_q [ ‖Gq − m‖² + λ U(q) ].    (3)

Any such regularized inverse approach is then characterized by the type of prior knowledge or particular hypotheses expressed through the choice of U(q).
The classical minimum-L2-norm criterion uses U(q) = ‖q‖² and the regularized solution is of the form q* = G_MN+ m with

G_MN+ = G^T (G G^T + λ I)^{−1},

λ being set to a finite and strictly positive value. LORETA, another well-known regularization method, enforces a quadratic smoothness constraint between neighboring sources in space, with

U(q) = ‖W q‖²,

where the metric W represents the spatial Laplacian. The optimal solution q* = G_Lo+ m is given by

G_Lo+ = (W^T W)^{−1} G^T [ G (W^T W)^{−1} G^T + λ I ]^{−1}.
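The two inverse operators above can be sketched as follows; this is a generic implementation of the standard formulas, not the exact code used in the paper, and the regularization value λ is illustrative:

```python
import numpy as np

def minimum_norm_operator(G, lam):
    """G_MN+ = G^T (G G^T + lam I)^-1  (regularized L2 minimum norm)."""
    d = G.shape[0]
    return G.T @ np.linalg.solve(G @ G.T + lam * np.eye(d), np.eye(d))

def loreta_operator(G, W, lam):
    """G_Lo+ = (W^T W)^-1 G^T [G (W^T W)^-1 G^T + lam I]^-1,
       with W the spatial smoothness metric (e.g. a surface Laplacian)."""
    d = G.shape[0]
    WtW_inv_Gt = np.linalg.solve(W.T @ W, G.T)  # (W^T W)^-1 G^T
    return WtW_inv_Gt @ np.linalg.solve(G @ WtW_inv_Gt + lam * np.eye(d), np.eye(d))
```

Note that with W = I, the LORETA operator reduces exactly to the minimum-norm operator, which makes the relation between the two constraints explicit.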
As in Eq. (3), the regularization parameter λ tunes the relative weight of the two constraints: the goodness of the data fit (q* ∈ Cm,δ) and the smoothness prior (U(q*) ≃ 0). Besides the regularized L2-minimum-norm solutions, the L1-minimum norm has also been investigated and recently evaluated with distributed current models (Uutela et al., 1999). This so-called minimum current estimate (MCE) is based on

U(q) = Σ_i w_i |q_i|,

where the w_i are given weights for each dipole. As expected, this constraint leads to more focal sources than the usual L2-minimum-norm regularization scheme. The present work compares those techniques with a thermodynamical (or informational) approach to this inverse problem. The proposed formalism is based on a probabilistic description of the whole system and the notion of averaged measurements. Close to Bayesian techniques (Mohammad-Djafari and Demoment, 1988, and references therein), the entropic approach (Le Besnerais et al., 1999) gives a methodological framework that is quite appropriate for the study of such underdetermined complex systems.
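The weighted-L1 criterion of the MCE has no closed-form solution, but it can be minimized by iterative soft thresholding. The sketch below (ISTA, a standard choice and an assumption here, not the solver used by Uutela et al.) minimizes ‖Gq − m‖²/2 + λ Σ_i w_i |q_i|:

```python
import numpy as np

def mce_ista(G, m, w, lam, n_iter=500):
    """Weighted-L1 (minimum current) estimate via iterative soft thresholding:
       minimizes ||G q - m||^2 / 2 + lam * sum_i w_i |q_i|."""
    L = np.linalg.norm(G, 2) ** 2  # Lipschitz constant of the data-fit gradient
    q = np.zeros(G.shape[1])
    for _ in range(n_iter):
        grad = G.T @ (G @ q - m)       # gradient of the quadratic data-fit term
        z = q - grad / L
        q = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)  # soft threshold
    return q
```

The soft-thresholding step zeroes out small amplitudes, which is why this constraint yields the more focal sources mentioned above.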
In Amblard et al. (2004), the authors revisited the probabilistic approach based on the maximum entropy on the mean (MEM) principle, introduced early on by Clarke and Janday in the context of the biomagnetic inverse problem (Clarke and Janday, 1989). Apart from providing a promising alternative to the well-known Bayesian probabilistic methodology, this recent approach also proposes an improvement to the traditional distributed source model. In particular, the notion of region of activation is introduced in the model and characterized in terms of hidden random variables. This description reduces the dimension of the problem and contributes to its overall regularization. The crucial point, however, lies in the definition of such regions. In Amblard et al. (2004), the general methodology was introduced, and regions of activation were arbitrarily defined, both in terms of location and spatial extension. In the present paper, we address this critical issue and propose to further improve the MEM-based inverse reconstruction by following an empirical Bayesian-like approach, using the functional data itself to drive a parceling of the cortex into anatomically and functionally coherent regions. To do so, we apply the multivariate source prelocalization (MSP) (Mattout et al., 2005), which also allows one to set the activation parameters associated with each cortical parcel.
In Eq. (1), the random variable (Q, ε) is described by a joint probability density dP(q, e). Under the reasonable assumption that the noise and the signal due to the cortical sources are statistically independent, this density can be written as the product of the marginals dP(q)dν(e). Averaging Eq. (1) over this density leads to

m̄ = G E_P(Q),    (7)

where E_P(Q) indicates the expectation of Q and m̄ the expected observation. In practice, m̄ is approximated by the evoked magnetic response, which corresponds to the observation averaged over a large number of repetitions of the same task or stimulation. Then, as shown in Amblard et al. (2004), the MEM formalism is a suitable approach for solving the MEG inverse problem stated as follows: what is the optimal probability density dP on the sources such that Eq. (7) is satisfied?
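The approximation of m̄ by the evoked response rests on the law of large numbers: averaging over trials shrinks the zero-mean noise by a factor of the square root of the number of trials. A small sketch (dimensions and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_trials = 150, 200
m_true = rng.standard_normal(d)  # stand-in for the noiseless topography G E_P(Q)

# Each trial is the same evoked topography plus independent sensor noise.
trials = m_true + 0.5 * rng.standard_normal((n_trials, d))
m_bar = trials.mean(axis=0)      # evoked response: approximates m_bar in Eq. (7)

# Residual noise std is roughly 0.5 / sqrt(n_trials) ~ 0.035
residual = np.abs(m_bar - m_true).std()
```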
Although it is not explicitly modeled in Eq. (7), it is worth noticing that the MEM approach implicitly accounts for the noise through the joint probability density dP(q, e): noise is taken into account in the regularization process as a second-order statistical contribution. The solution space is now the set Cm̄ of all the probability distributions that satisfy (7), i.e.,

Cm̄ = { dP : dP(q, e) = f(q, e) dPref(q) dν(e) and G E_P(Q) = m̄ },

where the source probability is expressed in terms of a reference, or prior, probability distribution dPref(q) and a density f(q, e). As in any regularization approach, dPref(q) includes all the prior knowledge on the source distribution. It corresponds to the prior probability distribution in a Bayesian formalism (Baillet and Garnero, 1997). This reference probability distribution is thus all the more essential: it defines the initial model from which the data are fitted. Moreover, anticipating the next sections, we note that, similarly to the empirical Bayes framework, we may use the data to infer some properties and features of the reference probability dPref(q). This is one of the main contributions of this paper. Given the reference dPref(q), the MEM solution is the probability that maximizes the Shannon entropy for which dPref is the zero (the rest state without any constraint),

dP* = arg max_{dP ∈ Cm̄} S(dP),

with

S(dP) = −∫ f(q, e) log f(q, e) dPref(q) dν(e).
We briefly recall in Appendix A the derivation of dP* that ultimately leads to the source estimate q̂ = E_{P*}(Q).
This solution explains the data on average and maximizes the information carried by the underlying probability density. One should furthermore emphasize that, although the zero-entropy distribution dPref can be viewed as a prior density, it generally expresses initial assumptions that may need to be modified in order to improve the goodness of data fit. It is therefore natural to exploit the informational content of the data in order to define dPref. Given such a 'zero entropy' reference, the source estimate is given by the following expression (see Appendix A):

q̂ = ∇_ξ F*(ξ) |_{ξ = G^T λ*},

where

F*(ξ) = log ∫ exp(ξ^T q) dPref(q)

is the free energy of the distributed model, and

λ* = arg max_λ [ λ^T m̄ − F*(G^T λ) − (σ²/2) ‖λ‖² ]

is the optimal Lagrange parameter, the σ² contribution being the free energy from the Gaussian noise of variance σ².
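As an illustration, for a Gaussian reference law dPref = N(μ, Σ) the free energy is quadratic, F*(ξ) = ξᵀμ + ξᵀΣξ/2, and the dual maximization above has a closed-form solution. The sketch below implements this special case only (it is not the parcel-based reference law developed in the next sections):

```python
import numpy as np

def mem_gaussian(G, m_bar, mu, Sigma, sigma2):
    """MEM estimate for a Gaussian reference law N(mu, Sigma).
       Setting the gradient of the dual criterion to zero gives
         lam* = (G Sigma G^T + sigma2 I)^-1 (m_bar - G mu),
       and the estimate is the gradient of the free energy at G^T lam*:
         q_hat = mu + Sigma G^T lam*."""
    d = G.shape[0]
    lam = np.linalg.solve(G @ Sigma @ G.T + sigma2 * np.eye(d), m_bar - G @ mu)
    return mu + Sigma @ G.T @ lam
```

With μ = 0 and Σ = I, this reduces to the regularized minimum-norm solution of Section 1, which shows how the choice of dPref governs the behavior of the MEM estimate.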
This paper is organized as follows. Section 2 is devoted to the MEM formalism and describes the various derived algorithms, in particular the iterative MEM algorithms. In Section 3, the data-driven clustering (DDC) method is introduced, together with the strategy for initializing the previous algorithms; this methodology is the main original contribution of this work. It has been evaluated on simulated MEG data according to the procedure described in Section 4.1. The results are presented in Section 4.3, and both the new inverse approach and its performance are discussed in the last section.
The reference probability law
As in Amblard et al. (2004), we here define the notion of region of activation by considering that the cortical surface is divided into K parcels. Each parcel is made of one or several neighboring dipoles of the distributed model that are expected to share a common functional behavior. The activation state of a given parcel k is then defined by a binary hidden random variable Sk. The zero state Sk = 0 indicates the absence of any active dipole in parcel k, whereas Sk = 1 (the active state) indicates that parcel k contains active dipoles.
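A parcel-wise reference law of this kind mixes, for each parcel, an inactive state (amplitudes at zero) with an active state drawn from a parcel-level distribution. The sketch below samples from such a prior; the Gaussian active-state law and all hyperparameter values are illustrative assumptions, not the exact parameterization of the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_parcel_prior(parcels, alpha, mu, tau):
    """Draw source amplitudes q: parcel k is active (S_k = 1) with
       probability alpha[k]; its dipoles then get N(mu[k], tau[k]^2)
       amplitudes, while inactive parcels stay at zero."""
    n = sum(len(p) for p in parcels)
    q = np.zeros(n)
    for k, idx in enumerate(parcels):
        if rng.random() < alpha[k]:  # hidden activation state S_k = 1
            q[list(idx)] = rng.normal(mu[k], tau[k], size=len(idx))
    return q

parcels = [range(0, 10), range(10, 30), range(30, 60)]
q = sample_parcel_prior(parcels, alpha=[0.9, 0.1, 0.5],
                        mu=[1.0, 0.0, 0.0], tau=[0.2, 0.2, 0.2])
```

Describing activity through K parcel states rather than n dipole amplitudes is what reduces the effective dimension of the problem.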
Clustering and initialization
In this section, we define the various steps that determine the model used for the MEM reconstruction. In the spirit of an empirical Bayesian approach, the data are used to set some of the prior information through the reference probability distribution. In particular, parcels need to be defined, and initial values of the parameters of each parcel have to be set. More specifically, we construct the parcels around selected centers of activation. A way of selecting such centers of activation is described below.
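One simple way to build parcels around data-selected centers is to take the dipoles with the highest MSP scores as seeds and grow regions over the cortical mesh neighborhood graph. The greedy breadth-first sketch below is a plausible illustration of this idea, not the paper's exact DDC algorithm:

```python
import numpy as np
from collections import deque

def grow_parcels(msp_score, neighbors, n_seeds, max_size):
    """Data-driven parceling sketch: the n_seeds dipoles with the highest
       MSP scores become parcel centers; parcels then grow breadth first,
       in lockstep, over the mesh adjacency graph up to max_size dipoles."""
    seeds = np.argsort(msp_score)[::-1][:n_seeds]
    label = -np.ones(len(msp_score), dtype=int)  # -1: not yet assigned
    queues = [deque([s]) for s in seeds]
    sizes = [0] * n_seeds
    for k, s in enumerate(seeds):
        label[s] = k
    active = True
    while active:
        active = False
        for k, qk in enumerate(queues):  # one growth step per parcel per round
            if not qk or sizes[k] >= max_size:
                continue
            v = qk.popleft()
            sizes[k] += 1
            for u in neighbors[v]:
                if label[u] == -1:
                    label[u] = k
                    qk.append(u)
            active = True
    return label
```

Growing all parcels in lockstep keeps their sizes balanced, so that each parcel remains a spatially compact, functionally coherent neighborhood around its MSP peak.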
The MEG data simulation
Since the MEG sources are widely believed to be restricted to the pyramidal neuron cells of the cortical strip (Nunez and Silberstein, 2000), a common approach within the distributed model framework consists in constraining the dipoles to be distributed onto the cortical surface extracted from a structural MRI (Dale and Sereno, 1993). After the segmentation of the MRI volume, dipoles are typically located at each node of a triangular mesh of the white/grey matter interface (Mangin, 1995).
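Fixing each dipole's orientation perpendicular to the cortical surface requires a normal vector at every mesh node. A common way to obtain it, sketched here as an assumption about the preprocessing rather than the paper's exact pipeline, is to average the area-weighted normals of the triangles sharing each node:

```python
import numpy as np

def vertex_normals(vertices, triangles):
    """Unit normal at each mesh node (dipole orientation), obtained by
       accumulating the area-weighted normals of the adjacent triangles."""
    normals = np.zeros_like(vertices)
    v0, v1, v2 = (vertices[triangles[:, i]] for i in range(3))
    face_n = np.cross(v1 - v0, v2 - v0)  # cross product: area-weighted face normals
    for i in range(3):
        np.add.at(normals, triangles[:, i], face_n)  # scatter-add to the 3 corners
    norm = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.clip(norm, 1e-12, None)
```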
Conclusion
In this paper, we further investigated and assessed the usefulness of the maximum entropy on the mean (MEM) principle for regularizing and solving the MEG inverse problem. The proposed framework is a general probabilistic approach which proved suitable for the introduction of the needed multiple prior knowledge on the solution. These constraints can be specified in a flexible way, since their nature and form may be of any kind, provided that they can be expressed in terms of probability distributions.
Acknowledgments
The authors would like to thank the anonymous referees for their useful and constructive comments, which helped clarify the presentation and the ideas contained in this paper. They thank Line Garnero (LENA, France) for providing support and MEG data. J.-M.L. thanks the NSERC program for financial support. E.L. thanks Philippe St-Jean for providing access to the computing resources used for the simulations presented in this work.
References
- Amblard et al. (2004). Biomagnetic source detection by maximum entropy and graphical models. IEEE Trans. Biomed. Eng.
- Baillet and Garnero (1997). A Bayesian approach to introducing anatomo-functional priors in the MEG-EEG inverse problems. IEEE Trans. Biomed. Eng.
- Baillet et al. (2001). Electromagnetic brain mapping. IEEE Signal Process. Mag.
- Barkley (2004). Controversies in neurophysiology. MEG is superior to EEG in localization of interictal epileptiform activity. Prog. Clin. Neurophysiol.
- Clarke and Janday (1989). The solution of the biomagnetic inverse problem by maximum statistical entropy. Inverse Probl.
- Dale et al. (2000). Dynamic statistical parametric mapping: combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron.
- et al. (2002). An fMRI constrained MEG source analysis with procedures for dividing and grouping activation. NeuroImage.
- Gorodnitsky et al. (1995). Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. Electroencephalogr. Clin. Neurophysiol.
- Mattout et al. (2005). Multivariate source prelocalization (MSP): use of functionally informed basis functions for better conditioning the MEG inverse problem. NeuroImage.
- Pascual-Marqui et al. (1994). Low resolution electromagnetic tomography: a new method for localizing electrical activity in the brain. Int. J. Psychophysiol.
- et al. (2005). Bayesian fMRI time-series analysis with spatial priors. NeuroImage.
- Phillips et al. (2002). Anatomically informed basis functions for EEG source localization: combining functional and anatomical constraints. NeuroImage.
- Scherg and von Cramon (1986). Evoked dipole source potentials of the human auditory cortex. Electroencephalogr. Clin. Neurophysiol.
- Uutela et al. (1999). Visualization of magnetoencephalographic data using minimum current estimates. NeuroImage.