
NeuroImage

Volume 30, Issue 1, March 2006, Pages 160-171

Data-driven parceling and entropic inference in MEG

https://doi.org/10.1016/j.neuroimage.2005.08.067

Abstract

In Amblard et al. [Amblard, C., Lapalme, E., Lina, J.M., 2004. Biomagnetic source detection by maximum entropy and graphical models. IEEE Trans. Biomed. Eng. 51 (3), 427–442], the authors introduced the maximum entropy on the mean (MEM) as a methodological framework for solving the magnetoencephalography (MEG) inverse problem. The main component of the MEM is a reference probability density that enables one to include any kind of prior information on the source intensity distribution to be estimated. This reference law also encompasses the definition of a model. We consider a distributed source model together with a clustering hypothesis that assumes functionally coherent dipoles. The reference probability distribution is defined through a prior parceling of the cortical surface. In this paper, we present a data-driven approach for parceling the cortex into functionally coherent regions. Based on the recently developed multivariate source prelocalization (MSP) principle [Mattout, J., Pelegrini-Issac, M., Garnero, L., Benali, H., 2005. Multivariate source prelocalization (MSP): Use of functionally informed basis functions for better conditioning the MEG inverse problem. NeuroImage 26 (2), 356–373], the data-driven clustering (DDC) of the dipoles provides an efficient parceling of the sources as well as an estimate of the parameters of the initial reference probability distribution. On simulated MEG data, the DDC is shown to further improve the MEM inverse approach, as evaluated with two different iterative algorithms using classical error metrics as well as ROC (receiver operating characteristic) curve analysis. The MEM solution is also compared to a LORETA-like inverse approach. The data-driven clustering makes it possible to take full advantage of the MEM formalism, whose main strengths lie in the flexible probabilistic way of introducing priors and in the notion of spatially coherent regions of activation. The latter reduces the dimensionality of the problem and thereby narrows the gap between the two main families of inverse methods, the popular dipolar approaches and the distributed ones.

Introduction

Non-invasive characterization of human brain activity remains one of today's most challenging issues in terms of image and signal processing. In particular, inferring the cortical source locations and intensities of a signal acquired outside the head, such as that given by magnetoencephalography (MEG) and electroencephalography (EEG), is an ill-posed inverse problem. This situation still requires further investigation and methodological development. Solving this problem is of great importance, since MEG and EEG are the only neuroimaging techniques whose data directly reflect the electromagnetic nature of neuronal activity. MEG and EEG provide an instantaneous measure of whole-brain activity, which is not the case for functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). Over the years, a large number of articles have been published regarding the physics and the models explaining MEG and EEG signal generation. We refer the reader to Mosher et al. (1999) for a thorough review of the so-called "MEEG forward problem". In the present paper, given a forward solution, we address the MEG inverse problem, which consists in localizing the brain current sources that generate the magnetic field measured outside the head. Despite an ongoing controversy about the respective localization powers of EEG and MEG (Barkley, 2004), MEG is known to be much less sensitive to the anisotropy of the skull electric conductivity, hence providing a more focal field topography and more tractable forward solutions (Baillet et al., 2001). Nevertheless, the proposed methodology can be equally well applied to MEG and EEG data.

During the past two decades, two families of methods have been proposed for solving the MEG/EEG inverse problem. The first approaches consist in looking for a few putative active regions (typically fewer than five), each of them being modeled by an equivalent current dipole (ECD) (Scherg and von Cramon, 1986). These methods are still very popular because they are easy to use and provide solutions that are easy to compare and interpret. However, they rely heavily upon the user's prior experience for specifying the number of active regions, whereas the more recent distributed methods do not require such a strong prior (Dale and Sereno, 1993). The latter type of approach consists in distributing a large number of dipoles with known locations (and orientations) all over the cortical surface. Such a model is usually based upon each subject's segmented structural MRI, hence making MEG/EEG much closer to a true imaging technique. A reasonably dense cortical description enables one to locate and to fit the spatial extension of each activated area, down to the limit of the MEG spatial resolution. However, using distributed methods makes the inverse problem highly degenerate because of the large number of unknown parameters (the dipole moments). External quantitative priors are needed to derive a unique and realistic inverse solution. Most of the existing distributed solutions differ according to the type of constraints they incorporate. Various constraints such as anatomical, physiological, mathematical and functional prior knowledge have been considered so far (Hämäläinen and Ilmoniemi, 1994, Gorodnitsky et al., 1995, Baillet and Garnero, 1997, Liu et al., 1998, Dale et al., 2000, Phillips et al., 2002). One of the most popular distributed approaches is the low resolution electromagnetic tomography algorithm (LORETA), also referred to as the smoothest inverse solution (Pascual-Marqui et al., 1994) (see Section 1.2).

In this paper, we introduce a new methodological framework that encompasses both types of inverse approach. Through a flexible and realistic distributed representation of the solution space, the present approach combines the advantages of a distributed model with the notion of regional activity represented by a reduced set of common parameters, as achieved by ECD models.

Considering a distributed source model made of n dipoles of fixed location and orientation (perpendicular to the cortical surface (Dale and Sereno, 1993)), the generative model for a single time sample of MEG data is given by the linear model

\[ M = GQ + \epsilon, \tag{1} \]

where M represents the measurements (a random vector on the d sensors), Q the multivariate random variable made of the n dipole amplitudes and ε an additive measurement noise, usually described by a zero-mean Gaussian independent and identically distributed (i.i.d.) process. The d × n matrix G is known as the lead-field operator and embodies the physics that links the dipole intensities (Q) to the magnetic fields (M). It is computed by solving Maxwell's equations of electromagnetism using a particular model of the physical and geometrical properties of the head tissues (Mosher and Leahy, 1999). Localizing the brain sources of MEG activity using a distributed model thus consists in estimating Q by inverting Eq. (1). Because of the large number of dipoles compared to the number of sensors (a few thousand versus a few hundred), this system is highly underdetermined. It is moreover unstable due to the noise that corrupts the data. Therefore, regularization is needed to derive a unique and stable solution. A regularized solution is the source parameter distribution that satisfies a given trade-off between the goodness of data fit and a prior model (external prior or regularization term).
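To make the generative model concrete, the following minimal Python sketch simulates one time sample of sensor data from Eq. (1). The random matrix standing in for the lead field G, the patch location and the noise level are assumptions made purely for illustration; in practice G is computed from a head model.

```python
# Toy simulation of the linear generative model M = G Q + eps (Eq. (1)).
# The lead field G is a random stand-in; in practice it is obtained by
# solving Maxwell's equations for a given head model.
import numpy as np

rng = np.random.default_rng(0)

d, n = 150, 4000                 # sensors vs. dipoles: a few hundred vs. a few thousand
G = rng.standard_normal((d, n)) / np.sqrt(n)   # placeholder lead-field operator

q_true = np.zeros(n)             # dipole amplitudes, all silent except one patch
q_true[1000:1020] = 5.0          # an active patch of 20 neighboring dipoles (arbitrary units)

sigma = 0.1                      # noise standard deviation (assumed)
eps = sigma * rng.standard_normal(d)
m = G @ q_true + eps             # one simulated time sample of the d sensor measurements
```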

Goodness of data fit can be expressed as a strict data attachment constraint, so that the ensemble C_m of the putative solutions q is defined by

\[ C_{m} = \{ q \;\text{s.t.}\; Gq = m \}, \]

where m is the given realization of the observation variable. A weaker constraint may be considered by allowing a maximum variance δ, so that

\[ C_{m,\delta} = \{ q \;\text{s.t.}\; \| Gq - m \|^{2} < \delta \}. \]

Then, in principle, if a solution exists, one can use the Penrose pseudo-inverse operator G⁺ of G in order to compute a solution q* = G⁺m in C_{m,δ} when δ → 0. G⁺ verifies G G⁺ G = G and is given by

\[ G^{+} = \lim_{\lambda \to 0} G^{t} (G G^{t} + \lambda I)^{-1}, \]

where I indicates the d × d identity matrix. In practice, such an exact solution is highly unstable and not even computable because of the ill-conditioned matrix to be inverted. In addition to the goodness of data fit constraint, a regularization criterion U(q) is then introduced, so that q* is now defined as \(\operatorname{Argmin}_{q \in C_{m,\delta}} U(q)\). Any such regularized inverse approach is then characterized by the type of prior knowledge or particular hypotheses expressed through the choice of U(q).

The classical minimum L2-norm criterion uses U(q) = ||q||² and the regularized solution is of the form q* = G⁺_MN m with

\[ G^{+}_{\mathrm{MN}} = G^{t} (G G^{t} + \lambda I)^{-1}, \]

λ being set to a finite and strictly positive value. LORETA, another well-known regularization method, enforces a quadratic smoothness constraint between neighboring sources in space with

\[ U(q) = q^{t} W q, \]

where the metric W represents the spatial Laplacian. The optimal solution q* = G⁺_Lo m is given by

\[ G^{+}_{\mathrm{Lo}} = W^{-1} G^{t} (G W^{-1} G^{t} + \lambda I)^{-1}. \]
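Both operators are straightforward to implement. The sketch below follows the two formulas above; the smoothness metric W is replaced by a one-dimensional Laplacian placeholder, since the true cortical-mesh Laplacian would require the triangulated surface.

```python
# Minimum L2-norm and LORETA-like linear inverse operators, following the
# formulas above. W is a crude 1-D Laplacian stand-in for the cortical-mesh
# spatial Laplacian used by LORETA.
import numpy as np

def minimum_norm_inverse(G, lam):
    """G_MN^+ = G^t (G G^t + lam I)^-1, with lam > 0 finite."""
    d = G.shape[0]
    return G.T @ np.linalg.inv(G @ G.T + lam * np.eye(d))

def loreta_like_inverse(G, W, lam):
    """G_Lo^+ = W^-1 G^t (G W^-1 G^t + lam I)^-1."""
    d = G.shape[0]
    W_inv = np.linalg.inv(W)
    return W_inv @ G.T @ np.linalg.inv(G @ W_inv @ G.T + lam * np.eye(d))

def laplacian_metric(n):
    """Symmetric positive-definite 1-D Laplacian, a placeholder for W."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Example usage on simulated data (G, m):
#   q_mn = minimum_norm_inverse(G, 1e-2) @ m
#   q_lo = loreta_like_inverse(G, laplacian_metric(G.shape[1]), 1e-2) @ m
```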

As in Eq. (3), the regularization parameter λ tunes the relative weight of the two constraints: the goodness of the data fit (q* ∈ C_{m,δ}) and the smoothness prior (U(q*) ≃ 0). Besides the regularized L2-minimum norm solutions, the L1-minimum norm has also been investigated and recently evaluated with distributed current models (Uutela et al., 1999). This so-called minimum current estimate (MCE) is based on

\[ U(q) = \sum_{i} w_{i} |q_{i}|, \]

where the w_i's are given weights for each dipole. As expected, this constraint leads to more focal sources than the usual L2-minimum norm regularization scheme. The present work compares those techniques with a thermodynamical (or informational) approach to this inverse problem. The proposed formalism is based on a probabilistic description of the whole system and on the notion of averaged measurements. Close to the Bayesian techniques (Mohammad-Djafari and Demoment, 1988, and references therein), the entropic approach (Le Besnerais et al., 1999) provides a methodological framework that is quite appropriate for the study of such underdetermined complex systems.
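One simple way to minimize the weighted L1 criterion numerically is iterative soft thresholding. The sketch below is an assumed illustration of that generic scheme, not the original MCE algorithm of Uutela et al. (1999); the step size rule and iteration count are ad hoc choices.

```python
# ISTA sketch for the weighted L1 (minimum current) criterion
# U(q) = sum_i w_i |q_i| combined with the quadratic data-fit term.
import numpy as np

def mce_ista(G, m, w, lam=1e-3, n_iter=200):
    q = np.zeros(G.shape[1])
    step = 1.0 / np.linalg.norm(G, 2) ** 2   # 1 / Lipschitz constant of the data-fit gradient
    for _ in range(n_iter):
        z = q - step * (G.T @ (G @ q - m))   # gradient step on (1/2)||Gq - m||^2
        q = np.sign(z) * np.maximum(np.abs(z) - lam * step * w, 0.0)  # soft threshold
    return q

# Example: uniform weights give the plain L1 penalty
# q_mce = mce_ista(G, m, w=np.ones(G.shape[1]))
```

As the soft-thresholding step zeroes out small amplitudes at every iteration, the estimate is sparse, which is the mechanism behind the more focal sources mentioned above.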

In Amblard et al. (2004), the authors revisited the probabilistic approach based on the maximum entropy on the mean (MEM) principle, introduced early on by Clarke and Janday in the context of the biomagnetic inverse problem (Clarke and Janday, 1989). Apart from providing a promising alternative to the well-known Bayesian probabilistic methodology, this approach also improves upon the traditional distributed source model. In particular, the notion of region of activation is introduced in the model and characterized in terms of hidden random variables. This description reduces the dimension of the problem and contributes to its overall regularization. The crucial point, however, lies in the definition of such regions. In Amblard et al. (2004), the general methodology was introduced, and regions of activation were arbitrarily defined, both in terms of location and spatial extension. In the present paper, we address this critical issue and propose to further improve the MEM-based inverse reconstruction by following an empirical Bayesian-like approach, using the functional data themselves to drive a parceling of the cortex into anatomically and functionally coherent regions. To do so, we apply the multivariate source prelocalization (MSP) (Mattout et al., 2005), which also allows one to set some activation parameters associated with each cortical parcel.

In Eq. (1), the random variable (Q, ε) is described by the joint probability density dP(q, e). Under the reasonable assumption that the noise and the signal due to the cortical sources are statistically independent, this density can be written as the product of the marginals dP(q) dν(e). Averaging Eq. (1) over dP leads to

\[ G\, E_{P}(Q) = m, \tag{7} \]

where E_P(Q) indicates the expectation value of Q and m the expected observation. In practice, m is approximated by the evoked magnetic response, which corresponds to the observation averaged over a large number of repetitions of the same task or stimulation. Then, as shown in Amblard et al. (2004), the MEM formalism is a suitable approach for solving the MEG inverse problem stated as follows: what is the optimal probability density dP on the sources such that Eq. (7) is satisfied?

Although it is not explicitly modeled in Eq. (7), it is worth noticing that the MEM approach implicitly accounts for the noise through the joint probability density dP. Noise is taken into account in the regularization process as a second-order statistical contribution. The solution space is now the set C_m of all the probability distributions that satisfy Eq. (7), i.e.,

\[ C_{m} = \{ dP(q,e) = f(q,e)\, dP_{\mathrm{ref}}(q)\, d\nu(e) \;\text{s.t.}\; G\, E_{P}(Q) = m \}, \]

where the source probability dP is expressed in terms of a reference or prior probability distribution dP_ref(q) and a density f(q, e). As in any regularization approach, dP_ref(q) includes all the prior knowledge on the source distribution. It corresponds to the prior probability distribution in a Bayesian formalism (Baillet and Garnero, 1997). This reference probability distribution is thus essential: it defines the initial model from which the data are fitted. Moreover, anticipating the next sections, we note that, similarly to the empirical Bayes framework, we may use the data to infer some properties and features of the reference probability dP_ref(q). This is one of the main contributions of this paper. Given the reference dP_ref(q), the MEM solution is the probability dP* that maximizes the Shannon entropy for which dP_ref is the zero (the rest state in the absence of any constraint):

\[ dP^{*} = \operatorname{Argmax}_{dP \in C_{m}} S_{\mathrm{ref}}(dP), \quad \text{where } S_{\mathrm{ref}}(dP_{\mathrm{ref}}) = 0, \]

with

\[ S_{\mathrm{ref}}(dP) = - \int f(q,e) \log f(q,e)\, dP_{\mathrm{ref}}(q)\, d\nu(e). \]

We briefly recall in Appendix A the derivation of dP* = dP*(q) dν(e), which ultimately leads to the source estimate

\[ q^{*} = E_{P^{*}}(Q). \]

This solution explains the data on average and maximizes the information carried by the underlying probability density. One should furthermore emphasize that although the zero-entropy reference dP_ref can be viewed as a prior density, it generally expresses initial assumptions that may need to be modified in order to improve the goodness of data fit. It is therefore natural to somehow exploit the informational content of the data in order to define dP_ref. Given such a 'zero entropy' reference, the source estimate is given by the following expression (see Appendix A):

\[ q^{*} = \nabla_{\xi} F^{*}_{Q}(\xi) \big|_{\xi = G^{t} \lambda^{*}}, \]

where

\[ F^{*}_{Q}(\xi) = \log \int \exp(\xi^{t} q)\, dP_{\mathrm{ref}}(q) \]

is the free energy of the distributed model, and

\[ \lambda^{*} = \operatorname{Argmax}_{\lambda} \left( \lambda^{t} m - F^{*}_{Q}(G^{t}\lambda) - \frac{\sigma^{2}}{2} \lambda^{t} \lambda \right) \]

is the optimal Lagrange parameter, the σ² contribution being the free energy from the Gaussian noise of variance σ².
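As a worked special case, the dual maximization above admits a closed form when dP_ref is Gaussian, since the free energy of a Gaussian N(μ, Σ) is F(ξ) = ξᵗμ + ½ξᵗΣξ. This is an assumption made purely for illustration; with the parcel-based mixture reference used in this paper, the dual criterion must instead be maximized numerically.

```python
# MEM dual maximization for a Gaussian reference law N(mu, Sigma), where
#   F_Q(xi) = xi^t mu + 0.5 * xi^t Sigma xi,
# so that maximizing D(lambda) = lambda^t m - F_Q(G^t lambda) - (sigma2/2) lambda^t lambda
# reduces to a linear system.
import numpy as np

def mem_gaussian(G, m, mu, Sigma, sigma2):
    d = G.shape[0]
    # Stationarity of D(lambda): (G Sigma G^t + sigma2 I) lambda* = m - G mu
    lam_star = np.linalg.solve(G @ Sigma @ G.T + sigma2 * np.eye(d), m - G @ mu)
    # q* = grad F_Q evaluated at xi = G^t lambda*
    return mu + Sigma @ (G.T @ lam_star)
```

Note that this Gaussian special case recovers a familiar regularized linear estimate, which shows how the MEM framework contains the L2-type solutions while allowing far richer, non-Gaussian reference laws.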

This paper is organized as follows. Section 2 is devoted to the MEM formalism and describes the various derived algorithms, in particular the iterative MEM algorithms. In Section 3, the data-driven clustering (DDC) method is introduced together with the strategy for initializing these algorithms. The proposed methodology constitutes the main original contribution of this work. It has been evaluated on simulated MEG data according to the procedure described in Section 4.1. The results are presented in Section 4.3, and both the new inverse approach and its performance are discussed in the last section.

Section snippets

The reference probability law

As in Amblard et al. (2004), we here define the notion of region of activation by considering that the cortical surface is divided into K parcels. Each parcel is made of one or several neighboring dipoles of the distributed model that are expected to share a common functional behavior. The activation state of a given parcel k is then defined by a binary hidden random variable Sk. The zero state Sk = 0 signals the absence of any active dipole in parcel k, whereas Sk = 1 (the active state) indicates
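A minimal sketch of how such a parcel-based reference law could be sampled is given below. The activation probabilities alpha and the Gaussian amplitude moments (mu, sd) are placeholders introduced for illustration only; in the paper these quantities are set from the data via the MSP-driven clustering.

```python
# Illustrative sampler for a parcel-based reference law: each parcel k carries
# a hidden binary state S_k, and dipoles of an active parcel draw coherent
# Gaussian amplitudes.
import numpy as np

rng = np.random.default_rng(1)

def sample_reference(parcels, alpha, mu, sd, n):
    """parcels: list of dipole-index arrays; alpha[k] = P(S_k = 1)."""
    q = np.zeros(n)
    for k, idx in enumerate(parcels):
        if rng.random() < alpha[k]:                      # S_k = 1: parcel k is active
            q[idx] = rng.normal(mu[k], sd[k], size=len(idx))
    return q

# Example: two parcels over 10 dipoles
# q = sample_reference([np.arange(5), np.arange(5, 10)],
#                      alpha=[0.2, 0.8], mu=[0.0, 1.0], sd=[0.1, 0.1], n=10)
```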

Clustering and initialization

In this section, we define the various steps that determine the model used for the MEM reconstruction. In the spirit of an empirical Bayesian approach, the data are used to set some prior information through the reference probability distribution. In particular, the parcels need to be defined, and initial values of each parcel's parameters have to be set. More specifically, we construct the parcels around selected centers of activation. A way of selecting such centers of activation is described

The MEG data simulation

Since the MEG sources are widely believed to be restricted to the pyramidal neurons of the cortical strip (Nunez and Silberstein, 2000), a common approach within the distributed model framework consists in constraining the dipoles to lie on the cortical surface extracted from a structural MRI (Dale and Sereno, 1993). After segmentation of the MRI volume, dipoles are typically located at each node of a triangular mesh of the white/grey matter interface (Mangin, 1995).

Conclusion

In this paper, we further investigated and assessed the usefulness of the maximum entropy on the mean (MEM) principle for regularizing and solving the MEG inverse problem. The proposed framework is a general probabilistic approach which proved suitable for introducing the multiple pieces of prior knowledge needed on the solution. These constraints can be specified in a flexible way, since their nature and form may be of any kind, provided that they can be expressed in terms of probability

Acknowledgments

The authors would like to thank the anonymous referees for their useful and constructive comments, which helped clarify the presentation and the ideas contained in this paper. They thank Line Garnero (LENA, France) for providing support and MEG data. J.-M.L. would like to thank the NSERC program for financial support. E.L. would like to thank Philippe St-Jean for providing the computing resources used for the simulations presented in this work.

References

  • Amblard, C., Lapalme, E., Lina, J.-M., 2004. Biomagnetic source detection by maximum entropy and graphical models. IEEE Trans. Biomed. Eng.
  • Baillet, S., Garnero, L., 1997. A Bayesian approach to introducing anatomo-functional priors in the MEG-EEG inverse problems. IEEE Trans. Biomed. Eng.
  • Baillet, S., et al., 2001. Electromagnetic brain mapping. IEEE Signal Process. Mag.
  • Clarke, C.J.S., Janday, B.S., 1989. The solution of the biomagnetic inverse problem by maximum statistical entropy. Inverse Probl.