Elsevier

NeuroImage

Volume 21, Issue 4, April 2004, Pages 1732-1747
Multilevel linear modelling for FMRI group analysis using Bayesian inference

https://doi.org/10.1016/j.neuroimage.2003.12.023

Abstract

Functional magnetic resonance imaging studies often involve the acquisition of data from multiple sessions and/or multiple subjects. A hierarchical approach can be taken to modelling such data, with a general linear model (GLM) at each level of the hierarchy introducing different random effects variance components. Inference on these models is nontrivial, and frequentist solutions are unavailable. A solution is to use a Bayesian framework. One important ingredient in this is the choice of prior on the variance components and top-level regression parameters. Due to the typically small numbers of sessions or subjects in neuroimaging, the choice of prior is critical. To alleviate this problem, we introduce to neuroimage modelling the approach of reference priors, which drives the choice of prior such that it is noninformative in an information-theoretic sense. We propose two inference techniques at the top level for multilevel hierarchies (a fast approach and a slower, more accurate approach). We also demonstrate that we can infer on the top level of multilevel hierarchies by inferring on the levels of the hierarchy separately and passing summary statistics of a noncentral multivariate t distribution between them.

Introduction

Functional magnetic resonance imaging studies are typically used to address questions about activation effects in populations of subjects. This generally involves a multisubject and/or multisession approach in which data are analysed so as to allow hypothesis tests at the group level (Holmes and Friston, 1998; Worsley et al., 2002), for example, to assess whether the observed effects are common and stable across or between groups of interest.

Calculating the level and probability of brain activation for a single subject is typically achieved using a linear model of the signal together with a Gaussian noise model for the residuals. This model is commonly called the general linear model (GLM), and much attention to date has been focussed on ways of modelling and fitting the (time series) signal and residual noise at the individual single-session level (Bullmore et al., 1996; Woolrich et al., 2001; Worsley and Friston, 1995).

To be able to generate results that extend to the population, we also need to account for the fact that the individual subjects themselves are sampled from the population and thus are random quantities with associated variances. It is exactly this step that marks the transition from a simple fixed-effects model to a mixed-effects model, and it is imperative to formulate a model at the group level that allows for the explicit modelling of these additional variance terms (Frison and Pocock, 1992; Holmes and Friston, 1998).
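
The difference between the two models can be illustrated with a small numerical sketch. The numbers below are made up for illustration, and the mixed-effects variance uses the simple ordinary least squares summary-statistics estimate (sample variance of the subject estimates divided by the number of subjects), not the Bayesian inference developed in this paper.

```python
from statistics import mean, variance

# Hypothetical first-level effect estimates for eight subjects, and the
# (assumed known) variances of those estimates. Illustrative values only.
effects = [1.2, 0.4, 2.1, 0.9, 1.6, -0.3, 1.1, 1.8]
within_var = [0.05, 0.04, 0.06, 0.05, 0.05, 0.04, 0.06, 0.05]

n = len(effects)
group_mean = mean(effects)

# Fixed effects: only the within-subject variances contribute to the
# uncertainty of the group mean.
fixed_var = sum(within_var) / n**2

# Mixed effects (OLS summary statistics): the sample variance of the
# subject estimates reflects within- plus between-subject variability.
mixed_var = variance(effects) / n

fixed_t = group_mean / fixed_var**0.5
mixed_t = group_mean / mixed_var**0.5

print(f"group mean       : {group_mean:.3f}")
print(f"fixed-effects t  : {fixed_t:.2f}")
print(f"mixed-effects t  : {mixed_t:.2f}")
```

With these values the fixed-effects statistic is much larger than the mixed-effects one, because it ignores the spread of the subjects around the group mean and therefore only supports conclusions about the subjects actually scanned.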

We can formulate the problem of group statistics in neuroimaging as being hierarchical (Beckmann et al., 2003; Friston et al., 2002). For example, the different levels of the hierarchy could be separate GLMs for a session level, subject level and group level. In this paper, we attempt to deal with inference on these multilevel GLM hierarchies by utilising a fully Bayesian framework. Typically, the most important inference is at the top level of the hierarchy; for example, we may be looking for significance of a group mean. Whether we are looking to infer at the top level with the within-session FMRI time series data (Friston et al., 2002) or with summary statistic results from the level below (Holmes and Friston, 1998; Worsley et al., 2002), a fully Bayesian approach provides us with the means to assess the full uncertainty in the parameter of interest (contrasts of regression parameters) at the top level, taking into account all of the unknown variance components (fixed and random) in the model.
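
As a sketch of what such a hierarchy looks like, a two-level model can be written as below. The notation is an assumption made here for illustration (it follows common usage in this literature and may differ from the symbols used later in the paper), and temporal autocorrelation at the first level is ignored for simplicity.

```latex
% First level: time series Y_k of subject (or session) k, k = 1, ..., K
Y_k = X_k \beta_k + \epsilon_k, \qquad \epsilon_k \sim N(0, \sigma_k^2 I)

% Second level: stacked first-level parameters \beta = (\beta_1^T, ..., \beta_K^T)^T
\beta = X_G \beta_G + \eta, \qquad \eta \sim N(0, \sigma_G^2 I)

% Substituting the second level into the first gives the all-in-one model,
% with X = blockdiag(X_1, ..., X_K) and Y, \epsilon stacked over k
Y = X X_G \beta_G + X \eta + \epsilon
```

Here the \sigma_k^2 are the within-subject (fixed-effects) variances and \sigma_G^2 is the between-subject (random-effects) variance; inference at the top level concerns contrasts of \beta_G.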

Bayesian statistics provides the only generic tool for inferring model parameter probability distribution functions from data. It provides strict rules for the rational and consistent adjustment of belief (in the form of probability density functions) in the presence of new information (Cox, 1946), which are not available in the frequentist literature. The major consequences of this are twofold. First, we may make inference about the absolute value of the parameters of interest, that is, we may ask questions of our parameters such as, “What is the probability that θ lies in the interval [θ0, θ1]?”, a question unavailable to any frequentist technique. Frequentist statistics is typically limited to posing questions of the data under the “null hypothesis” that the parameter value is zero. Inference in a frequentist framework is then limited to the simple acceptance or rejection of this null hypothesis without being able to make any statement about the parameter values. Second, Bayesian statistics gives us a tool for inferring on any model we choose and guarantees that uncertainty will be handled correctly. Only in certain special cases (not including the model presented here) is it possible to derive analytical forms for the null distributions required by frequentist statistics. In their absence, frequentist solutions rely on null distributions derived from the data (e.g., permutation tests), losing the statistical power gained from educated assumptions about, for example, the distribution of the noise.
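
The interval question above is straightforward to answer once a posterior distribution is in hand. The following minimal sketch assumes, purely for illustration, a Gaussian posterior with made-up mean and standard deviation; the marginal posteriors considered in this paper are t distributions, and the function name is hypothetical.

```python
from statistics import NormalDist


def interval_probability(post_mean, post_sd, lower, upper):
    """P(lower < theta < upper | data) for an assumed Gaussian posterior."""
    posterior = NormalDist(mu=post_mean, sigma=post_sd)
    return posterior.cdf(upper) - posterior.cdf(lower)


# Illustrative posterior: mean 1.0, standard deviation 0.5.
p_interval = interval_probability(1.0, 0.5, lower=0.0, upper=2.0)

# Probability that the parameter is positive under the same posterior.
p_positive = 1.0 - NormalDist(mu=1.0, sigma=0.5).cdf(0.0)

print(f"P(0 < theta < 2 | data) = {p_interval:.4f}")
print(f"P(theta > 0 | data)     = {p_positive:.4f}")
```

Both quantities are direct probability statements about the parameter, which is the kind of statement a null-hypothesis test does not provide.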

These features of Bayesian analysis mean that we may make inference on physiological parameters of the haemodynamic response in the complex nonlinear balloon model (Friston, 2002), on spatial noise relationships in multivariate spatial autoregressive models of FMRI data (Woolrich et al., 2004b) or, in this paper, on higher level statistics in the presence of multiple variance components.

One important ingredient in a Bayesian approach is the choice of prior on the variance components and top-level regression parameters. Due to the typically small numbers of observations in neuroimaging above the first level (e.g., small numbers of subjects), this choice of prior is critical. To solve this problem, we introduce to neuroimage modelling the approach of reference priors, which drives the choice of prior such that it is noninformative in an information-theoretic sense. For GLMs where a frequentist solution is available, reference analysis gives the same inference as a frequentist approach. Importantly, reference analysis allows us to perform inference when frequentist solutions are unavailable.
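
The agreement with frequentist inference can be made concrete for a single-level GLM. The following is the standard textbook result for the normal linear model under the prior that is uniform in the regression parameters and in log variance, stated here as an illustration; the symbols N (number of observations), P (number of regressors) and c (contrast vector) are introduced for this sketch.

```latex
% Single-level GLM: Y = X\beta + \epsilon, \epsilon \sim N(0, \sigma^2 I),
% with X of size N x P and full column rank.
p(\beta, \sigma^2) \propto \frac{1}{\sigma^2}
\quad\Longrightarrow\quad
\beta \mid Y \;\sim\; t_{\nu}\!\left(\hat{\beta},\; s^2 (X^{T} X)^{-1}\right),
\qquad \nu = N - P

% where
\hat{\beta} = (X^{T} X)^{-1} X^{T} Y,
\qquad
s^2 = \frac{(Y - X\hat{\beta})^{T} (Y - X\hat{\beta})}{\nu}

% so that, for any contrast c,
\frac{c^{T}\beta - c^{T}\hat{\beta}}{\sqrt{s^2\, c^{T} (X^{T} X)^{-1} c}} \;\Big|\; Y \;\sim\; t_{\nu}
```

The posterior probability that the contrast is positive, P(c^T \beta > 0 | Y), is therefore one minus the one-sided p value of the usual frequentist t test with \nu degrees of freedom.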

Using fully Bayesian reference analysis, we propose two approaches to inferring at the top level; these are a fast approximation to the marginal posterior and a slower approach utilising Markov Chain Monte Carlo (MCMC) followed by a multivariate noncentral t distribution fit to the MCMC chains.

In Friston et al. (2002), the hierarchical model is solved “all in one” using the within-session FMRI time series data as input. However, in neuroimaging, where the human and computational costs involved in data analysis are relatively high, it is desirable to be able to make top-level inferences using the results of separate lower level analyses without the need to reanalyse any of the lower level data; an approach commonly called the summary statistics approach to FMRI analysis (Holmes and Friston, 1998). Within such a summary statistic split-level approach, group parameters of interest can easily be refined as more data become available.

In Holmes and Friston (1998), when inferring at the top level, this summary statistic split-level approach is shown to be equivalent to inferring all in one under certain conditions (e.g., the approach in Holmes and Friston, 1998, requires balanced designs). Beckmann et al. (2003) show that top-level inference using the split-level summary statistics approach can be made equivalent to the all-in-one approach with no restrictions, if we pass up the correct summary statistics (in particular, the covariances from previous levels). Furthermore, Beckmann et al. (2003) demonstrate that by taking into account lower level covariance heterogeneity, a substantial increase in higher level z statistic is possible. However, Beckmann et al. (2003) only show that this is the case when all variance components are known. Independently, in this paper, using the fully Bayesian approach, we show this equivalence when the variance components (excluding autocorrelation) are unknown. The equivalence relies on the assumption that the summary statistics, which correspond to the marginal distributions of the GLM regression parameters, can be represented as multivariate noncentral t distributions. Between the first level (within session) and the second level, this can be shown analytically. For summary statistics at higher levels, this is an assumption which we test empirically using artificial data.
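
The benefit of passing up lower-level variances can be sketched for the simplest case of a group mean with all variance components known, which is the known-variance setting rather than the unknown-variance case treated in this paper. The function name and the numbers are hypothetical.

```python
def weighted_group_mean(estimates, lower_vars, group_var):
    """Generalised least squares estimate of a group mean.

    Each lower-level estimate is weighted by the inverse of its total
    variance (lower-level variance plus between-subject variance).
    Returns the estimate and its variance.
    """
    weights = [1.0 / (v + group_var) for v in lower_vars]
    total = sum(weights)
    gls_mean = sum(w * b for w, b in zip(weights, estimates)) / total
    return gls_mean, 1.0 / total


# Illustrative data: the fourth subject has a very noisy first-level fit.
estimates = [1.0, 1.2, 0.8, 3.0]
lower_vars = [0.1, 0.1, 0.1, 2.0]
group_var = 0.2

gls_mean, gls_var = weighted_group_mean(estimates, lower_vars, group_var)

# Unweighted mean, and its variance under the same model.
ols_mean = sum(estimates) / len(estimates)
ols_var = sum(v + group_var for v in lower_vars) / len(estimates) ** 2

print(f"weighted  : mean {gls_mean:.3f}, variance {gls_var:.4f}")
print(f"unweighted: mean {ols_mean:.3f}, variance {ols_var:.4f}")
```

The weighted estimate downweights the noisy subject and has a smaller variance than the unweighted mean; when all lower-level variances are equal, the two estimates coincide.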

In summary, there are three main contributions presented in this paper. Firstly, we introduce reference analysis to neuroimaging. Secondly, we propose two inference techniques at the top level for multilevel hierarchies (a fast approach and a slower, more accurate approach). Thirdly, we demonstrate that we can infer on the top level of multilevel hierarchies by inferring on the split levels separately and passing summary statistics between them.

We start in the Model section by considering the traditional two-level model. In the Inference section, using the fully Bayesian reference analysis framework, we show how inference on the two-level model can be split into separate inference on the two levels, with the summary statistics of a multivariate noncentral t distribution being passed between the two levels of inference. We then propose two approaches to inferring at the top level. In Higher level models, we discuss how we can extend the split model inference approach to models with more than two levels. In Multiple group variances, we also discuss how we can deal with multiple group variances under certain conditions. In the Artificial data section, we use artificial data to validate the crucial assumption that the marginal distribution of the GLM regression parameters is a multivariate noncentral t distribution at levels higher than the first. Finally, in the FMRI data section, we go on to show results on FMRI data.

Section snippets

Model

To begin with, we consider the familiar two-level univariate GLM for FMRI. An example is the model that, in the first level, deals with individual sessions for individual subjects, relating time series to activation, and, in the second level, deals with a group of subjects or sessions (or both), relating the combined individual activation estimates to some group parameter, such as mean activation level. Note that all models and inference in this paper are mass univariate, that is, each voxel is

Inference

There are no solutions in the frequentist literature to this model when the variance components are unknown. Furthermore, inference is highly sensitive to any assumptions made due to the low number of observations typically available at the subject level in FMRI.

Friston et al. (2002) have proposed an approximate Bayesian solution for the model all in one by assuming that the posterior over the regression parameters is multivariate normal. However, this does not fully incorporate the full

Higher level models

An increasing number of studies have three levels, in particular, a within session level, a session level and a subject level. With multiple sessions for multiple subjects, it becomes possible to model the between-session variance separately from the between-subject variance, and hence one can benefit from the improvements in sensitivity (due to heterogeneity of variance) this produces.
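
To see how the two variance components enter separately, consider a simplified scalar sketch: a balanced design with K subjects, S sessions per subject, and a single effect per session estimated with variance \sigma_w^2. The symbols are introduced here for illustration and are not the notation of the paper.

```latex
% Within-session level: estimate of the effect for session s of subject k
\hat{\beta}_{ks} = \beta_{ks} + \epsilon_{ks}, \qquad \epsilon_{ks} \sim N(0, \sigma_w^2)

% Session level: sessions vary around the subject's mean effect
\beta_{ks} = \beta_k + \eta_{ks}, \qquad \eta_{ks} \sim N(0, \sigma_{ses}^2)

% Subject level: subjects vary around the group mean
\beta_k = \beta_G + \eta_k, \qquad \eta_k \sim N(0, \sigma_{sub}^2)

% Variance of the average of all \hat{\beta}_{ks} as an estimate of \beta_G
\mathrm{Var}\big(\hat{\beta}_G\big)
  = \frac{1}{K}\left(\sigma_{sub}^2 + \frac{\sigma_{ses}^2 + \sigma_w^2}{S}\right)
```

Adding sessions reduces only the second term, whereas adding subjects reduces the whole expression, which is why the between-session and between-subject variances need to be modelled separately.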

In the Two-level section, we showed that we could infer on the full two-level model using just the summary

Multiple group variances

We can use the framework we have described to work with multiple group variances at any level after the first level. An example of when this would be useful is when we might expect different between-subject variances for a patient group and a control group. We can easily deal with such multiple group variances if we limit ourselves to design matrices that are “separable” with respect to the variance groupings.
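
A check for separability might look like the sketch below. It assumes that "separable" means no regressor (design matrix column) has nonzero entries in rows belonging to more than one variance group; this reading is an assumption, since the formal definition is not reproduced in this excerpt, and the function name is hypothetical.

```python
def is_separable(design, groups):
    """Return True if every column of `design` is nonzero in at most one
    variance group.

    design: list of rows, each a list of numbers (one row per subject).
    groups: list of variance-group labels, one per row.
    """
    n_cols = len(design[0])
    for col in range(n_cols):
        groups_used = {g for row, g in zip(design, groups) if row[col] != 0}
        if len(groups_used) > 1:
            return False
    return True


# Three patients (group 1) and three controls (group 2).
groups = [1, 1, 1, 2, 2, 2]

# Separate mean regressors for each group: separable under this definition.
separate_means = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]

# One mean regressor shared across both variance groups: not separable.
common_mean = [[1], [1], [1], [1], [1], [1]]

print(is_separable(separate_means, groups))
print(is_separable(common_mean, groups))
```

Under this reading, a patient-versus-control comparison is still available with the separable design, as a contrast between the two group-mean regressors.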

We define a subdesign matrix as the part of the design matrix belonging to a group

Methods

In the Two-level section, we showed that the two-level model can be inferred upon using the summary statistics of the first-level model inference (Eq. (14)). This means that all-in-one and split-level inferences are equivalent when we infer on the top-level regression parameters. Here we use four different null artificial data sets from the two-level model for 400 voxels to validate the fast approximation and MCMC or BIDET inference we perform on Eq. (14).

Methods

Here, we consider two different FMRI data sets, both of which are simple motor tasks:

  • INDEX: index finger tapping vs. rest.

  • SEQUENTIAL: sequential finger tapping vs. index finger tapping.

Each data set consists of single sessions for eight different subjects. In both data sets, the overall aim is to infer the group means at the top level. For each subject, echo planar images (EPI) were acquired using a 3-T system with TR = 3 s, time to echo (TE) = 30 ms, in-plane resolution 4 mm and slice

Conclusions

We have shown how multilevel hierarchical GLM inference can be split into different levels with the summary statistics of a multivariate noncentral t distribution being passed between the levels. This was achieved by formulating the model in a fully Bayesian framework and using reference analysis to drive our crucial choice of priors (see First level and Two-level). Using this framework, we have proposed two approaches to inferring at the top level. A fast approximation to the marginal

Discussion

When we attempt to infer on mixed effects models, we need to deal with the fact that the variance components are unknown. Classically, variance components tend to be estimated separately using iterative estimation schemes employing ordinary least squares (OLS), expectation maximisation (EM) or restricted maximum likelihood (ReML); see Searle et al. (1992) for details. As an example of a non-Bayesian approach, Worsley (2001) estimates variance components at each split level of the model

Acknowledgements

The authors would like to acknowledge support from the UK MRC, EPSRC and GSK.

References (30)

  • K. Worsley et al. A general statistical analysis for fMRI data. NeuroImage (2002)
  • J. Bernardo et al. Bayesian Theory (2000)
  • R. Brent. Algorithms for Minimization without Derivatives (1973)
  • E. Bullmore et al. Statistical methods of estimation and inference for functional MR image analysis. Magn. Reson. Med. (1996)
  • R. Cox. Probability, frequency and reasonable expectation. Am. J. Phys. (1946)
    1

    These authors contributed equally to this work.