Elsevier

NeuroImage

Volume 26, Issue 1, 15 May 2005, Pages 99-113
Increased sensitivity in neuroimaging analyses using robust regression

https://doi.org/10.1016/j.neuroimage.2005.01.011

Abstract

Robust regression techniques are a class of estimators that are relatively insensitive to the presence of one or more outliers in the data. They are especially well suited to data that require large numbers of statistical tests and may contain outliers due to factors not of experimental interest. Both these issues apply particularly to neuroimaging data analysis. We use simulations to compare several robust techniques against ordinary least squares (OLS) regression, and we apply robust regression to second-level (group “random effects”) analyses in three fMRI datasets. Our results show that robust iteratively reweighted least squares (IRLS) at the second level is a computationally efficient technique that both increases statistical power and decreases false positive rates in the presence of outliers. The benefits of IRLS are apparent with small samples (n = 10) and increase with larger sample sizes (n = 40) in the typical range of group neuroimaging experiments. When no true effects are present, IRLS controls false positive rates at an appropriate level. We show that IRLS can have substantial benefits in analysis of group data and in estimating hemodynamic response shapes from time series data. We provide software to implement IRLS in group neuroimaging analyses.

Introduction

Traditional statistical inference relies on three fundamental assumptions: (1) errors are independent from one another, (2) errors are normally distributed (or have another known distributional form), and (3) error variance is constant across levels of the predicted values. Statisticians emphasize the importance of evaluating these assumptions for each analysis tested, as violations of the assumptions can produce both false positive and false negative results and undermine the interpretability of inferential statistics (e.g., P values).

The field of statistics has developed a number of diagnostic tools to check assumptions, many of them graphical (Luo and Nichols, 2003, Neter et al., 1996). However, applications that require testing of a large number of statistical models pose a problem: it is nearly impossible to check assumptions and make individual decisions about how to address potential violations in each case (but see Luo and Nichols, 2003). Neuroimaging data (e.g., PET, fMRI, SPECT) are a prototypical example of this situation, as separate regression models are typically fit for each of 30,000–100,000 voxels in the brain.

In such cases, outliers in the data can create violations of the normality and equality-of-variance assumptions, and they can have a disproportionately large impact on the statistical solution (see Fig. 1A). This is particularly true for large, artifact-prone datasets such as those typical of neuroimaging experiments (Langenberger and Moser, 1997, Le and Hu, 1996, Ojemann et al., 1997). Outliers are likely to exist in some proportion of the regression analyses (e.g., in some voxels). In most cases, these outliers will decrease power (or, equivalently, increase the false negative rate), but in some cases, they will increase the false positive rate (FPR). This unpredictability in the effects of outliers is particularly problematic because it means that no simple correction (e.g., an adjustment of the alpha level or P values) is available.

Robust regression techniques are a class of statistical tools designed to provide estimates and inferential statistics that are relatively insensitive to the presence of one or more outliers in the data (Huber, 1981, Hubert et al., 2004, Neter et al., 1996). When outlying values are present in the data, violations of distributional assumptions can lead to reduced power and increased false positive rates. Robust techniques can substantially increase power while maintaining an appropriate false positive rate (Huber, 1981, Neter et al., 1996). Robust techniques are particularly useful when a large number of regressions are tested and assumptions cannot be evaluated for each individual regression, such as with neuroimaging data. The techniques we describe here are explicitly designed to deal with outliers, and may complement other techniques, such as data filtering and incorporation of Bayesian priors, designed to increase robustness to artifacts (Ciuciu et al., 2003, Smith et al., 2002, Woolrich et al., 2004).

In neuroimaging, analyses are conducted both individually for each subject and in a group analysis across subjects. Within a subject, a common strategy is to fit a multiple regression model to the time series data at each voxel (Worsley and Friston, 1995, Worsley et al., 1997). In this case, outliers (or other violations of the statistical assumptions) in the time series can substantially influence the fit of the model. Robust regression can minimize the impact of these outliers. In a typical group analysis, individual brains are first warped or anatomical regions are delineated so that voxels correspond to the same brain regions in each subject (Ashburner and Friston, 1999, Toga and Thompson, 2002). Regression parameters (or contrasts composed of a linear combination of parameters, e.g., A–B, task-control) are saved for each subject at each voxel or region of interest, and a test is performed on the parameter values, treating individual subjects' parameters as a random effect. Robust regression can be used at this level as well, minimizing the impact of outlying subjects.

Group analyses, also called “second level” or “random effects” analyses in the neuroimaging literature, can be a simple one sample t test, to test whether activation values differ from zero, or a more complex model involving repeated measures or behavioral predictors. Investigators are typically interested in (1) whether certain brain regions are activated by the task (i.e., whether contrast values differ from zero), and (2) whether behavioral scores (e.g., performance, behavioral depression scale scores, etc.) correlate with regional brain activation. These two tests correspond to tests of the intercept and slope of a simple linear regression model at the second level. Our simulations focus on this case as an illustrative example.
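These two second-level tests can be written as a simple linear regression fit at each voxel. The following sketch (illustrative Python with made-up data, not the authors' released software) fits the intercept (group activation) and slope (brain–behavior correlation) by ordinary least squares and computes t tests on both parameters:

```python
import numpy as np
from scipy import stats

# Hypothetical second-level data at one voxel: one contrast value per
# subject and one behavioral covariate (e.g., a performance score).
rng = np.random.default_rng(0)
n = 10
behavior = rng.standard_normal(n)                       # behavioral scores
contrast = 0.5 + 0.8 * behavior + 0.5 * rng.standard_normal(n)

# Design matrix: intercept (group activation) and slope (brain-behavior).
X = np.column_stack([np.ones(n), behavior])
beta, *_ = np.linalg.lstsq(X, contrast, rcond=None)

# t tests on both parameters, with df = n - 2.
resid = contrast - X @ beta
sigma2 = resid @ resid / (n - 2)
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t = beta / se
p = 2 * stats.t.sf(np.abs(t), df=n - 2)
# t[0]/p[0]: does the group activate? t[1]/p[1]: does activation track behavior?
```

In a whole-brain analysis, this fit is simply repeated at every voxel, which is why per-voxel assumption checking is impractical.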

There are three principal reasons why robust regression techniques may be particularly important for analyzing neuroimaging data. First, as described below, there are good reasons to suspect that artifactual outliers are common in such data. Second, it is often infeasible to check assumptions for each individual regression analysis due to the sheer number of separate regressions performed (though Luo and Nichols, 2003, provide one solution), and thus an efficient robust algorithm that dampens the effects of outliers would be advantageous. Finally, as noted above, robust techniques may increase statistical power (decreasing the false negative rate) and may prevent false positives in the presence of outliers or skew in the data.

Neuroimaging experiments may be more outlier-prone than many other methodologies due to the number and nature of processes that may produce artifacts, which we review briefly below. Because of the large number of comparisons performed in a typical “massively univariate” analysis, outliers are very likely to occur in some comparisons (i.e., somewhere in the brain). Multivariate analyses (Buchel et al., 1999, McKeown et al., 2003) are not immune to outliers either; in fact, they can be more strongly influenced by outliers than univariate approaches. As the multivariate space becomes more sparsely sampled (i.e., as the ratio of variables to samples grows), extreme values at some time points can have very large leverage, and thus a large influence on the overall solution.

Fig. 1A shows an example of a problematic dataset with n = 10 and no true effect. The data (black dots) were drawn from a null-hypothesis (Ho) distribution, shown by the shaded circle, with no correlation between the predictor (x axis) and the data (y axis). Noise from a larger-variance distribution, shown by the dashed circle, was added to one data point, marked with a square. The regression line shows a statistically significant false positive effect, caused primarily by the highly influential outlier point.

The decision to drop or downweight outliers is an important one, and the best answer depends on the nature of the data. A central issue is whether outliers are likely to arise from some process that the researcher might be interested in modeling (e.g., higher order interaction terms) or from a process that is of little theoretical interest (e.g., data collection artifacts). If the former, outliers should not be dropped; rather, the model should be adjusted to account for them. For example, skewed, non-normal distributions can be modeled using maximum likelihood procedures, or additional predictors (e.g., interaction or polynomial terms) can be added to an OLS model. On the other hand, if outliers are likely to arise from processes that the researcher is uninterested in modeling, their influence should be dampened or eliminated.
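As a concrete instance of the model-adjustment option, if apparently outlying points actually reflect curvature in the true relationship, augmenting the design matrix with a polynomial term captures them within an ordinary OLS fit. A minimal sketch with made-up data (not from the paper):

```python
import numpy as np

# Data with genuine curvature: extreme points are not artifacts but
# reflect a quadratic trend that a linear model would treat as outliers.
rng = np.random.default_rng(1)
x = np.linspace(-2, 2, 30)
y = 1.0 + 0.5 * x + 0.8 * x**2 + 0.1 * rng.standard_normal(30)

X_linear = np.column_stack([np.ones_like(x), x])          # misspecified
X_quad = np.column_stack([np.ones_like(x), x, x**2])      # augmented model

b_lin, res_lin, *_ = np.linalg.lstsq(X_linear, y, rcond=None)
b_quad, res_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)
# The residual sum of squares drops sharply once the quadratic term is
# included, and no observation needs to be down-weighted or removed.
```

Down-weighting would be the wrong remedy here, since the "outliers" carry the signal of interest; robust methods are aimed at the opposite case, where outliers are artifactual.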

Various kinds of acquisition artifacts are present in fMRI BOLD data, some of which are slice or region specific, and others of which are global. Changes in gradients may produce spikes at particular time points or across a range of time points. Local changes in magnetic field inhomogeneity produce artifacts that are specific in both space and time. The presence of such artifacts can influence an individual subject's regression parameter estimates (betas) dramatically, and thus create outliers in group analyses (i.e., random effects analyses across individual participants).

Even small movements of the head may produce large artifacts in fMRI signals. These artifacts are local in time and space: they are greatest at the edges of the brain and around fluid spaces, because magnetic field homogeneity is most sensitive to perturbations in these regions and because voxels shift in and out of fluid spaces with head motion (Hutton et al., 2002, Ward et al., 2002, Wu et al., 1997). In addition, head movements induce magnetic susceptibility changes that cannot be captured by realignment algorithms or by including movement parameters in linear statistical models (Wu et al., 1997).

Heartbeat and breathing both induce pulsatile motion in the brain, which creates artifacts in the time series directly, by moving brain tissue with respect to the sampling grid, and indirectly, by inducing magnetic susceptibility artifacts (Frank et al., 1993, Frank et al., 2001, Kruger and Glover, 2001). Troublingly, these artifacts are often correlated to some degree with the task design: beyond any spurious correlation expected by chance, many cognitive and emotional states themselves produce changes in respiration. Task-correlated physiological artifact can create outliers in individual subjects' regression parameter estimates, and the magnitude of these effects varies widely across participants, exacerbating the problem in individual-differences analyses at the group level. If a systematic physiological noise-induced bias affects one participant, that participant is likely to be an outlier in the group, and robust regression could prove beneficial at the group level.

Group analyses are often performed by warping or normalizing each participant's brain to a reference template, and thereafter assuming that each voxel covers the same anatomical brain tissue and functional brain region for each participant. If this process fails for a particular brain region within even one subject—or functional localization is different for that subject—the parameter estimates for that subject can become outliers in group analysis.

A final category, outliers in behavioral covariates, is very important because behavioral (X) outliers exert high leverage on the parameter estimates relating subjects' activation contrast scores to behavior, and they do so throughout the whole brain. This kind of outlier may be caused by error or inaccuracy in behavioral measurement, or may occur because a participant is drawn from a different population than the other participants.

A successful application of robust regression to neuroimaging should demonstrate that (1) the technique is more sensitive in brain regions that are known to show true positive responses, (2) the technique improves the reliability of estimates, and (3) FPRs are reduced in regions known not to show true responses.

We address each of these in a simulation comparing several robust techniques with OLS, using parameters and sample sizes similar to those encountered in imaging studies. We then apply the first two of these criteria in three experiments with real fMRI data. In each experiment, we have a priori expectations for regions that should be active, and we perform brain-wise and region-of-interest (ROI) analyses comparing robust IRLS and ordinary least squares (OLS) in those regions.

Experiment 1 is a cognitive task that requires both left- and right-handed responses in a single-trial event-related fMRI design. We compare sensitivity of OLS and IRLS to contralateral and ipsilateral primary motor (M1) responses in a group of subjects (“random-effects” analysis). Experiment 2 compares OLS and IRLS random-effects analyses of brain-wise responses to anticipation of pain. Experiment 3 employs a visual-motor paradigm with long inter-trial intervals (ITIs, 30 s). In this experiment, we explore the effects of using IRLS at the individual subject level.

Linear modeling framework

Before turning to robust regression techniques, we briefly review the general linear model (GLM). The GLM finds the combination of predictors, each scaled by some value (βi), that best fits the data. In algebraic terms, the GLM projects the data (y) in an n-dimensional space (n independent data observations) onto a k-dimensional model subspace (k predictors). This framework is described by the equation

y = Xβ + ɛ

where X is the n × k model matrix whose columns contain values for the predictors, β is a k × 1 vector of model parameters, and ɛ is an n × 1 vector of errors.
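As a concrete instance of this framework, the OLS solution β̂ = (XᵀX)⁻¹Xᵀy projects the data onto the column space of X. A minimal sketch with made-up numbers (noiseless, so the fit recovers the generating parameters exactly):

```python
import numpy as np

# Toy GLM: n = 8 observations, k = 2 predictors (intercept plus one regressor).
n = 8
x = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), x])    # n x k model matrix
beta_true = np.array([1.0, 0.5])
y = X @ beta_true                        # noiseless data for illustration

# OLS solution of y = X beta + eps via the normal equations:
# solve (X'X) beta_hat = X'y, i.e. beta_hat = (X'X)^-1 X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
# beta_hat recovers [1.0, 0.5]
```

Every estimator compared in this paper modifies this least-squares step in some way; robust methods replace the uniform weighting of observations that the normal equations imply.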

Simulation: comparing regression methods

We compared FPR and experimental power for five regression techniques: OLS, IRLS with bisquare and Huber weighting functions, univariate outlier removal (Univ), and multivariate outlier removal (Mahal). Fig. 2 shows the results of simulations for all n (5, 10, 15, 25, 40) at q = 0.1, m = 3, and t = 3 for the intercept term (i.e., detecting activations in a random effects analysis), where q is the proportion of outliers, m is the standard deviation of the outlier noise distribution, and t is the
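The IRLS procedure with a bisquare weighting function can be sketched as follows. This is an illustrative Python implementation under standard assumptions (Tukey bisquare weights with the conventional tuning constant 4.685, scale estimated by the median absolute deviation), not the authors' released software:

```python
import numpy as np

def irls_bisquare(X, y, tune=4.685, n_iter=30):
    """Iteratively reweighted least squares with Tukey bisquare weights.
    tune = 4.685 gives ~95% efficiency when errors are truly Gaussian."""
    w = np.ones(len(y))
    for _ in range(n_iter):
        Xw = X * w[:, None]                          # weight each observation
        beta = np.linalg.solve(Xw.T @ X, Xw.T @ y)   # weighted LS step
        resid = y - X @ beta
        # Robust scale estimate: median absolute deviation / 0.6745.
        s = np.median(np.abs(resid - np.median(resid))) / 0.6745
        u = resid / (tune * max(s, 1e-12))
        # Bisquare: smooth down-weighting, zero weight beyond the cutoff.
        w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
    return beta, w

# Ten "subjects" with no true slope, plus one gross outlier (cf. Fig. 1A).
rng = np.random.default_rng(2)
x = np.linspace(-1, 1, 10)
y = 0.3 * rng.standard_normal(10)
y[-1] += 6.0                                         # outlier at high-leverage x
X = np.column_stack([np.ones(10), x])

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
beta_rob, w = irls_bisquare(X, y)
# The outlier should receive near-zero weight, so the robust slope
# should stay closer to zero than the OLS slope it would otherwise inflate.
```

The per-voxel cost is a handful of weighted least-squares solves, which is why IRLS remains practical for brain-wise analyses.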

Discussion

The results from both simulations and experimental data demonstrate that robust estimation methods can offer substantial benefits in neuroimaging analyses. Robust techniques are well suited for cases in which artifactual outliers may exist in the data, and automated robust techniques, such as IRLS, offer substantial improvements when each regression analysis cannot be individually checked for violations of assumptions. Both situations are true of neuroimaging data. Outliers in time series data

Acknowledgments

We would like to thank Martin Lindquist for his helpful advice. This research was supported by grant MH60655 to the University of Michigan (John Jonides, P.I.). Software is available from: http://www.columbia.edu/cu/psychology/tor/.

References (47)

  • J. Ashburner et al., Nonlinear spatial normalization using basis functions, Hum. Brain Mapp. (1999)
  • L. Becerra et al., Reward circuitry activation by noxious thermal stimuli, Neuron (2001)
  • C. Buchel et al., The predictive value of changes in effective connectivity for human learning, Science (1999)
  • P. Ciuciu et al., Unsupervised robust nonparametric estimation of the hemodynamic response function for any fMRI experiment, IEEE Trans. Med. Imag. (2003)
  • K.D. Davis et al., Event-related fMRI of pain: entering a new era in imaging pain, NeuroReport (1998)
  • W.H. DuMouchel et al., Integrating a robust option into a multiple regression computing environment
  • L.R. Frank et al., Pulsatile flow artifacts in 3D magnetic resonance imaging, Magn. Reson. Med. (1993)
  • L.R. Frank et al., Estimation of respiration-induced noise fluctuations from undersampled multislice fMRI data, Magn. Reson. Med. (2001)
  • K.J. Friston et al., Characterizing evoked hemodynamics with fMRI, NeuroImage (1995)
  • D.A. Gusnard et al., Searching for a baseline: functional imaging and the resting human brain, Nat. Rev., Neurosci. (2001)
  • S. Hayasaka et al., Nonstationary cluster-size inference with random field and permutation methods, NeuroImage (2004)
  • D.C. Hoaglin et al., Exploring Data Tables, Trends, and Shapes (1985)
  • P.J. Huber, Robust Statistics (1981)
  • M. Hubert, Multivariate outlier detection and robust covariance matrix estimation—discussion, Technometrics (2001)
  • M. Hubert et al., Robust methods for partial least squares regression, J. Chemom. (2003)
  • M. Hubert et al., A robust PCR method for high-dimensional regressors, J. Chemom. (2003)
  • M. Hubert et al., A fast method for robust principal components with applications to chemometrics, Chemom. Intell. Lab. Syst. (2002)
  • M. Hubert et al., Robustness
  • C. Hutton et al., Image distortion correction in fMRI: a quantitative evaluation, NeuroImage (2002)
  • J. Jensen et al., Direct activation of the ventral striatum in anticipation of aversive stimuli, Neuron (2003)
  • P. Kochunov et al., An optimized individual target brain in the Talairach coordinate system, NeuroImage (2002)
  • G. Kruger et al., Physiological noise in oxygenation-sensitive magnetic resonance imaging, Magn. Reson. Med. (2001)
  • K.W. Langenberger et al., Nonlinear motion artifact reduction in event-triggered gradient-echo fMRI, Magn. Reson. Imag. (1997)