An automatic pre-processing pipeline for EEG analysis (APP) based on robust statistics

https://doi.org/10.1016/j.clinph.2018.04.600

Highlights

  • A novel automatic pre-processing pipeline for both resting state and evoked EEG data is proposed.

  • The proposed automatic pipeline is tested in both clinical and healthy populations.

  • The proposed automatic pipeline is as reliable as pre-processing by EEG experts.

Abstract

Objective

With the advent of high-density EEG and studies of large numbers of participants, yielding increasingly greater amounts of data, supervised methods for artifact rejection have become excessively time consuming. Here, we propose a novel automatic pipeline (APP) for pre-processing and artifact rejection of EEG data, which innovates relative to existing methods by not only following state-of-the-art guidelines but also further employing robust statistics.

Methods

APP was tested on event-related potential (ERP) data from healthy participants and schizophrenia patients, and resting-state (RS) data from healthy participants. Its performance was compared with that of existing automatic methods (FASTER for ERP data, TAPEEG and Prep pipeline for RS data) and supervised pre-processing by experts.

Results

APP rejected fewer bad channels and bad epochs than the other methods. In the ERP study, it produced significantly higher amplitudes than FASTER, which were consistent with the supervised scheme. In the RS study, it produced spectral measures that correlated well with the automatic alternatives and the supervised scheme.

Conclusion

APP effectively removed EEG artifacts, performing similarly to the supervised scheme and outperforming existing automatic alternatives.

Significance

The proposed automatic pipeline provides a reliable and efficient tool for pre-processing large datasets of both evoked and resting-state EEG.

Introduction

The electroencephalogram (EEG) is a non-invasive tool for the investigation of human brain function, which has been in continuous use for almost a century (Niedermeyer and Lopes da Silva, 2005). However, EEG data are typically contaminated with a number of artifacts. Artifacts are undesired signals that may affect the measurement and change the EEG signal of interest. These artifacts may arise from non-physiological noise sources that originate outside the participant, such as the grounding of the electrodes causing power line noise at 50/60 Hz and its harmonics, interference from other electrical devices, or imperfections in electrode settling. Artifacts may also arise from physiological noise sources originating within the participant, such as those produced by head, eye, or muscle movements (Urigüen and Garcia-Zapirain, 2015). Head movements may result in spikes and discontinuities due to a rapid change of impedance at one or several electrodes. Reflexive eye movements occur frequently and are normally picked up by the frontal electrodes in the frequency range of 1–3 Hz (within the delta wave range). Blinking also contaminates the EEG signal, usually causing a more abrupt change in its amplitude than eye movements. Finally, every movement of the participant generates muscular artifacts that can be found everywhere on the scalp at frequencies higher than 20 Hz (within the beta and gamma wave ranges).

One simple way to deal with these artifacts is to remove segments of the data that exceed a certain level of artifact contamination, for example, signal amplitudes beyond ±100 µV. However, this coarse approach may lead to the loss of a great amount of data that could still contain artifact-free information, potentially compromising the subsequent analysis and interpretation of the data. This is true for both event-related potentials (ERP) and resting-state (RS) signal fluctuations. Moreover, since participant-generated artifacts may overlap with the signal of interest in the spectral domain and on many EEG channels, simple spatial and frequency-band filtering approaches may be ineffective at removing this kind of artifact (Tatum et al., 2011). Another method that is commonly used to clean up EEG data is independent component analysis (ICA; Makeig et al., 1996). If the neuronal signals and noise recorded on the scalp are assumed to be independent of each other, the EEG signal can be described as their linear summation. ICA is used to decompose the EEG data into statistically independent components (ICs), so as to separate the neuronal and noise contributions to the signal. The artifactual ICs can then be identified and subtracted from the EEG data, yielding an artifact-free signal.
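
To make the cost of amplitude thresholding concrete, the following is a minimal sketch of epoch rejection at ±100 µV. The function name `reject_epochs`, the nested-list data layout, and the toy values are assumptions made for illustration; they are not taken from the paper or from any EEG toolbox.

```python
def reject_epochs(epochs, threshold_uv=100.0):
    """Split epoch indices into kept and rejected by peak absolute amplitude.

    `epochs` is a list of epochs, each epoch is a list of channels, and each
    channel is a list of samples in microvolts (hypothetical layout).
    """
    kept, rejected = [], []
    for index, epoch in enumerate(epochs):
        # Largest absolute amplitude over all channels and samples of the epoch.
        peak = max(abs(sample) for channel in epoch for sample in channel)
        if peak > threshold_uv:
            rejected.append(index)
        else:
            kept.append(index)
    return kept, rejected


# Three toy epochs with two channels each (values in microvolts).
epochs = [
    [[5.0, -12.0, 8.0], [3.0, 4.0, -6.0]],        # clean
    [[7.0, 150.0, 9.0], [2.0, -3.0, 1.0]],        # single spike on channel 0
    [[-20.0, 15.0, 10.0], [30.0, -45.0, 12.0]],   # clean
]

kept, rejected = reject_epochs(epochs)
print("kept:", kept, "rejected:", rejected)
```

In this toy case the second epoch is discarded in full because of one sample on one channel, even though the other channel is clean, which is the kind of data loss described above.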

Usually, pre-processing of EEG data, including the classification of artifactual ICs, is performed under expert supervision. However, with the advent of both high-density EEG arrays (64–256 channels) and studies of large populations, yielding increasingly greater amounts of data, supervised methods have become excessively time consuming. To cope with this, and to minimize subjectivity, automatic methods have recently been presented (Abreu et al., 2016a; Abreu et al., 2016b; Bigdely-Shamlo et al., 2015; Hatz et al., 2015; Nolan et al., 2010). Fully automated statistical thresholding for EEG artifact rejection (FASTER; Nolan et al., 2010), for instance, enables fully automated pre-processing of ERP data, based on computing z-scores of different signal metrics and thresholding them in order to detect bad channels, bad epochs and artifactual ICs. Tool for automated processing of EEG data (TAPEEG; Hatz et al., 2015) uses a similar approach for the automatic pre-processing of RS EEG data. However, because they are based on z-scores, these approaches are not robust to outliers and, as a consequence, they tend to have high rejection rates of artifact-free signal. A more promising approach is to use robust statistics instead. For example, the Prep pipeline (Bigdely-Shamlo et al., 2015) provides an automatic pre-processing pipeline including filtering and bad channel identification using the RANSAC (random sample consensus) algorithm. However, in this case the identified bad channels are assumed to be globally bad. Thus, if a channel contains only some artifactual periods, these are neglected and left in the pre-processed EEG data. Moreover, supervised inspection of the pre-processed data for bad epochs is necessary, since the Prep pipeline does not provide this feature.
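
The sensitivity of z-scores to outliers can be illustrated with a small sketch that compares a classical z-score (mean and standard deviation) with a robust variant based on the median and the median absolute deviation (MAD). The MAD-based score is a common textbook choice and is used here as an assumption; this excerpt does not state which robust estimators APP, FASTER, or TAPEEG actually use, and the function names, the threshold of 3, and the toy values are all hypothetical.

```python
import statistics


def z_scores(values):
    """Classical z-scores: centre on the mean, scale by the sample SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def robust_z_scores(values):
    """Robust z-scores: centre on the median, scale by 1.4826 * MAD.

    The factor 1.4826 makes the MAD comparable to the SD for Gaussian data.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]


def flag(scores, threshold=3.0):
    """Indices whose absolute score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# Hypothetical per-channel metric (e.g. variance); the last two are extreme.
metric = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 25.0, 30.0]

classical_flags = flag(z_scores(metric))
robust_flags = flag(robust_z_scores(metric))
print("classical:", classical_flags, "robust:", robust_flags)
```

With these toy values the two extreme entries inflate the mean and standard deviation so much that the classical score flags nothing, whereas the median and MAD are barely affected and the robust score flags both. The sketch does not reproduce the over-rejection reported for z-score methods above; it only shows how strongly a few extreme values can distort non-robust estimates.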

Here, we present APP, a novel Matlab®-based fully automatic pipeline for pre-processing and artifact rejection of EEG data (including both ERP and RS data), which is based on state-of-the-art guidelines for EEG pre-processing, ICA decomposition, and robust statistics. APP consists of: (1) high-pass filtering; (2) power line noise removal; (3) re-referencing to a robust estimate of the mean of all channels; (4) removal and interpolation of bad channels; (5) removal of bad epochs; (6) ICA to remove eye-movement, muscular and bad-channel-related artifacts; and (7) removal of epoch artifacts. At each step of the pipeline, a number of relevant parameters are estimated from the data and outliers are detected based on a robust data-driven outlier detection scheme.
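
Step (3) can be sketched as follows. The example re-references toy data once to the ordinary channel mean and once to the channel median, which stands in here for "a robust estimate of the mean". The choice of the median is an assumption made purely for illustration, since this excerpt does not specify the estimator APP uses; the function name `rereference` and the data are likewise hypothetical, and Python is used instead of Matlab only for the sake of a self-contained example.

```python
import statistics


def rereference(data, estimator):
    """Subtract a per-sample reference, computed across channels, from each channel.

    `data` is a list of channels, each a list of samples (hypothetical layout).
    `estimator` maps the list of channel values at one time sample to a number.
    """
    n_samples = len(data[0])
    reference = [
        estimator([channel[t] for channel in data]) for t in range(n_samples)
    ]
    return [
        [channel[t] - reference[t] for t in range(n_samples)] for channel in data
    ]


# Four well-behaved channels and one bad channel with very large amplitudes.
data = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [4.0, 5.0, 6.0],
    [500.0, -400.0, 300.0],  # bad channel
]

avg_ref = rereference(data, statistics.fmean)    # ordinary average reference
med_ref = rereference(data, statistics.median)   # assumed robust stand-in

print("average reference, channel 0:", avg_ref[0])
print("median reference, channel 0: ", med_ref[0])
```

With the ordinary average, the bad channel leaks into every other channel through the reference, whereas the median-based reference leaves the good channels at a few microvolts. This is one plausible motivation for performing robust re-referencing before bad channels are identified in step (4).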

APP was tested on ERP data from 61 healthy participants and 44 schizophrenia patients performing a visual discrimination task, and on RS data from 68 healthy participants. The inclusion of patient data in the validation of APP is of particular interest since one of the primary applications of EEG is the study of clinical populations. Furthermore, many of these populations, schizophrenia patients in particular, are known to produce more artifacts than healthy volunteers, which poses a challenge for automatic pre-processing. We compared APP with three state-of-the-art automatic artifact removal methods, FASTER, TAPEEG, and Prep pipeline, which have been shown to be effective at removing a wide range of EEG artifacts. We also compared APP with supervised artifact removal by experts using the CARTOOL software (Brunet et al., 2011).

Methods

The proposed pre-processing and artifact removal method APP is first described, including a detailed description of each step. Then, the artifact removal methods FASTER, TAPEEG, and Prep pipeline, as well as the supervised artifact removal by experts, against which APP is compared, are described. Finally, the data acquisition and analysis methods used to validate the proposed method are presented.

Results

The results obtained by applying the proposed data pre-processing and artifact removal pipeline APP, as well as its alternative pipelines, are presented here, first for the ERP data and then for the RS data.

Discussion and Conclusion

EEG data are usually contaminated by numerous artifacts and require expert supervision for artifact identification and removal. However, with the increasing size of available datasets due to increasing numbers of EEG channels and study participants, supervised data pre-processing becomes impractical, paving the way for automatic pre-processing methods.

In this study, we propose a novel automatic pipeline (APP) for EEG pre-processing and artifact detection and removal, which makes use of

Conflict of interest statement

None of the authors have declared any conflict of interest.

Acknowledgments

This work was partially funded by the Fundação para a Ciência e a Tecnologia under grants FCT UID/EEA/50009/2013 and FCT PD/BD/105785/2014, and the National Centre of Competence in Research (NCCR) Synapsy (The Synaptic Basis of Mental Diseases) under grant 51NF40-158776.

References (37)

  • J. Onton et al. Imaging human EEG dynamics using independent component analysis. Neurosci Biobehav Rev (2006)

  • F. Perrin et al. Spherical splines for scalp potential and current density mapping. Electroencephalogr Clin Neurophysiol (1989)

  • S. Romero et al. A comparative study of automatic techniques for ocular artifact reduction in spontaneous EEG signals based on clinical target variables: A simulation case. Comput Biol Med (2008)

  • A.C. Tang et al. Validation of SOBI components from high-density EEG. NeuroImage (2005)

  • M. Bach. The Freiburg Visual Acuity test–automatic measurement of visual acuity. Optom Vis Sci (1996)

  • A. Belouchrani et al. A blind source separation technique using second-order statistics. IEEE Trans Signal Process (1997)

  • N. Bigdely-Shamlo et al. The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Front Neuroinform (2015)

  • D. Brunet et al. Spatiotemporal Analysis of Multichannel EEG: CARTOOL. Comput Intell Neurosci (2011)