An automatic pre-processing pipeline for EEG analysis (APP) based on robust statistics
Introduction
The electroencephalogram (EEG) is a non-invasive tool for the investigation of human brain function, which has been in continuous use for almost a century (Niedermeyer and Lopes da Silva, 2005). However, EEG data are typically contaminated with a number of artifacts. Artifacts are undesired signals that may affect the measurement and change the EEG signal of interest. These artifacts may arise from non-physiological noise sources that originate outside the participant, such as the grounding of the electrodes causing power line noise at 50/60 Hz and at its harmonics, interference from other electrical devices, or imperfections in electrode settling. Artifacts may also arise from physiological noise sources originating within the participant, such as those produced by head, eye, or muscle movements (Urigüen and Garcia-Zapirain, 2015). Head movements may result in spikes and discontinuities due to a rapid change of impedance at one or several electrodes. Reflexive eye movements occur frequently and are normally picked up by the frontal electrodes in the frequency range of 1–3 Hz (within the delta wave range). Blinking also contaminates the EEG signal, usually causing a more abrupt change in its amplitude than eye movements. Finally, every movement of the participant generates muscular artifacts that can be found everywhere on the scalp at frequencies higher than 20 Hz (within the beta and gamma wave ranges).
One simple way to deal with these artifacts is to remove segments of the data that exceed a certain level of artifact contamination, for example, signal amplitudes greater than ±100 µV. However, this coarse approach may lead to the loss of a great amount of data that could still contain artifact-free information, therefore potentially compromising the subsequent analysis and interpretation of the data. This is true for both event-related potentials (ERP) and resting-state (RS) signal fluctuations. Moreover, since participant-generated artifacts may overlap with the signal of interest in the spectral domain and on many EEG channels, simple spatial and frequency band filtering approaches may be ineffective at removing this kind of artifact (Tatum et al., 2011). Another method that is commonly used to clean up EEG data is independent component analysis (ICA; Makeig et al., 1996). Assuming that the neuronal signals and the noise recorded on the scalp are independent of each other, the EEG signal can be described by their linear summation. ICA is used to decompose the EEG data into statistically independent components (ICs), so as to separate the neuronal and noise contributions to the signal. The artifactual ICs can then be identified and subsequently subtracted from the EEG data, yielding an artifact-free signal.
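The amplitude criterion mentioned at the start of the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reject_epochs_by_amplitude`, the nested-list data layout, and the toy values are hypothetical, and the ±100 µV limit is the example threshold given in the text.

```python
def reject_epochs_by_amplitude(epochs, threshold_uv=100.0):
    """Split epoch indices into kept and rejected by peak absolute amplitude.

    epochs: list of epochs; each epoch is a list of channels; each channel
    is a list of samples in microvolts (hypothetical layout for this sketch).
    """
    kept, rejected = [], []
    for idx, epoch in enumerate(epochs):
        # Largest absolute sample over all channels of this epoch.
        peak = max(abs(sample) for channel in epoch for sample in channel)
        if peak > threshold_uv:
            rejected.append(idx)
        else:
            kept.append(idx)
    return kept, rejected


# Toy data (made-up values): 3 epochs, 2 channels, 4 samples each.
toy_epochs = [
    [[5.0, -12.0, 8.0, 3.0], [1.0, 4.0, -6.0, 2.0]],
    [[7.0, 150.0, 9.0, -4.0], [2.0, 3.0, -1.0, 0.5]],  # one large sample
    [[-3.0, 6.0, -8.0, 10.0], [4.0, -2.0, 1.0, -99.0]],
]
kept, rejected = reject_epochs_by_amplitude(toy_epochs)
print("kept:", kept, "rejected:", rejected)
```

In the toy input a single sample on one channel is enough to discard the whole second epoch, including its otherwise clean second channel, which is the kind of data loss the paragraph describes.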
Usually, pre-processing of EEG data, including the classification of artifactual ICs, is performed under expert supervision. However, with the advent of both high-density EEG arrays (64–256 channels) and studies of large populations, yielding increasingly greater amounts of data, supervised methods have become excessively time-consuming. To cope with this, and to minimize subjectivity, automatic methods have recently been presented (Abreu et al., 2016a; Abreu et al., 2016b; Bigdely-Shamlo et al., 2015; Hatz et al., 2015; Nolan et al., 2010). Fully automated statistical thresholding for EEG artifact rejection (FASTER; Nolan et al., 2010), for instance, enables fully automated pre-processing of ERP data, based on computing z-scores of different signal metrics and thresholding them in order to detect bad channels, bad epochs, and artifactual ICs. The tool for automated processing of EEG data (TAPEEG; Hatz et al., 2015) uses a similar approach for the automatic pre-processing of RS EEG data. However, because they are based on z-scores, these approaches are not robust to outliers and, as a consequence, they tend to have high rejection rates of artifact-free signal. A more promising approach is to use robust statistics instead. For example, the Prep pipeline (Bigdely-Shamlo et al., 2015) provides an automatic pre-processing pipeline including filtering and bad-channel identification using the RANSAC (random sample consensus) algorithm. However, in this case the identified bad channels are assumed to be globally bad. Thus, if a channel that is not flagged as globally bad contains artifactual periods, these are neglected and left in the pre-processed EEG data. Moreover, supervised inspection of pre-processed data for bad epochs is necessary since the Prep pipeline does not provide this feature.
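To make the point about robustness concrete, the sketch below contrasts classical z-scores with a median/MAD-based score on made-up per-channel values. It is not the scoring used by FASTER, TAPEEG, or APP: the median/MAD rule, the cut-off of 3, the function names, and the data are all assumptions chosen for illustration. It shows one way classical statistics can be distorted, namely that the mean and standard deviation are themselves pulled by the outliers they are supposed to detect.

```python
import statistics


def zscores(values):
    """Classical z-scores: centre on the mean, scale by the standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def robust_zscores(values):
    """Robust scores: centre on the median, scale by the MAD.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [(v - med) / (1.4826 * mad) for v in values]


def flag(scores, cutoff=3.0):
    """Indices whose absolute score exceeds the cut-off."""
    return [i for i, s in enumerate(scores) if abs(s) > cutoff]


# Hypothetical per-channel metric (e.g. variance); the last two are artifactual.
channel_metric = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 60.0, 500.0]

classical_flags = flag(zscores(channel_metric))
robust_flags = flag(robust_zscores(channel_metric))
print("classical:", classical_flags, "robust:", robust_flags)
```

With these values the extreme channel inflates the mean and standard deviation so much that the classical rule flags nothing, while the median/MAD rule flags both artifactual channels (indices 8 and 9).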
Here, we present APP, a novel Matlab®-based, fully automatic pipeline for pre-processing and artifact rejection of EEG data (including both ERP and RS data), which is based on state-of-the-art guidelines for EEG pre-processing, ICA decomposition, and robust statistics. APP consists of: (1) high-pass filtering; (2) power line noise removal; (3) re-referencing to a robust estimate of the mean of all channels; (4) removal and interpolation of bad channels; (5) removal of bad epochs; (6) ICA to remove eye-movement, muscular, and bad-channel-related artifacts; and (7) removal of epoch artifacts. At each step of the pipeline, a number of relevant parameters are estimated from the data and outliers are detected based on a robust data-driven outlier detection scheme.
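Step (3) can be illustrated with a small sketch. The text above does not say which robust estimator APP uses, so the per-sample median across channels is assumed here purely as a stand-in for "a robust estimate of the mean"; the function name `robust_rereference`, the data layout, and the toy values are hypothetical.

```python
import statistics


def robust_rereference(data):
    """Subtract a robust reference from every channel.

    data: list of channels, each a list of samples (same length).
    The reference at each time point is the median across channels,
    an assumed stand-in for a robust estimate of the channel mean.
    """
    n_samples = len(data[0])
    reference = [
        statistics.median([channel[t] for channel in data])
        for t in range(n_samples)
    ]
    rereferenced = [
        [channel[t] - reference[t] for t in range(n_samples)]
        for channel in data
    ]
    return rereferenced, reference


# Toy recording (made-up values): 4 plausible channels and 1 bad channel.
toy_data = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [4.0, 5.0, 6.0],
    [900.0, -900.0, 900.0],  # bad channel with extreme amplitudes
]
rereferenced, reference = robust_rereference(toy_data)
mean_reference = [statistics.fmean(col) for col in zip(*toy_data)]
print("median reference:", reference)
print("mean reference:  ", mean_reference)
```

In this toy case the median reference stays within the range of the four plausible channels, whereas the plain mean is dominated by the single bad channel, which is the motivation for using a robust reference before bad channels have been removed.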
APP was tested on ERP data from 61 healthy participants and 44 schizophrenia patients performing a visual discrimination task, and on RS data from 68 healthy participants. The inclusion of patient data in the validation of APP is of particular interest since one of the primary applications of EEG is the study of clinical populations. Furthermore, many of these populations, schizophrenia patients in particular, are known to produce more artifacts than healthy volunteers, which is a challenge for automatic pre-processing. We compared APP to three state-of-the-art automatic artifact removal methods, FASTER, TAPEEG, and the Prep pipeline, which have been shown to be effective at removing a wide range of EEG artifacts. We also compared APP with supervised artifact removal by experts using the CARTOOL software (Brunet et al., 2011).
Methods
The proposed pre-processing and artifact removal method APP is first described, including a detailed description of each step. Then, the artifact removal methods FASTER, TAPEEG, and Prep pipeline, as well as the supervised artifact removal by experts, against which APP is compared, are described. Finally, the data acquisition and analysis methods used to validate the proposed method are presented.
Results
The results obtained by applying the proposed data pre-processing and artifact removal pipeline APP, as well as its alternative pipelines, are presented here, first for the ERP data and then for the RS data.
Discussion and Conclusion
EEG data are usually contaminated by numerous artifacts and require expert supervision for artifact identification and removal. However, with the increasing size of available datasets due to increasing numbers of EEG channels and study participants, supervised data pre-processing becomes impractical, paving the way for automatic pre-processing methods.
In this study, we propose a novel automatic pipeline (APP) for EEG pre-processing and artifact detection and removal, which makes use of
Conflict of interest statement
None of the authors have declared any conflict of interest.
Acknowledgments
This work was partially funded by the Fundação para a Ciência e a Tecnologia under grants FCT UID/EEA/50009/2013 and FCT PD/BD/105785/2014, and the National Centre of Competence in Research (NCCR) Synapsy (The Synaptic Basis of Mental Diseases) under grant 51NF40-158776.
References (37)
- et al. Ballistocardiogram artifact correction taking into account physiological signal preservation in simultaneous EEG-fMRI. NeuroImage (2016)
- et al. Objective selection of epilepsy-related independent components from EEG data. J Neurosci Methods (2016)
- et al. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods (2004)
- et al. Estimation of interpolation errors in scalp topographic mapping. Electroencephalogr Clin Neurophysiol (1996)
- et al. Reliability of fully automated versus visually controlled pre- and post-processing of resting-state EEG. Clin Neurophysiol (2015)
- et al. An adjusted boxplot for skewed distributions. Comput Stat Data Anal (2008)
- et al. Independent component analysis: algorithms and applications. Neural Netw (2000)
- et al. Reference-free identification of components of checkerboard-evoked multichannel potential fields. Electroencephalogr Clin Neurophysiol (1980)
- et al. FASTER: Fully Automated Statistical Thresholding for EEG artifact Rejection. J Neurosci Methods (2010)
- et al. EEG coherency: I: statistics, reference electrode, volume conduction, Laplacians, cortical imaging, and interpretation at multiple scales. Electroencephalogr Clin Neurophysiol (1997)
- Imaging human EEG dynamics using independent component analysis. Neurosci Biobehav Rev
- Spherical splines for scalp potential and current density mapping. Electroencephalogr Clin Neurophysiol
- A comparative study of automatic techniques for ocular artifact reduction in spontaneous EEG signals based on clinical target variables: A simulation case. Comput Biol Med
- Validation of SOBI components from high-density EEG. NeuroImage
- The Freiburg Visual Acuity test–automatic measurement of visual acuity. Optom Vis Sci
- A blind source separation technique using second-order statistics. IEEE Trans Signal Process
- The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Front Neuroinform
- Spatiotemporal Analysis of Multichannel EEG: CARTOOL. Comput Intell Neurosci