Pattern analysis of EEG responses to speech and voice: Influence of feature grouping
Introduction
Electroencephalography (EEG) and magnetoencephalography (MEG) are commonly used to study the time course of neural information processing in the human brain with high temporal resolution. In most cases, EEG/MEG studies rely on the comparison of averaged responses to repeated presentations of experimental conditions, either in the temporal domain (event-related potentials [ERPs] or fields [ERFs], respectively) or in the frequency domain (event-related desynchronization and synchronization; Pfurtscheller and Lopes da Silva, 1999), or both. Often, the statistical analyses (and related inferences on neural processing) are limited to a priori specified (spectro-)temporal windows of interest – at channel or estimated source level – and therefore only a small subset of the measured signal is actually utilized.
This article illustrates several approaches to EEG data analysis based on pattern recognition (e.g. Bishop, 2007, Duda et al., 2001). In contrast to the conventional approach where a single dependent variable is examined (univariate statistics), these techniques exploit the information content in patterns of dependent variables (features), which are extracted from the measured signals. Pattern recognition allows analyzing EEG data in a more exploratory and data-driven manner and – similar to the recent developments in fMRI (e.g. Haynes and Rees, 2006) – promises to complement conventional approaches for EEG/MEG analysis.
A typical application of pattern recognition methods involves three steps: (1) extracting and selecting features (i.e. dependent variables), (2) learning a model with a machine-learning algorithm, and (3) determining the generalization ability of the learnt model on an independent evaluation dataset. In EEG/MEG, various types of features can be considered, ranging from signal amplitude in the temporal domain (e.g. Rieger et al., 2008) to power or phase information in the frequency domain (Kerlin et al., 2010, Luo and Poeppel, 2007, Rieger et al., 2008). Specific transformations, such as wavelet coefficients (Åberg and Wessberg, 2007, Rieger et al., 2008) and coherence measures (Besserve et al., 2007), can also be used. Furthermore, features can be grouped differently in the (spectro-)temporal and spatial domains. For example, limiting the information to pre-defined temporal windows of interest is essential to many realizations of EEG-based brain–computer interface (BCI) systems (e.g. Birbaumer, 2006, Blankertz et al., 2011, Wolpaw et al., 2002). Alternatively, the information contained in a sliding time interval of EEG data can be used, e.g. to detect the occurrence of seizures in epileptic subjects (Schad et al., 2008). In the spatial (channel) domain, many BCI systems employ spatial filters (i.e. linear combinations of channels; see Blankertz et al., 2011) to enhance performance. For the same reason, sophisticated feature selection or reduction methods have been applied in BCI systems (see Bashashati et al., 2007).
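As a concrete illustration of these grouping options, the sketch below builds single-trial feature vectors from a hypothetical EEG data array (trials × channels × samples) using numpy; all array shapes, window boundaries and channel indices are invented for illustration, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical single-trial EEG: 40 trials x 32 channels x 200 samples.
epochs = rng.standard_normal((40, 32, 200))

# (1) Predefined window: mean amplitude within a window of interest
#     (here samples 50-80), one feature per channel (multi-channel pattern).
win = epochs[:, :, 50:80].mean(axis=2)                     # shape (40, 32)

# (2) Shifting window: mean amplitude in consecutive 20-sample windows,
#     yielding one multi-channel pattern per window position.
n_win = 200 // 20
sliding = epochs.reshape(40, 32, n_win, 20).mean(axis=3)   # shape (40, 32, 10)

# (3) Whole trial: every sample of every channel enters the feature vector.
whole = epochs.reshape(40, -1)                             # shape (40, 6400)

# Single-channel variants simply restrict the pattern to one electrode:
single_chan = epochs[:, 7, :]                              # shape (40, 200)
```

Each of these feature matrices can then be passed to the same learning algorithm, which is what makes the grouping choice an independent dimension of the analysis.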
Several machine-learning algorithms have been used to learn the relation between selected features of the EEG/MEG data and experimental labels. These algorithms include simple correlation (e.g. Luo and Poeppel, 2007), support vector machines (SVMs) (Vapnik, 1995), linear discriminant analysis (LDA) (e.g. Duda et al., 2001), and neural networks or Bayesian approaches (Bishop, 2007). Most frequently, learning algorithms are based upon linear models (e.g. Lotte et al., 2007, Rieger et al., 2008, van Gerven et al., 2009) due to their fast computation, robustness and ease of interpretation.
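As an illustration of such a linear model, the following minimal sketch fits a two-class linear discriminant with a pooled, lightly regularized covariance. It is not the pipeline of any cited study; the data are simulated and the dimensions arbitrary.

```python
import numpy as np

def lda_fit(X, y):
    """Two-class linear discriminant with a shared (regularized) covariance.

    Returns weight vector w and bias b; sign(X @ w + b) gives the class.
    """
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance, with a small ridge for numerical stability.
    S = (np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)) / 2
    S += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(S, m1 - m0)
    b = -w @ (m0 + m1) / 2   # decision threshold at the midpoint of the means
    return w, b

rng = np.random.default_rng(1)
# Two simulated classes of 8-dimensional amplitude patterns, 50 trials each.
X = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(1, 1, (50, 8))])
y = np.repeat([0, 1], 50)
w, b = lda_fit(X, y)
acc = np.mean((X @ w + b > 0).astype(int) == y)
```

Because the decision function is a weighted sum of the features, the weight vector itself can be inspected, which is one reason linear models are preferred for interpretation.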
To determine the generalization ability of the computed model, an independent set of test data is required. This can be done at the single-subject level, splitting the measured data into training and testing sets (e.g. Luo and Poeppel, 2007), or across subjects, using a subset of subjects for training and the remaining subjects for evaluating the generalization performance (e.g. Kerlin et al., 2010).
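The two evaluation schemes can be sketched with a simple nearest-class-mean classifier on simulated data; subject counts, trial counts and effect sizes below are arbitrary choices for illustration.

```python
import numpy as np

def nearest_mean_fit(X, y):
    """Class means of a two-class training set."""
    return np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def nearest_mean_predict(means, X):
    """Assign each trial to the class with the closest mean."""
    d = np.stack([np.linalg.norm(X - m, axis=1) for m in means])
    return d.argmin(axis=0)

rng = np.random.default_rng(2)
# Simulated data: 4 subjects x 60 trials x 16 features; class means differ by 1.
n_sub, n_trial, n_feat = 4, 60, 16
y = np.tile(np.repeat([0, 1], n_trial // 2), n_sub)
subj = np.repeat(np.arange(n_sub), n_trial)
X = rng.standard_normal((n_sub * n_trial, n_feat)) + y[:, None]

# (a) Within-subject: split one subject's trials into train and test halves.
s = subj == 0
Xs, ys = X[s], y[s]
train = np.arange(n_trial) % 2 == 0               # odd/even trial split
m = nearest_mean_fit(Xs[train], ys[train])
within_acc = np.mean(nearest_mean_predict(m, Xs[~train]) == ys[~train])

# (b) Across subjects: train on subjects 1-3, test on held-out subject 0.
m = nearest_mean_fit(X[~s], y[~s])
across_acc = np.mean(nearest_mean_predict(m, X[s]) == y[s])
```

The across-subject scheme is the stricter test, since the model must generalize over inter-individual variability as well as trial-to-trial noise.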
In this study, we consider and evaluate the effects of differently combining and grouping the features in the temporal (predefined windows, shifting window, whole trial) and channel domain (single channel, multichannel) in the context of a neuro-cognitive EEG paradigm. Using Gaussian Naïve Bayes (GNB; Mitchell, 1997) classification, we analyze data from an auditory EEG study aimed at understanding the task dependence of the cortical mechanisms underlying the identification of speakers' voices and speech sounds (Bonte et al., 2009), and illustrate the results of each possible feature combination in the temporal and channel domain.
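A minimal Gaussian Naïve Bayes classifier of the kind referred to above (Mitchell, 1997) can be written in a few lines: each class is modeled by independent per-feature Gaussians (diagonal covariance), and a trial is assigned to the class with the highest log-likelihood. The sketch below uses simulated data, not the EEG data of the study.

```python
import numpy as np

def gnb_fit(X, y):
    """Gaussian Naive Bayes: per class, estimate mean and variance of each
    feature independently (diagonal covariance), plus the class prior."""
    classes = np.unique(y)
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    var = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
    prior = np.array([np.mean(y == c) for c in classes])
    return classes, mu, var, prior

def gnb_predict(model, X):
    classes, mu, var, prior = model
    # Log-posterior (up to a constant) of each trial under each class.
    ll = np.stack([
        -0.5 * np.sum(np.log(2 * np.pi * v) + (X - m) ** 2 / v, axis=1) + np.log(p)
        for m, v, p in zip(mu, var, prior)
    ])
    return classes[ll.argmax(axis=0)]

rng = np.random.default_rng(3)
# Two simulated classes of 10-dimensional feature patterns, 60 trials each.
X = np.vstack([rng.normal(0.0, 1, (60, 10)), rng.normal(0.8, 1, (60, 10))])
y = np.repeat([0, 1], 60)
model = gnb_fit(X, y)
acc = np.mean(gnb_predict(model, X) == y)
```

The independence assumption makes GNB fast to fit even when the feature vector is long (e.g. whole-trial, multi-channel groupings), which is one reason it suits this kind of exhaustive comparison.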
Materials and methods
Machine-learning approaches for the analysis of neuroimaging data require single trials to be described by an n-dimensional vector of features. In our approach, basic features are defined as EEG voltages and include time (samples) and measurement channels (electrodes). In particular, we consider six types of classification analyses derived from combining three types of feature grouping in the temporal domain (predefined windows, shifting windows, whole trial) with two approaches to handle the channel domain (single-channel and multi-channel classification).
Predefined windows
We first considered the classification of speakers and vowels in five predefined temporal intervals (N1, P2, N270, P340, LateP). Fig. 2a shows – for the single-channel case – group classification results for speaker and vowel grouping during the speaker (top panels) and vowel task (lower panels), respectively. To estimate reproducibility across subjects, we created topographic maps depicting, at each channel, the number of subjects with a significant classification performance.
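The per-subject significance counting described above can be implemented with a one-sided binomial test of each subject's per-channel accuracy against chance level; the accuracies and trial count below are invented for illustration.

```python
from math import comb

def binom_p_above_chance(k, n, p0=0.5):
    """One-sided binomial p-value: probability of observing k or more
    correct classifications out of n trials under chance level p0."""
    return sum(comb(n, j) * p0**j * (1 - p0)**(n - j) for j in range(k, n + 1))

# Hypothetical accuracies of 5 subjects at one electrode, each estimated
# from n = 100 test trials (chance = 0.5 for a two-class problem).
n = 100
acc = [0.61, 0.54, 0.63, 0.58, 0.50]
signif = [binom_p_above_chance(round(a * n), n) < 0.05 for a in acc]
count = sum(signif)   # number of subjects significant at this channel
```

Repeating this count at every electrode yields the values plotted in a topographic reproducibility map.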
Pattern recognition and EEG data
We have illustrated different strategies for analyzing EEG data using a pattern recognition algorithm. We have shown that it is feasible to distinguish experimental conditions above chance level, at the fine-grained level of speaker and vowel identity. Although low, our single-trial classification accuracies were significant even at the single-subject level, which indicates that – although noisy – EEG single-trial responses carry information on the neural processing of individual speech sounds.
Conclusions
We have illustrated different ways of analyzing EEG data by means of a pattern classification algorithm. Outcomes of the analyses show that grouping or separating the available features (channels, time windows) helps to highlight different aspects of the information content in the data. Because of the high temporal resolution of EEG (and MEG), a shifting window approach with sequential multi-channel classifications proved to be the most valuable, as it allows tracing the temporal evolution of stimulus processing.
Acknowledgments
Financial support by the Netherlands Organization for Scientific Research, Innovative Research Incentives Scheme VENI Grant 451-07-002 (MB) and VIDI Grant 452-04-330 (EF) is gratefully acknowledged. We thank Giancarlo Valente for comments and discussions.
References
- Belin et al. Thinking the voice: neural correlates of voice perception. Trends Cogn. Sci. (2004)
- Blankertz et al. Single-trial analysis and classification of ERP components — a tutorial. NeuroImage (2011)
- Delorme and Makeig. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Meth. (2004)
- De Martino et al. Combining multivariate voxel selection and support vector machines for mapping and classification of fMRI spatial patterns. NeuroImage (2008)
- von Kriegstein and Giraud. Distinct functional substrates along the right superior temporal sulcus for the processing of voices. NeuroImage (2004)
- Luo and Poeppel. Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron (2007)
- Pereira et al. Machine learning classifiers and fMRI: a tutorial overview. NeuroImage (2009)
- Pfurtscheller and Lopes da Silva. Event-related EEG/MEG synchronization and desynchronization: basic principles. Clin. Neurophysiol. (1999)
- Rieger et al. Predicting the recognition of natural scenes from single trial MEG recordings of brain activity. NeuroImage (2008)
- Schad et al. Application of a multivariate seizure detection and prediction method to non-invasive and intracranial long-term EEG recordings. Clin. Neurophysiol. (2008)