Designing optimal spatial filters for single-trial EEG classification in a movement task

https://doi.org/10.1016/S1388-2457(98)00038-8

Abstract

We devised spatial filters for multi-channel EEG that lead to signals which discriminate optimally between two conditions. We demonstrate the effectiveness of this method by classifying single-trial EEGs, recorded during preparation for movements of the left or right index finger or the right foot. The classification rates for 3 subjects were 94, 90 and 84%, respectively. The filters are estimated from a set of multi-channel EEG data by the method of Common Spatial Patterns, and reflect the selective activation of cortical areas. By construction, we obtain an automatic weighting of electrodes according to their importance for the classification task. Computationally, this method is parallel by nature, and demands only the evaluation of scalar products. Therefore, it is well suited for on-line data processing. The recognition rates obtained with this relatively simple method are as good as, or higher than those obtained previously with other methods. The high recognition rates and the method's procedural and computational simplicity make it a particularly promising method for an EEG-based brain–computer interface.

Introduction

The study of surface EEG as a possible new communication channel for severely disabled persons has a long history (Nirenberg et al., 1971), and has received increased attention recently (e.g. Farwell and Donchin, 1988, Sutter and Tran, 1990, Wolpaw and McFarland, 1994, Kalcher et al., 1996, Pfurtscheller et al., 1997). The EEG allows the observation of gross electrical fields of the brain, and reflects changes in neural mass activity associated with various mental processes. Since some mental processes result in distinguishable EEGs, a person who can produce such mental processes at will has the potential to use them for communication. The feasibility of this communication depends on the extent to which the EEGs associated with these mental processes can be reliably recognized automatically. The electro-physiological phenomena investigated most in the quest for an automatic discrimination of mental states are event-related potentials (EP) (Farwell and Donchin, 1988, Sutter and Tran, 1990), and localized changes in spectral power of spontaneous EEG related to sensorimotor processes (see e.g. Wolpaw and McFarland, 1994, Kalcher et al., 1996, Pfurtscheller et al., 1997).

It is well known that planning and execution of movement leads to a short-lasting and circumscribed attenuation known as event-related desynchronization (ERD, Pfurtscheller and Aranibar, 1979) of rhythmic EEG components in the alpha and beta band (Gastaut, 1952, Chatrian et al., 1959, Kuhlman, 1978, Pfurtscheller et al., 1996). In the case of finger or hand movement, the desynchronization starts in the contralateral sensorimotor cortex during the planning phase and stays asymmetrical over both hemispheres until movement onset. Recordings from subdural electrodes show similar behavior, but the responses are more localized and the changes in spectral power are much enhanced (Toro et al., 1994).

The ERD of mu and central beta rhythms can be seen as a correlate of excited or activated sensorimotor areas, where thalamo-cortical information exchange and processing takes place (Steriade and Llinas, 1988). An interesting observation is that at the same moment in time, different cortical areas can display focal attenuated (ERD) and focal enhanced mu and beta components. The latter phenomenon is known as event-related synchronization (ERS) and may be seen as a correlate of actively inhibited or deactivated cortical areas (Pfurtscheller, 1992). It may be hypothesized that the structure controlling the patterns of simultaneous cortical desynchronization and synchronization, and hence, the gating of the thalamo-cortical information transfer, is the reticular thalamic nucleus (Yingling and Skinner, 1977). In the case of hand movement, ERD can be found over the hand area and ERS over the foot area. Foot or toe movement can result in a foot area ERD and simultaneously in a hand area ERS (Pfurtscheller et al., 1996). The observation of simultaneously attenuated and enhanced EEG rhythms can be used to classify brain states related to the planning or even imagination of different types of limb movements. Recently, imagined movements of the right and left hand were classified correctly in 80% of single trials in 3 trained subjects (Pfurtscheller et al., 1997).

Such imagination-prompted changes of sensorimotor rhythms have been suggested as a possible means to re-establish communication in patients with severe motor disturbances (Wolpaw et al., 1991, Wolpaw and McFarland, 1994). In order to be useful in practical applications, such a system must achieve close to 100% classification accuracy.

In many of these experiments, the EEG is recorded on a multitude of channels placed in a dense grid covering large parts of the brain. Given that the sensorimotor rhythms originate from very localized areas in the cortex, we expect that not all signals recorded from different sites contribute the same amount of information to the classification, and some may only contribute noise. The central electrodes overlying primary sensorimotor areas will be most important for discrimination. Since the skull and the scalp cause a spatial smearing of the cortical signals, electrodes close to sensorimotor areas will also contain relevant information. However, with increasing distance from sensorimotor areas, the recorded signal will be increasingly contaminated by cortical activity unrelated to the movement to be discriminated. Two consequences arise from this situation: first, the signals from different electrodes have to be weighted in some way, in order to reflect their relevance for the classification task. Second, the correlations between signals from neighboring electrodes can be used to suppress the noise in individual channels.

The weighting problem has been attacked mostly with ad hoc procedures, i.e. the features derived from an electrode are weighted or selected not by a criterion determined from the data directly, but a posteriori by their importance for the classifier. This is done, for example, by the modification to the Learning Vector Quantization scheme (LVQ) introduced by Pregenzer et al. (1994), in order to render it distinction sensitive (DSLVQ). Another example is Peters et al. (1997), who trained neural networks on the AR coefficients of each individual channel. Those networks that perform best on a validation set of trials become members of a committee, the size of which is chosen for optimal performance of the committee. Classification is decided by a ‘vote’ among the committee members. This classifier works well only when some channels have a high signal-to-noise ratio. The channels left out of the committee are those that bring more noise than signal to the decision-making. In a situation of low signal-to-noise ratios in all channels, this method will fail, even if a high signal-to-noise ratio can be achieved by spatial filtering.

The possibility of achieving better signal-to-noise ratios by making use of the inherent correlation between neighboring channels has so far not been exploited. On the contrary, one commonly extracted feature set, the spectral properties of the individual time-series, no longer contains this a priori information.

The method we advocate here is based on a decomposition of the raw signals into spatial patterns that are extracted from the data of two populations of EEGs in a manner that maximizes their differences. These spatial patterns provide a weighting of the electrodes, which is derived directly from the data. We will show how these patterns reflect the underlying physiological processes.

The method used for the extraction of the patterns from the data is based on the method of Common Spatial Patterns (CSP), which was introduced in the field of EEG analysis by Koles et al. (1990). They used the method to classify normal versus abnormal EEGs (Koles et al., 1994), to extract abnormal components from EEGs (Koles, 1991), and to localize sources (Koles et al., 1995, Soong and Koles, 1995).

In brief, this method takes as input two sets of spatial patterns representing two classes into which other sets of spatial patterns are later to be classified. Below, an element in such a set, a spatial pattern, will be the amplitudes of an N-channel EEG at a given instant in time. A set of patterns may consist of the T spatial patterns that make up the EEG of a single trial, recorded at T consecutive points in time. Or a set may consist of the union of recordings from several trials.

The latter is the case for the sets used to calibrate the method, that is, the sets of spatial patterns from which the method extracts the spatial features later used for classification. The former is the case when the EEG of a single trial is to be classified. In either case, we note that time is no more than an index distinguishing different patterns recorded at different times. Due to the temporal high-pass filtering, the signals have zero mean. Therefore, average patterns are unsuited to distinguish the classes and covariances have to be used instead.
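The point that averages carry no class information while covariances do can be illustrated with a small numerical sketch. The data below are synthetic, the two-channel setup is hypothetical, and normalizing each covariance by its trace is an assumption of this sketch rather than something stated in this excerpt.

```python
import numpy as np

def normalized_covariance(trial):
    """Spatial covariance of one trial of shape (N channels, T samples).

    Dividing by the trace is an assumption of this sketch; it removes
    differences in overall signal power between trials.
    """
    cov = trial @ trial.T
    return cov / np.trace(cov)

rng = np.random.default_rng(0)
T = 5000  # number of spatial patterns (time samples) in the set

# Two hypothetical zero-mean classes: class A has most of its variance
# on channel 0, class B has most of its variance on channel 1.
class_a = rng.standard_normal((2, T)) * np.array([[3.0], [1.0]])
class_b = rng.standard_normal((2, T)) * np.array([[1.0], [3.0]])

# Average patterns are close to zero for both classes ...
mean_a = class_a.mean(axis=1)
mean_b = class_b.mean(axis=1)

# ... whereas the covariance matrices differ clearly.
cov_a = normalized_covariance(class_a)
cov_b = normalized_covariance(class_b)
```

Here the time index only enumerates the patterns, as in the text: shuffling the columns of `class_a` would leave `cov_a` unchanged.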

As output, calibration of the method gives an ordered list of characteristic spatial patterns. These characteristic patterns define directions in pattern space that are optimally suited for distinguishing between the two classes. A time-series of patterns that belongs to either one or the other class, will, after an appropriate transformation, scatter maximally along the first direction and minimally along the last, if it belongs to the first class, and vice versa if it belongs to the other class. The second and the second-to-last directions in the list are the second best directions for this discrimination, and so on for the other directions in the list. Given an EEG, one then treats it as a set of T spatial patterns, projects these onto the most discriminatory characteristic patterns, and calculates the variance of the T values resulting from each projection. These variances are the features on which classification of the EEG is based, using a simple linear classifier. Typically only a few directions in pattern space, i.e. a few characteristic patterns, are sufficient for the discrimination task, so these patterns may be thought of as spatial filters that select the most relevant spatial aspects for the discrimination task. We thus obtain a drastic reduction in the dimensionality of the problem while making use of the information contained in all channels.
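The calibration and feature extraction just described can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: it assumes the usual CSP construction (whitening the composite covariance, then diagonalizing one class covariance in the whitened space), which this excerpt does not spell out. The function names, the mixing matrix, the log-normalization of the variances, and the threshold-at-zero decision rule standing in for the linear classifier are all hypothetical choices.

```python
import numpy as np

def normalized_covariance(trial):
    # Trace normalization is an assumption of this sketch.
    cov = trial @ trial.T
    return cov / np.trace(cov)

def csp_filters(trials_a, trials_b, n_pairs=1):
    """Return spatial filters (one per row) from two lists of (N, T) trials."""
    cov_a = np.mean([normalized_covariance(x) for x in trials_a], axis=0)
    cov_b = np.mean([normalized_covariance(x) for x in trials_b], axis=0)
    # Whiten the composite covariance cov_a + cov_b.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitening = np.diag(evals ** -0.5) @ evecs.T
    # In whitened space the two class covariances share eigenvectors and
    # their eigenvalues sum to one, so large variance for class A implies
    # small variance for class B along the same direction.
    lam, basis = np.linalg.eigh(whitening @ cov_a @ whitening.T)
    order = np.argsort(lam)[::-1]          # class-A variance, descending
    filters = basis[:, order].T @ whitening
    n = filters.shape[0]
    keep = list(range(n_pairs)) + list(range(n - n_pairs, n))
    return filters[keep]                   # first and last n_pairs directions

def log_variance_features(trial, filters):
    """Project the T patterns of a trial and return normalized log-variances."""
    var = (filters @ trial).var(axis=1)
    return np.log(var / var.sum())

# Synthetic 3-channel data: three sources mixed by a fixed, made-up matrix.
rng = np.random.default_rng(1)
mixing = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.1, 0.6, 1.0]])

def make_trial(source_scales, T=256):
    sources = rng.standard_normal((3, T)) * np.array(source_scales)[:, None]
    return mixing @ sources

trials_a = [make_trial([3.0, 1.0, 1.0]) for _ in range(20)]  # source 0 active
trials_b = [make_trial([1.0, 1.0, 3.0]) for _ in range(20)]  # source 2 active
filters = csp_filters(trials_a, trials_b, n_pairs=1)

def classify(trial):
    # Stand-in decision rule: compare the variances along the first and
    # last directions instead of training a linear classifier.
    f = log_variance_features(trial, filters)
    return "A" if f[0] > f[-1] else "B"

pred_a = [classify(make_trial([3.0, 1.0, 1.0])) for _ in range(10)]
pred_b = [classify(make_trial([1.0, 1.0, 3.0])) for _ in range(10)]
```

With `n_pairs=1`, each three-channel trial is reduced to two features, which illustrates the reduction in dimensionality mentioned above while every channel still contributes to each projection.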

The goal of the work presented in this paper was to pioneer the application of optimal spatial filters to the task of single-trial EEG classification. The data used were recorded during the planning phase of 3 different types of movement: left and right index finger movement and right foot movement. Some of the results presented here were reported in preliminary form in Müller-Gerking et al. (1997).

Experiment and data

The experimental protocol followed a classical memorized delay task. Three subjects were asked to perform one of 4 movements (pressing a micro-switch with the left or right index finger, flexing the toes of the right foot, or moving the tongue to the upper gum) after a series of stimuli. Each trial started with a short warning tone (warning stimulus, WS). One second after a WS, a visual cue (CUE) appeared on a computer screen in front of the subject, indicating which movement was to be done.

Results

In this paper, we concentrate on the time segment 0.5–1 s after RS. For most trials, this segment immediately precedes the actual movement; in some trials, movement actually happens within this time window. Classification rates as functions of experimental time will be presented elsewhere (Müller-Gerking, Pfurtscheller, and Flyvbjerg, in preparation).

Spatial patterns

The most discriminative spatial patterns shown in Fig. 1, Fig. 2, Fig. 3 are easily related to the ERD patterns found with movement preparation and execution (e.g. Pfurtscheller et al., 1996). For example, Fig. 1 shows that movements of the left index finger as compared with the movements of the right index finger are characterized by increased EEG activity on the electrodes overlying the ipsilateral hand representations of sensorimotor cortex, and vice versa for right finger movements.

Acknowledgements

J.M.G. gratefully acknowledges fruitful discussions with P. Grassberger, J. Martinerie, C. Neuper, B. Peters, B. Renault and F. Varela. H.F. thanks W. Bialek for useful discussions. The research was partially supported by the ‘Fonds zur Förderung der wissenschaftlichen Forschung’ in Austria, project P11208MED.

1 Supported in part by the Austrian ‘Fonds zur Förderung der wissenschaftlichen Forschung’, project P11208MED.
