Construction of invariant features for time-domain EEG/MEG signals using Grassmann manifolds

A challenge in interpreting features derived from source-space electroencephalography (EEG) and magnetoencephalography (MEG) signals is residual mixing of the true source signals. A common approach is to use features that are invariant under linear and instantaneous mixing. In the context of this approach, it is of interest to know which invariant features can be constructed from a given set of source-projected EEG/MEG signals. We address this question by exploiting the fact that invariant features can be viewed as functions on the Grassmann manifold. By embedding the Grassmann manifold in a vector space, coordinates are obtained that serve as building blocks for invariant features, in the sense that all invariant features can be constructed from them. We illustrate this approach by constructing several new bivariate, higher-order, and multidimensional functional connectivity measures for static and time-resolved analysis of time-domain EEG/MEG signals. Lastly, we apply such an invariant feature derived from the Grassmann manifold to EEG data from comatose survivors of cardiac arrest and show its superior sensitivity in identifying changes in functional connectivity.

Author Summary

Electroencephalography (EEG) and magnetoencephalography (MEG) are techniques to non-invasively measure brain activity in human subjects. They work by measuring the electric potentials on the scalp (EEG) or the magnetic fluxes surrounding the head (MEG) that are induced by currents flowing in the brain's grey matter (the "brain activity"). However, reconstruction of brain activity from EEG/MEG sensor signals is an ill-posed inverse problem and, consequently, the reconstructed brain signals are linear superpositions of the true brain signals. This fact complicates the interpretation of the reconstructed brain activity. A common approach is to only use features of the reconstructed activity that are invariant under linear superpositions.

In this study we show that all invariant features of reconstructed brain signals can be obtained by taking combinations of a finite set of fundamental features. The fundamental features are parametrized by a high-dimensional space known as the Grassmann manifold, which has a rich geometric structure that can be exploited to construct new invariant features. Our study advances the systematic study of invariant properties of EEG/MEG data and can be used as a framework to systematize and interrelate existing results. We use the theory to construct a new invariant connectivity measure and apply it to EEG data from comatose survivors of cardiac arrest. We find that this measure enables superior identification of affected brain regions.


Introduction
Electroencephalographic (EEG) and magnetoencephalographic (MEG) signals arise from current sources within the brain's gray matter and are related to these sources by Maxwell's equations [1]. Because Maxwell's equations are linear, EEG/MEG sensor signals are (nearly) instantaneous linear mixtures of the underlying source signals, and this fact complicates their interpretation in terms of the distribution of underlying current sources [2]. This particularly applies to functional connectivity, which refers to statistical dependencies between two or more brain signals [3] and is one of the fundamental concepts in systems neuroscience. Thus, an observed statistical dependence between the signals measured at two different EEG/MEG sensors generally does not imply a statistical dependence between the signals from the brain regions underneath the sensors [4]. In other words, functional connectivity analysis of sensor EEG/MEG data is susceptible to spurious functional connectivity. Invasively recorded electrophysiological signals, such as cortical surface and local field potential recordings, are subject to the same mixing problem, although to a much lesser extent [5,6,7,8,9]. This mixing problem is usually dealt with in two steps. First, the EEG/MEG sensor signals are used to reconstruct the underlying source signals in the grey matter by inverting an EEG/MEG forward model. This is referred to as inverse modeling or source reconstruction [1,10,11]. Because the EEG/MEG inverse problem is ill-posed [1], inverse methods typically do not completely unmix the source signals, and so the reconstructed source signals are still instantaneous linear mixtures of the true source signals, although the mixing is less severe. Residual mixing of source signals is occasionally referred to as signal leakage [2] and complicates the assessment of functional connectivity. A common second step, therefore, is to analyze the reconstructed source signals by using connectivity measures that are invariant under
instantaneous linear mixing. Several such measures have been proposed, including the imaginary coherence [12], the lagged coherence [13], the (weighted) phase-lag index [14,15], symmetrized phase-modulation functions [16], and the multivariate interaction measure [17], to name just a few.
Within the context of this two-step approach, a relevant question is whether there are more invariant connectivity measures and whether there is a systematic way of obtaining them all. This question has been addressed only for stationary Gaussian frequency-domain signals [17]. In the current study, we address this question for arbitrary time-domain signals, thus including non-stationary and non-Gaussian signals. In other words, we construct a finite set of invariant connectivity measures from which all other invariant connectivity measures can be constructed by appropriately combining them. Although we are mainly interested in measures of functional connectivity, there exist invariant measures that characterize other aspects of multivariate signals and that have not yet been used to analyze EEG/MEG signals. Examples of such measures are the multivariate skewness and kurtosis measures proposed in [18], which relate to multivariate non-Gaussianity. Instead of the term measure we will use the more broadly used term feature. Thus, a feature is any real-valued function of a given EEG/MEG data set.
The first goal of our study is to obtain a finite set of invariant features in terms of which all other invariant features can be constructed. In mathematical invariant theory, such features are referred to as fundamental. Our second goal is to use these fundamental features to construct new invariant connectivity measures. Besides bivariate connectivity measures, we construct higher-order and multidimensional connectivity measures. The term higher-order connectivity refers to statistical dependencies between more than two signals that cannot be reduced to pairwise dependencies. So k-th order connectivity refers to the connectivity between k signals. Third- and fourth-order connectivity have been observed in spike-train data [19,20] and have recently received increased attention due to the discovery of discontinuous phase transitions in complex systems with higher-order interactions [21].
As far as we are aware, higher-order brain connectivity has not yet been investigated with EEG/MEG or extracellular electrophysiology. The term multidimensional connectivity refers to functional connectivity between multivariate signals [17,22,23] and can be used, for instance, to assess connectivity between regions-of-interest, each comprising k signals. We will construct both dynamic and static versions of such measures.
Let X be an n × k EEG/MEG data matrix of rank k, where n is the number of time points and k is the number of observed (i.e. reconstructed) source signals involved in the analysis. The construction of a fundamental set of invariant features for X is based on the following basic observation.
The columns of X can be viewed as vectors in n-dimensional Euclidean space R^n. Due to residual mixing, each of these vectors is a linear combination of the true, but unknown, source signals. The observed and true signals, however, span the same k-dimensional subspace of R^n. Both the observed and true signals can therefore be viewed as bases for this subspace, and the effect of mixing is a change of basis. In other words, invariant features of X are those that are independent of the chosen basis and thus only depend on the subspace itself, that is, on how the subspace is situated in R^n. It is therefore natural to construct features of EEG/MEG data matrices directly in terms of subspaces, as this ensures invariance under mixing. This observation reduces the problem of finding all invariant features of X to finding all functions on the space of k-dimensional subspaces of R^n. This space is known as the Grassmann manifold [24].
In Section 2 we illustrate the general approach by using a simple example, provide the necessary mathematical background, and introduce the Grassmann manifold. In Section 3 we discuss the Grassmann approach to the EEG/MEG mixing problem and give a formal definition of invariant features. In Section 4 we describe two fundamental sets of invariant features. In Section 5 we use the fundamental features to construct new bivariate, higher-order, and multidimensional features, and features based on spatial activation patterns. As a proof of concept, we further apply one of our invariant features to EEG data from comatose survivors of cardiac arrest (Section 6), and demonstrate that these invariant features are more sensitive to disease-induced effects than conventional functional connectivity metrics.

Illustrative example
In this section, we consider the simplest scenario in which we can define a non-trivial property of EEG data that is invariant under instantaneous linear mixing. Although simple, it contains most of the mathematical concepts that are needed to treat the general scenario. In Section 2.2 we will introduce these concepts in a more formal way. Let x = (x_1, x_2)^T ∈ R^2 be a reconstructed time-domain EEG signal from a given location within the brain and assume that either x_1 or x_2 is non-zero. Note that the signal is very short; it only comprises two time points. Due to signal leakage, we only know x up to an unknown non-zero scalar a.
In other words, if x′ is the true (but unknown) signal, then x = ax′ with a unknown. Is there anything that can be said about the true signal on the basis of the reconstructed signal? Let f(x) be a feature of the observed signal. A feature contains information about the true signal only if f(ax) = f(x) for all signals x and non-zero scalars a. We refer to such a feature as invariant under mixing. An example of a feature that is not invariant is the power of the signal, f(x) = x_1^2 + x_2^2, which is multiplied by a^2 when x is scaled by a. An example of a feature that is invariant is the power ratio f(x) = x_1^2/x_2^2. Note that the power ratio can only be calculated if x_2 ≠ 0. Because of mixing, we do not know x′ exactly, only that it lies on the line through x and the origin in R^2. We denote this line by [x]. Because we do not know where x′ lies on [x], no information is lost when working with [x] instead of with x. So there is a correspondence between the set of signals that cannot be distinguished from each other due to mixing and lines in R^2. In other words, by working with lines instead of with signals, the redundancy in the representation of the signals that is introduced by signal leakage is removed. We denote the space of lines through the origin in R^2 by RP^1. Invariant features correspond to functions f : RP^1 → R. How can we exploit this observation to obtain all invariant features of x?
The key property of RP^1 is that it is one-dimensional, in the sense that it can locally be parametrized by a single coordinate. Such a set is called a one-dimensional manifold. The one-dimensional manifold RP^1 is known as the projective line. For example, the non-horizontal line that is determined by the point (X, Y)^T ∈ R^2 with Y ≠ 0 can be mapped to its intersection (X/Y, 1) with the horizontal line y = 1, and this establishes a correspondence between the subset of RP^1 consisting of non-horizontal lines and (a copy of) R. The coordinate of the line hence is X/Y. Figure 1 provides an illustration. This implies that any function on RP^1 can be written as a function of the single coordinate X/Y, at least locally. In terms of the EEG signal x = (x_1, x_2)^T it means that, for x_2 ≠ 0, any invariant feature is a function of the ratio x_1/x_2. Thus, essentially there is only one independent invariant feature, namely the ratio x_1/x_2.
A drawback of the above approach to finding all invariant features of the signal x is that any parametrization of the projective line is necessarily local, because the projective line is topologically distinct from the real line. Another drawback is that any point on the projective line has many different parametrizations, and making a choice requires singling out a particular time point of x (in the above case the second time-point x_2), which is unnatural in most experimental contexts. In this study we therefore focus on a different approach, which is to embed the projective line in a vector space V and use the vector space coordinates to obtain global coordinates. These global coordinates will then serve as building blocks for invariant features, as explained below.
Which vector space V and embedding p : RP^1 → V should we use? We first note that lines through the origin in R^2 are one-dimensional linear subspaces of R^2 and that these correspond to 2 × 2 projection matrices of rank one: with every line in R^2 we can associate a unique projection matrix of rank one, namely the matrix of the orthogonal projection onto that line. Also, with every such matrix we can associate a unique line, namely the line onto which the matrix projects. So we take V to be the vector space of symmetric 2 × 2 matrices and let p(x) be the orthogonal projection matrix onto the line [x]:

p(x) = xx^T/(x^T x) = 1/(x_1^2 + x_2^2) [ x_1^2, x_1 x_2 ; x_1 x_2, x_2^2 ].

This embedding allows us to identify the projective line with its image under the function p, which is the (non-linear) subspace of V consisting of projection matrices of rank one. Note that because the matrix p(x) is symmetric, the vector space V is three-dimensional, and hence we obtain the following coordinate function c : RP^1 → R^3:

c(x) = 1/(x_1^2 + x_2^2) (x_1^2, x_1 x_2, x_2^2)^T.

Note that the coordinates of c(x) are indeed invariant under scaling of x by non-zero numbers a.
Furthermore, every invariant feature of x can be built from these three coordinates, and every function of the coordinates of c(x) is an invariant feature of x. For example, the invariant feature f(x_1, x_2) = x_1^2/x_2^2 can be built by dividing the first coordinate of c(x) by its third coordinate. We do note that the coordinates of c(x) are not independent. Indeed, since the projective line is one-dimensional, there can only exist one independent invariant, whereas c(x) has three coordinates. So there must be two independent relations between the coordinates of c(x). Indeed, we see that c_1(x)c_3(x) = c_2(x)^2 and c_1(x) + c_3(x) = 1.
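The coordinate function above is easy to verify numerically. The following minimal sketch (the helper name `c` and the test values are ours) checks the invariance of c(x) under scaling, recovers the power ratio, and confirms the two relations between the coordinates:

```python
import numpy as np

def c(x):
    """Coordinates of the rank-one projection p(x) = x x^T / (x^T x):
    the three independent entries of the symmetric 2x2 matrix."""
    x1, x2 = x
    n = x1**2 + x2**2
    return np.array([x1**2, x1 * x2, x2**2]) / n

x = np.array([3.0, 4.0])
a = -2.5  # arbitrary non-zero mixing scalar

# Invariance under scaling: c(ax) == c(x)
assert np.allclose(c(a * x), c(x))

# The invariant power ratio x1^2/x2^2 is the 1st coordinate over the 3rd
assert np.isclose(c(x)[0] / c(x)[2], x[0]**2 / x[1]**2)

# The two relations between the coordinates
cx = c(x)
assert np.isclose(cx[0] * cx[2], cx[1]**2)  # c1*c3 = c2^2
assert np.isclose(cx[0] + cx[2], 1.0)       # trace of a rank-one projection
```

The second relation is the statement that a rank-one projection matrix has trace one.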

Mathematical background
In this section we list several mathematical concepts that are used in later sections. Each concept is illustrated by the example from Section 2.1 and some further examples that are relevant for the study of invariant features of EEG/MEG data matrices.
EEG/MEG data frame. An EEG/MEG data frame (or simply a frame) is an n × k matrix whose columns are linearly independent EEG/MEG signals of length n from k different locations within the brain (so k ≤ n). A frame is denoted by X. We also refer to X as a k-frame in R^n. A frame is called orthonormal if its columns have unit norm and are orthogonal, or equivalently if X^T X = I_k, where I_k is the k × k identity matrix. Figure 2 provides an illustration of a general and an orthonormal frame.
Example: For n = 2 and k = 1 we obtain the example from Section 2.1. More generally, for k = 1, frames correspond to EEG/MEG signals of length n.
Canonical angles and correlations. Let V_1 and V_2 be k_1- and k_2-dimensional subspaces of R^n and let m = min{k_1, k_2}. The canonical angles θ_1, . . ., θ_m between V_1 and V_2 completely characterize the relative orientations of V_1 and V_2 in R^n and are defined recursively as follows. The first canonical angle θ_1 is defined by cos θ_1 = p_1^T q_1, where p_1 ∈ V_1 and q_1 ∈ V_2 are unit vectors that maximize the inner product p_1^T q_1. The second canonical angle θ_2 is defined by cos θ_2 = p_2^T q_2, where p_2 ∈ V_1 and q_2 ∈ V_2 are unit vectors that maximize the inner product p_2^T q_2 and are orthogonal to the first canonical vectors (i.e. p_2^T p_1 = 0 and q_2^T q_1 = 0). Continuing in this way we obtain m unique canonical angles. Figure 3 provides an illustration.
The canonical angles can be calculated by using the singular value decomposition as follows [25].
Take orthonormal frames Y_1 and Y_2 that span V_1 and V_2, respectively, and compute the singular value decomposition of the matrix Y_1^T Y_2. The singular values σ_1 ≥ ⋯ ≥ σ_m are the cosines of the canonical angles: σ_i = cos θ_i.

Let X_1 and X_2 be k_1- and k_2-frames in R^n. The canonical correlations between X_1 and X_2 are the cosines of the canonical angles between the column spans [X_1] and [X_2] of X_1 and X_2. They characterize the properties of the cross-covariance matrix X_1^T X_2 that are independent of the bases in which the frames are expressed. They can be calculated by whitening X_1 and X_2 to obtain orthonormal frames Y_1 and Y_2 (see below) and subsequently calculating the singular values of the cross-covariance matrix Y_1^T Y_2 between the whitened frames.
Example: The canonical angle between two vectors x_1 and x_2 in R^n coincides with the ordinary notion of angle, and their canonical correlation coincides with the Pearson correlation coefficient (for mean-centered signals).
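The SVD-based recipe above can be sketched in a few lines of NumPy (the function name and test data are ours; QR decomposition serves as the orthonormalization step):

```python
import numpy as np

def canonical_correlations(X1, X2):
    """Canonical correlations between the column spans of two frames:
    orthonormalize each frame (via QR), then take the singular values
    of the cross-covariance of the orthonormal frames."""
    Y1, _ = np.linalg.qr(X1)
    Y2, _ = np.linalg.qr(X2)
    s = np.linalg.svd(Y1.T @ Y2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against round-off slightly above 1

rng = np.random.default_rng(0)
X1 = rng.standard_normal((100, 3))
X2 = rng.standard_normal((100, 2))

rho = canonical_correlations(X1, X2)  # cosines of the canonical angles
theta = np.arccos(rho)                # the canonical angles themselves

# Invariance under mixing: right-multiplying by invertible matrices
# does not change the column spans, hence not the correlations.
A1 = rng.standard_normal((3, 3)) + 3 * np.eye(3)
A2 = rng.standard_normal((2, 2)) + 3 * np.eye(2)
assert np.allclose(canonical_correlations(X1 @ A1, X2 @ A2), rho)

# For two single vectors, the canonical correlation is the absolute
# cosine of the ordinary angle between them.
u, v = rng.standard_normal(100), rng.standard_normal(100)
r = canonical_correlations(u[:, None], v[:, None])[0]
assert np.isclose(r, abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```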
Manifold. A manifold M of dimension n is a space that locally looks like n-dimensional Euclidean space R^n. This means that for every point p ∈ M there exists a neighbourhood U ⊆ M and a map ϕ : U → R^n that is one-to-one and onto its image. The image ϕ(q) of a point q ∈ U contains the local coordinates of q. Intuitively, one can think of n-dimensional manifolds as curved n-dimensional spaces. Figure 4 provides an illustration.
Examples: Euclidean space R^n is a manifold of dimension n. Other examples are the one-dimensional sphere S^1 from Section 2.1, which is a manifold of dimension 1, and the two-sphere S^2, which is a manifold of dimension 2. The space of k-frames in R^n is an nk-dimensional manifold called the non-compact Stiefel manifold. We will denote it by F_{n,k}. The space of orthonormal k-frames in R^n is a manifold of dimension nk − k(k + 1)/2 called the compact Stiefel manifold. We will denote it by F^0_{n,k}. The space of k-dimensional subspaces (k-planes) of R^n is a manifold of dimension k(n − k) called the Grassmann manifold. We will denote it by G_{n,k}. The Stiefel manifold can be mapped to the Grassmann manifold by the map π : F_{n,k} → G_{n,k}, π(X) = [X], where [X] denotes the span of the columns of X. The pre-image π^{−1}(V) of a k-plane V consists of all k-frames that span V.
Whitening transformation. A whitening transformation for a k-frame X ∈ F_{n,k} is an invertible k × k matrix W such that the whitened frame Y = XW is orthonormal, i.e. Y^T Y = I_k. An example is W = (X^T X)^{−1/2}.

Group action. Let G be a group and let X be an arbitrary set. G is said to act on X if every g ∈ G induces a bijection L_g : X → X from X to itself, such that L_e is the identity map on X and L_{gh} = L_g ∘ L_h, where ∘ denotes composition of maps.
Examples: Let X be the non-compact Stiefel manifold F_{n,1} of 1-frames in R^n (i.e. the manifold of EEG signals of length n). The group R \ {0} acts on F_{n,1} by scalar multiplication: L_a x = ax. This describes the effect of mixing in the example from Section 2.1. More generally, the general linear group GL_k acts on F_{n,k} by right-multiplication: L_A X = XA. Similarly, the orthogonal group O_k acts on the compact Stiefel manifold F^0_{n,k} by right-multiplication.
Orbit. Let G be a group acting on a set X and let x ∈ X be an element of X. The subset of X that is traced out by letting elements of G act on x is called the orbit of x. It is the set O(x) = {L_g x : g ∈ G}.

Examples: The orbit of an EEG signal x ∈ F_{n,1} under the above action of R \ {0} is the line in R^n that is spanned by x, excluding the origin. More generally, the orbit of a k-frame X in R^n under the action of the general linear group GL_k on the non-compact Stiefel manifold F_{n,k} is the k-plane [X] that is spanned by the columns of X, again excluding the origin.
Orbit space. Let G be a group acting on a space X. The space of all orbits of G is called the orbit space and is denoted by X/G. So points on X/G correspond to orbits. There is a map π : X → X/G that maps x to its orbit [x].
Examples: The orbit space of the action of R \ {0} on F_{2,1} is the projective line RP^1 (see Section 2.1). More generally, the orbit space of R \ {0} acting on F_{n,1} is (n − 1)-dimensional projective space RP^{n−1}. Even more generally, the orbit space of the action of GL_k on F_{n,k} is the Grassmann manifold G_{n,k}. So points on G_{n,k} correspond to k-planes in R^n. The Grassmann manifold is also equal to the orbit space of the action of the orthogonal group O_k on the compact Stiefel manifold F^0_{n,k}. An explicit bijection between these two representations of the Grassmann manifold is the map that sends the O_k-orbit of an orthonormal frame Y to the k-plane [Y] spanned by its columns.

Alternating tensor. Let V be an n-dimensional vector space. After choosing a basis, V can be identified with R^n. A k-tensor T on V can now be thought of as a k-dimensional array with sides of length n. The entries of T are the coordinates of the tensor with respect to the chosen basis. We denote the entries of T by T_{i_1,...,i_k}, where each of the indices runs from 1 to n. So a 1-tensor on R^n is an n-dimensional vector with entries T_{i_1} for i_1 = 1, . . ., n, and a 2-tensor on R^n is an n × n matrix with entries T_{i_1,i_2} for i_1, i_2 = 1, . . ., n, etc. Any permutation σ of the set of k letters can be written as the concatenation of transpositions, and the sign of σ is sgn(σ) = (−1)^m, where m is the number of transpositions. A permutation is odd if its sign is negative and even if its sign is positive. A k-tensor T on R^n is alternating if, under permutation of its subscripts, it changes sign according to the sign of the permutation: T_{i_σ(1),...,i_σ(k)} = sgn(σ) T_{i_1,...,i_k} for all permutations σ of k letters. Alternating k-tensors on R^n form a vector space of dimension (n choose k), which is commonly denoted by Λ^k(R^n).

Examples: Alternating 2-tensors on R^n satisfy T_{j,i} = −T_{i,j} and hence are the same as anti-symmetric n-dimensional matrices. Similarly, alternating 3-tensors on R^n satisfy T_{j,i,k} = T_{i,k,j} = T_{k,j,i} = −T_{i,j,k} and T_{j,k,i} = T_{k,i,j} = T_{i,j,k}.

Wedge product. Let x_1, . . ., x_k be vectors in R^n. With these vectors we can associate an alternating k-tensor on R^n which is denoted by x_1 ∧ ⋯ ∧ x_k. Its entries are given by (x_1 ∧ ⋯ ∧ x_k)_{i_1,...,i_k} = det [x_j(i_m)]_{m,j}, for i_1, . . ., i_k ranging between 1 and n, where x_j(m) is the m-th entry of x_j and [x_j(i_m)]_{m,j} is the k × k matrix whose (m, j)-th entry is x_j(i_m). The magnitude of x_1 ∧ ⋯ ∧ x_k (i.e. the square root of the sum of its squared entries over increasing indices i_1 < ⋯ < i_k) equals the k-dimensional volume of the parallelepiped spanned by x_1, . . ., x_k.
Example: Let x = (x_1, x_2, x_3)^T and y = (y_1, y_2, y_3)^T be vectors in R^3. Their wedge product is the following 2-tensor on R^3: (x ∧ y)_{i,j} = x_i y_j − x_j y_i. Note that x ∧ y is an anti-symmetric matrix and hence is determined by its three upper-triangular entries. These entries can be placed in a 3-dimensional vector and form the cross-product between x and y: x × y = (x_2 y_3 − x_3 y_2, x_3 y_1 − x_1 y_3, x_1 y_2 − x_2 y_1)^T. Thus, the wedge product is a generalization of the cross-product to more than two vectors of higher dimension. The magnitude of the cross-product x × y is the surface area (i.e. the 2-volume) of the parallelogram (i.e. a 2-dimensional parallelepiped) spanned by x and y.
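The wedge product, its relation to the cross-product, and the volume interpretation can be checked with a brute-force sketch (the dense-array representation and helper name are ours; for larger n and k one would use the Plücker coordinates directly rather than a full array):

```python
import numpy as np
from itertools import combinations

def wedge(*vectors):
    """Alternating k-tensor x1 ^ ... ^ xk on R^n, stored as a dense
    k-dimensional array: entry (i1,...,ik) is det [x_j(i_m)]_{m,j}."""
    M = np.column_stack(vectors)  # n x k
    n, k = M.shape
    T = np.zeros((n,) * k)
    for idx in np.ndindex(*T.shape):
        T[idx] = np.linalg.det(M[list(idx), :])  # rows i1..ik of M
    return T

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, -1.0])
T = wedge(x, y)

# The 2-tensor is anti-symmetric and its independent entries form the
# cross-product x × y (up to the standard ordering of components).
assert np.allclose(T, -T.T)
assert np.allclose([T[1, 2], T[2, 0], T[0, 1]], np.cross(x, y))

# Magnitude over increasing indices = 2-volume = sqrt(det of Gram matrix).
M = np.column_stack([x, y])
vol = np.sqrt(np.linalg.det(M.T @ M))
mag = np.sqrt(sum(T[i, j] ** 2 for i, j in combinations(range(3), 2)))
assert np.isclose(mag, vol)
```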
Projectivation of a vector space. Let V be an n-dimensional real vector space. The projectivation of V is the space of lines through the origin in V. It is obtained by identifying each non-zero vector v ∈ V with its scalar multiples av for a ∈ R \ {0}. We will denote the projectivation of V by P(V).
Examples: The projectivation of R^3 is the space of lines in R^3. Since lines in R^3 correspond to pairs of antipodal points on the unit sphere S^2 (any antipodal pair uniquely determines a line through the origin), the projectivation of R^3 is the projective plane RP^2, which also equals G_{3,1}. Topologically, it is obtained by taking S^2 and "gluing" (i.e. identifying) antipodal points. Figure 5 provides an illustration.
A geometric approach to the EEG/MEG mixing problem

The EEG/MEG mixing problem
In this section we describe the mixing problem for EEG/MEG data matrices, give a geometric interpretation of it, and explain the relation to the Grassmann manifold.
Let X′ be a real n × k matrix whose columns carry the true source signals from k locations within the brain. Here n ≥ k is the number of time-points. We assume that the signals are linearly independent so that X′ has rank k. We refer to an n × k matrix of rank k as a k-frame in R^n (see Section 2.2).
Let X be the real n × k matrix whose columns carry the observed (i.e. reconstructed) source signals from the same k brain locations. Due to residual mixing, the observed signals are instantaneous linear mixtures of the true signals: X = X′A, where A is a k × k matrix with real entries, which we will refer to as the mixing matrix. The mixing problem for source-space EEG/MEG data matrices is that we do not observe X′, but only X.
Two main approaches to the mixing problem are the following. The first is recent and is restricted to situations in which the mixing matrix is known. This is the case if the projection from sensor-space to source-space is carried out by a linear transformation such as the minimum norm solution, low-resolution electrical tomography, or a linearly constrained minimum variance beamformer [26,27,10].
The mixing matrix then coincides with the resolution matrix associated with the inverse transformation [28,29] and this fact can be used to define invariant connectivity measures that depend on the mixing matrix [30,31,29,32].In this approach, the mixing matrix is not required to be invertible.
We will not discuss this approach in the current study.
The second approach does not require knowledge of the mixing matrix. The strategy is to define connectivity measures that are invariant under mixing by all possible mixing matrices. Examples of such measures were mentioned in Section 1 and include the imaginary coherence [12], the lagged coherence [13], the (weighted) phase-lag index [14,15], symmetrized phase-modulation functions [16], and the multivariate interaction measure [17]. This approach requires the mixing matrix to be invertible, which can be a valid assumption, depending on which brain locations are included in the analysis. We focus on this second approach and our goal is to determine all possible invariant features of EEG/MEG data matrices, including functional connectivity measures. In the remainder of the text, we therefore assume that the mixing matrix A is invertible.
To explain the role of the Grassmann manifold in solving the EEG/MEG mixing problem, we first describe the mixing problem in geometric terms. The key observation is that, because the mixing matrix A is invertible, the columns of the observed frame X and those of the true frame X′ span the same k-dimensional subspace of R^n. This is illustrated in Figure 6. A k-frame in R^n corresponds to a basis of a k-dimensional subspace (i.e. a k-plane) of R^n, and the effect of mixing is to change to a different basis. Hence the properties of a k-frame that are invariant under mixing are those that are independent of the choice of basis and thus only depend on how the k-plane is embedded in R^n. Examples of invariant properties are the angles that the k-plane makes with the standard coordinate axes of R^n.
We now make the connection with the Grassmann manifold G_{n,k} of k-planes in R^n. We first note that k-dimensional mixing matrices form the general linear group GL_k (see Section 2.2). Recall from Section 2.2 that EEG/MEG data frames correspond to points on the non-compact Stiefel manifold F_{n,k}. With every mixing matrix A ∈ GL_k we can associate an invertible map L_A : F_{n,k} → F_{n,k} from the non-compact Stiefel manifold to itself, namely the map L_A(X) = XA. This map thus describes the effect of mixing on EEG/MEG data frames. It defines an action of the group GL_k on F_{n,k} (see Section 2.2). Also recall from Section 2.2 that the orbit O(X) of an EEG/MEG data frame X ∈ F_{n,k} is the subspace of F_{n,k} that is traced out when letting GL_k act on X: O(X) = {XA : A ∈ GL_k}. The orbit of X consists of those frames that have the same column space as X. In other words, the orbit of X is the k-plane in R^n that is spanned by X. This plane is denoted by [X]. Because of mixing, EEG/MEG data frames that are in the same orbit cannot be distinguished, and it is therefore more natural to work in the orbit space, rather than with frames. As explained in Section 2.2, the orbit space is the Grassmann manifold G_{n,k}.
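This orbit picture can be checked numerically. In the sketch below (variable names and test data are ours), a random "true" frame is mixed by a random invertible matrix; stacking the observed and true frames side by side does not increase the rank, confirming that both span the same k-plane, i.e. lie in the same GL_k-orbit:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 4
X_true = rng.standard_normal((n, k))             # true source frame X'
A = rng.standard_normal((k, k)) + 2 * np.eye(k)  # invertible mixing matrix
X_obs = X_true @ A                               # observed frame X = X'A

# Both frames have rank k, and their horizontal concatenation still has
# rank k: the observed and true signals span the same k-plane in R^n.
assert np.linalg.matrix_rank(X_true) == k
assert np.linalg.matrix_rank(X_obs) == k
assert np.linalg.matrix_rank(np.hstack([X_true, X_obs])) == k
```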

Mixing-invariant features
A feature of an EEG/MEG data frame is a real-valued function f : F_{n,k} → R. Note that in this definition we have assumed that a feature is defined for every data frame X ∈ F_{n,k}. In this study, we are interested in features that are invariant under linear and instantaneous mixing, and particularly in invariant measures of functional connectivity.
We refer to a feature f as invariant under linear and instantaneous mixing if f(XA) = (det A)^w f(X) for all A ∈ GL(k) and X ∈ F_{n,k} and for some integer w called the weight of f. Following standard practice in invariant theory [33], we refer to features with zero and non-zero weights as absolute and relative invariants, respectively. EEG studies mostly focus on absolute invariant features such as the imaginary part of coherence, the (weighted) phase-lag index, symmetrized phase-modulation functions, the multivariate interaction measure, and the standardized imaginary covariance.
There are, however, several reasons for preferring the more general definition in Eq. (11). First, while absolute invariant features can be used as estimators for corresponding population quantities, relative invariant features can be used as statistics for testing statistical hypotheses. An example is the imaginary part of the cross-spectrum, which is an invariant feature with weight 2. Although it depends on the power of the signals and therefore is not a good measure of functional connectivity, it can be used in testing for significant imaginary coherence, because the imaginary coherence is zero if and only if the imaginary part of the cross-spectrum is zero. Second, from a mathematical perspective, including relative invariant features is more natural [33], and many absolute invariant features can be constructed by taking ratios of relative invariant features. An example is the lagged coherence, which is the ratio of two invariants of weight 2.
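As a toy illustration of a relative invariant (our own example, not one of the measures discussed above), the Gram determinant f(X) = det(X^T X) transforms with weight 2 under mixing:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 50, 3
X = rng.standard_normal((n, k))
A = rng.standard_normal((k, k)) + 2 * np.eye(k)  # invertible mixing matrix

# f(X) = det(X^T X) is a relative invariant of weight 2:
# f(XA) = det(A^T X^T X A) = det(A)^2 f(X).
f = lambda Z: np.linalg.det(Z.T @ Z)
assert np.isclose(f(X @ A), np.linalg.det(A) ** 2 * f(X))
```

Dividing two such weight-2 invariants would cancel the det(A)^2 factor and yield an absolute invariant, mirroring the lagged-coherence construction mentioned above.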

Fundamental invariant features
The Grassmann manifold G_{n,k} is a manifold of dimension k(n − k) and hence coordinates can be constructed in a neighborhood of every point, yielding k(n − k) independent invariant features. However, these features are only defined locally and therefore do not cover all EEG/MEG data frames. Furthermore, the standard way of calculating local coordinates requires selecting k arbitrary time-points, which is unnatural in most experimental settings. By embedding the Grassmann manifold in a vector space, the coordinates of the vector space can be restricted to the (embedded) Grassmann manifold and hence yield coordinates that are defined for all EEG/MEG data frames. This is illustrated in Figure 7.
Recall from Section 3.2 that invariant features can be viewed as real-valued functions on the Grassmann manifold. Any invariant feature can thus be expressed in these coordinates, and they therefore constitute building blocks for invariant features. In the context of EEG/MEG data, we will refer to these coordinates as fundamental features.
We consider two sets of fundamental features. The fundamental affine features are obtained by embedding the Grassmann manifold G_{n,k} in the vector space of symmetric n × n matrices, which gives n(n + 1)/2 fundamental features. The fundamental projective features are obtained by embedding the Grassmann manifold in (the projectivation of) the vector space of alternating k-tensors on R^n, which gives (n choose k) − 1 fundamental features. Because any point [X] on the Grassmann manifold can be represented by either affine or projective features, the corresponding coordinates contain the same information about [X]. However, the affine and projective features highlight different aspects of the EEG/MEG data frame X, and one of them can be more natural than the other in a given setting.

Fundamental affine features
The fundamental affine features of a data frame X ∈ F_{n,k} are obtained by embedding the Grassmann manifold G_{n,k} in the vector space of real symmetric n × n matrices. This is done by mapping the k-plane [X] to the orthogonal projection matrix onto that k-plane: P(X) = X(X^T X)^{−1}X^T. A more precise notation would be P([X]), but we will use P(X) to emphasize that the projection matrix can directly be calculated from the data frame X. In order for the map P to be well-defined on G_{n,k}, the matrix P(X) should not depend on which k-frame X is used to represent the k-plane [X].
So we need that P(XA) = P(X) for all real invertible k × k matrices A. That this is indeed the case follows directly from Eq. (12). Furthermore, the map P is indeed an embedding (i.e., it is one-to-one), because any two distinct k-planes have different projection matrices. The image of G_{n,k} under this embedding is the set of n-dimensional projection matrices of rank k.
The Grassmann manifold G_{n,k} can hence be thought of as the (non-linear) subspace of the vector space of symmetric n × n matrices consisting of the orthogonal projection matrices of rank k, or equivalently, with trace k: G_{n,k} ≅ {P ∈ Sym_n : P^2 = P, tr(P) = k}, where Sym_n denotes the vector space of n-dimensional real symmetric matrices [24]. The dimension of Sym_n is n(n + 1)/2, and so the embedding can be used to associate with every EEG/MEG data matrix X a set of n(n + 1)/2 invariant features, namely the upper-triangular entries of the matrix P(X). Since invariant features correspond to functions on G_{n,k}, any invariant feature can be expressed in terms of these. We will refer to the upper-triangular entries of P(X) as the fundamental affine features of X.
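As a numerical illustration (a sketch in NumPy; the dimensions and random data are arbitrary), the projection matrix P(X) = X(X^T X)^{−1} X^T can be computed directly from a data frame, and its entries are unchanged when X is mixed by any invertible k × k matrix:

```python
import numpy as np

def projection_matrix(X):
    """Orthogonal projection P(X) = X (X^T X)^{-1} X^T onto the column span of X."""
    return X @ np.linalg.solve(X.T @ X, X.T)

rng = np.random.default_rng(0)
n, k = 50, 4
X = rng.standard_normal((n, k))        # an EEG/MEG data frame
A = rng.standard_normal((k, k))        # an (almost surely invertible) mixing matrix

P1 = projection_matrix(X)
P2 = projection_matrix(X @ A)          # the mixed frame represents the same k-plane

assert np.allclose(P1, P2)             # P(XA) = P(X): the entries are invariant
assert np.allclose(P1 @ P1, P1)        # idempotent: P(X) is a projection
assert np.isclose(np.trace(P1), k)     # trace equals the rank k
```

The assertions check the three properties used in the text: invariance under mixing, idempotency, and trace k.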
What is the interpretation of the fundamental affine features in the context of EEG/MEG data?
Let X be an n × k data frame with rows x_1, . . ., x_n. The vector x_i ∈ R^k is the spatial pattern (across the k brain locations) at time-point i. The (i, j)-th entry of P(X) is P(X)_{i,j} = x_i^T (X^T X)^{−1} x_j. (14) The fact that any invariant feature is a function of these entries is a classic result in statistics [34]. Eq. (14) shows that P(X)_{i,j} is the covariance between the spatial patterns at times i and j of the whitened data matrix. To make this more explicit, note that whitening of X does not change its column span: if W is a whitening transformation for X and Y = XW is the whitened data frame, then [Y] = [X], that is, X and Y correspond to the same point on the Grassmann manifold. We can therefore just as well work with Y, so that Eq. (14) becomes P(Y)_{i,j} = y_i^T y_j. Write y_1, . . ., y_n for the rows of Y and y^1, . . ., y^k for its columns. So y_i ∈ R^k is the spatial pattern at time-point i of the whitened data matrix and y^j ∈ R^n is the observed signal at location j. The (i, j)-th entry of P(Y) hence is the covariance between the spatial patterns y_i and y_j at time-points i and j. We see that P(Y) is the temporal covariance matrix of the whitened EEG/MEG signals. In particular, its i-th diagonal entry P(Y)_{i,i} = ∥y_i∥^2 is the total power of the spatial pattern at time-point i. In Section 5.4 we will use the matrix P(Y) to construct invariant features of EEG/MEG data frames that relate to multivariate non-Gaussianity.
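The whitening interpretation can be checked in the same way (a sketch; here the whitening matrix W is taken to be the inverse symmetric square root of X^T X, one of several valid choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 3
X = rng.standard_normal((n, k))

# Whitening: W = (X^T X)^{-1/2}, so that Y = XW has orthonormal columns.
evals, evecs = np.linalg.eigh(X.T @ X)
W = evecs @ np.diag(evals ** -0.5) @ evecs.T
Y = X @ W

assert np.allclose(Y.T @ Y, np.eye(k))            # columns are whitened

P = Y @ Y.T                                       # temporal covariance of Y
P_direct = X @ np.linalg.solve(X.T @ X, X.T)      # P(X) computed from X itself
assert np.allclose(P, P_direct)                   # whitening does not change the k-plane

# Diagonal entries are the total power of the spatial pattern at each time-point.
assert np.allclose(np.diag(P), np.sum(Y ** 2, axis=1))
```

This confirms that P(Y) = Y Y^T equals P(X), i.e. that the fundamental affine features can be read off from the temporal covariance matrix of the whitened signals.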

Fundamental projective features
The fundamental projective features of a data frame X ∈ F_{n,k} are obtained by embedding the Grassmann manifold G_{n,k} in the vector space Λ^k R^n of alternating k-tensors on R^n. The embedding map is known as the Plücker embedding. The image of a point [X] ∈ G_{n,k} is obtained by taking the wedge product of the columns x^1, . . ., x^k of X: ∧^k(X) = x^1 ∧ x^2 ∧ · · · ∧ x^k, which is an alternating k-tensor on R^n (see Section 2.2). This image is determined by [X] only up to a scaling factor: ∧^k(XA) = det(A) ∧^k(X) (18) for all invertible k × k matrices A. To make the map well-defined, all scalar multiples of the tensor ∧^k(X) need to be considered as the same object. We need, in other words, to work with lines in Λ^k R^n rather than with individual tensors, that is, in the projectivation P(Λ^k R^n) of the vector space Λ^k R^n (see Section 2.2). In the context of EEG/MEG data, we will refer to points on the embedded Grassmann manifold as fundamental projective features. Eq. (18) shows that the fundamental projective features are invariants of weight k.
As an example we consider the bivariate case (i.e., k = 2). Let X ∈ F_{n,2} be an EEG/MEG 2-frame. For clarity, we use a slightly different notation and denote the columns of X by x ∈ R^n and y ∈ R^n. So x and y are observed EEG/MEG signals from two brain locations. The image of (x, y) under the Plücker embedding is x ∧ y, with coordinates (x ∧ y)_{i,j} = x_i y_j − y_i x_j, which are invariant features of weight 2. The product x_i y_j can be thought of as the instantaneous linear interaction between the activity in the first location at time-point i and the activity in the second location at time-point j. Because y_i x_j is obtained from x_i y_j by interchanging the time-points i and j, x_i y_j − y_i x_j measures the lack of temporal reversibility in the interaction. Thus, a non-zero value of x_i y_j − y_i x_j reflects temporal irreversibility of the interaction at time-points i and j. It is closely related to the time-resolved connectivity measure proposed in [35].
As an illustration, consider the following simple model. Suppose x and y are harmonic oscillations of the same frequency ω rad/s, sampled with N samples per second: x_i = sin(ωi/N + ϕ_1) and y_i = sin(ωi/N + ϕ_2) for i = 0, 1, 2, . . ., n, where ϕ_1 and ϕ_2 are phases. The Plücker embedding of (x, y) then has the following coordinate at (i, i + m): (x ∧ y)_{i,i+m} = sin(ϕ_1 − ϕ_2) sin(ωm/N). Note that this coordinate does not depend on i, but only on the lag m = j − i, and that it vanishes if ϕ_1 = ϕ_2 (mod π). So if x and y are in-phase or in anti-phase, the image vanishes, and hence it indeed measures the lack of temporal reversibility in the interaction at lag m.

We now consider the trivariate case (i.e., k = 3). Let x, y, z ∈ R^n be the observed signals from three different locations within the brain. The image of (x, y, z) under the Plücker embedding is the alternating 3-tensor with coordinates ∧^3(x, y, z)_{i,j,k} = x_i y_j z_k − x_j y_i z_k + x_j y_k z_i − x_k y_j z_i + x_k y_i z_j − x_i y_k z_j, which is an invariant of weight 3. Observe that it involves three time-points i, j, and k and that it is constructed by summing triple instantaneous products of the signals at time-points i, j, and k over all six permutations of these three time-points, taking into account the signs of the permutations. For example, the term −x_j y_i z_k is obtained from x_i y_j z_k (the original ordering of the signals) by interchanging time-points i and j, which is a permutation with negative sign. This particular form is what makes it invariant under mixing. A non-zero value of ∧^3(x, y, z)_{i,j,k} reflects the lack of symmetry of the instantaneous linear interaction at time-points i, j, and k under permutation of these time-points and hence is a measure of temporal asymmetry.
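The harmonic model earlier in this section can be verified numerically (a sketch; the frequency, phases, sampling rate, and lag are arbitrary illustrative choices):

```python
import numpy as np

# Parameters of the harmonic model (arbitrary illustrative choices).
omega, N, n = 2 * np.pi * 5, 250, 1000      # 5 Hz oscillation, 250 samples/s
phi1, phi2 = 0.3, 1.1

i = np.arange(n)
x = np.sin(omega * i / N + phi1)
y = np.sin(omega * i / N + phi2)

m = 7                                        # lag in samples
wedge = x[:-m] * y[m:] - y[:-m] * x[m:]      # coordinates (x ∧ y)_{i, i+m}

# The coordinate depends only on the lag m, not on i:
assert np.allclose(wedge, wedge[0])
# It matches the closed form sin(phi1 - phi2) * sin(omega * m / N):
assert np.allclose(wedge, np.sin(phi1 - phi2) * np.sin(omega * m / N))

# In-phase signals give a vanishing wedge:
z = np.sin(omega * i / N + phi1)
assert np.allclose(x[:-m] * z[m:] - z[:-m] * x[m:], 0.0)
```

The three assertions confirm the stated properties: independence of i, the closed-form lag dependence, and vanishing for in-phase signals.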
We now consider the general case: the fundamental projective invariants involving k signals x^1, . . ., x^k have similar properties. The (i_1, . . ., i_k)-th entry of ∧^k(x^1, . . ., x^k) can be written as ∧^k(x^1, . . ., x^k)_{i_1,...,i_k} = det(X_{i_1,...,i_k}), (20) where X_{i_1,...,i_k} is the k × k submatrix of X obtained by selecting rows i_1, . . ., i_k [36]. The entries of ∧^k(x^1, . . ., x^k) are (time-resolved) k-th order connectivity measures. A non-zero value of Eq. (20) reflects the lack of symmetry of the instantaneous linear interaction at time-points i_1, . . ., i_k under permutation of these time-points and hence is a measure of temporal asymmetry. In particular, the fundamental projective invariants for k = 4 can be viewed as invariant versions of the recently proposed time-resolved edge connectivity measures in fMRI research [37, 38, 39, 40, 41].
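Eq. (20) also gives a direct way to compute the fundamental projective features: each entry is the determinant of a k × k submatrix of X, and mixing by A multiplies every entry by det(A). A sketch (dimensions arbitrary):

```python
import numpy as np
from itertools import combinations

def plucker(X):
    """All entries of the wedge product of the columns of X,
    indexed by increasing tuples (i_1 < ... < i_k) of row indices."""
    n, k = X.shape
    return np.array([np.linalg.det(X[list(rows), :])
                     for rows in combinations(range(n), k)])

rng = np.random.default_rng(2)
n, k = 8, 3
X = rng.standard_normal((n, k))
A = rng.standard_normal((k, k))      # almost surely invertible mixing matrix

p = plucker(X)
p_mixed = plucker(X @ A)

# Mixing scales every coordinate by det(A): an invariant of weight k.
assert np.allclose(p_mixed, np.linalg.det(A) * p)
```

Because det((XA)_{i_1,...,i_k}) = det(X_{i_1,...,i_k}) det(A), all (n choose k) coordinates pick up the same factor, which is the weight-k relative invariance stated in Eq. (18).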
In Section 5.2 we show how the fundamental projective invariants can be combined to build time-averaged higher-order connectivity measures that are invariant to linear and instantaneous mixing.
The obtained measures can be used to study static higher-order brain interactions using EEG/MEG.

Construction of mixing-invariant EEG/MEG features
The fundamental projective features of an EEG/MEG data matrix containing two signals are instantaneous bivariate connectivity measures and are hence suitable for time-resolved connectivity analyses.
They are also relative invariants and hence can be used for hypothesis testing, but not for quantifying connectivity strength. In Section 5.1 we construct a bivariate connectivity measure of weight zero that can be applied to stationary signals. It is obtained by appropriate averaging and normalization of the fundamental projective features. In Section 5.2 we do this for more than two signals, thereby obtaining a higher-order connectivity measure of weight zero that can be applied to stationary signals.

Bivariate connectivity measures
Let x, y ∈ R^n be observed EEG/MEG signals from two brain locations. The fundamental projective features of the pair (x, y) are the entries of the anti-symmetric n × n matrix ∧^2(x, y) = xy^T − yx^T. For a given lag m > 0, typically much smaller than n, we construct a static connectivity measure f_m by averaging the entries ∧^2(x, y)_{i,i+m} over time-points i: f_m(x, y) = (1/(n − m)) Σ_{i=1}^{n−m} (x_i y_{i+m} − y_i x_{i+m}). It measures the temporal irreversibility of the linear interaction between the signals x and y at lag m and can be used as a test-statistic for testing the null-hypothesis of temporal reversibility at lag m. It is, however, a relative invariant (it has weight 2) and therefore cannot be used to quantify the extent to which the signals are irreversible. But by dividing it by an invariant feature of the same weight, an absolute invariant connectivity measure is obtained, as described below.
Let Σ be the 2 × 2 covariance matrix of the pair (x, y) and observe that det(Σ) = σ_x^2 σ_y^2 − γ_0(x, y)^2 is an invariant of weight 4, so that ξ_m(x, y) = f_m(x, y)/√det(Σ) (23) is an invariant connectivity measure of weight zero. Note that the measure in Eq. (23) is only well-defined if Σ is invertible. We can further combine the measures ξ_m(x, y) by averaging over lags to obtain the following connectivity measure: ξ(x, y) = (1/M) Σ_{m=1}^{M} ξ_m(x, y), where M is the maximum lag to be taken into account. An appropriate value for M can be based on the characteristic time-scale of the cross-covariance function γ_m(x, y) of (x, y).
The above measure can be expressed in terms of the cross-correlation function of the pair (x, y) as follows. Recall that the cross-covariance function γ_m(x, y) of the pair (x, y) is defined by γ_m(x, y) = Cov(x_i, y_{i+m}) for m ≥ 0 and γ_m(x, y) = γ_{−m}(y, x) for m < 0. This implies f_m(x, y) = γ_m(x, y) − γ_{−m}(x, y) and hence ξ(x, y) = (1/M) Σ_{m=1}^{M} (r_m(x, y) − r_{−m}(x, y))/√(1 − r^2), (27) where r_m(x, y) = γ_m(x, y)/(σ_x σ_y) is the cross-correlation function of (x, y) at lag m and r = γ_0(x, y)/(σ_x σ_y) is the instantaneous correlation between x and y. Eq. (27) shows that ξ(x, y) is a measure of the total asymmetry of the cross-correlation function of (x, y). In other words, ξ(x, y) is a measure of the temporal irreversibility of the linear interactions between x and y.
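The construction of ξ(x, y) can be sketched as follows (assumptions: cross-covariances are estimated empirically from demeaned signals, and the simulated lagged coupling is purely illustrative):

```python
import numpy as np

def cross_cov(x, y, m):
    """Empirical cross-covariance gamma_m(x, y) at lag m >= 0."""
    x, y = x - x.mean(), y - y.mean()
    n = len(x)
    return np.dot(x[:n - m], y[m:]) / (n - m)

def irreversibility(x, y, M):
    """xi(x, y): average over lags m = 1..M of (r_m - r_{-m}) / sqrt(1 - r^2)."""
    sx, sy = x.std(), y.std()
    r0 = cross_cov(x, y, 0) / (sx * sy)
    terms = [(cross_cov(x, y, m) - cross_cov(y, x, m)) / (sx * sy)
             for m in range(1, M + 1)]
    return np.mean(terms) / np.sqrt(1 - r0 ** 2)

rng = np.random.default_rng(3)
n = 5000
x = rng.standard_normal(n)
y = np.roll(x, 5) + 0.5 * rng.standard_normal(n)   # y lags x: irreversible coupling

xi = irreversibility(x, y, M=10)
assert xi > 0.05                                    # the lagged coupling is detected
# Antisymmetry under swapping the signals:
assert np.isclose(irreversibility(y, x, 10), -xi)
# Invariance under mixing with positive determinant (det = 2*1.5 - 0.3*0.1 > 0):
x2, y2 = 2.0 * x + 0.3 * y, 0.1 * x + 1.5 * y
assert np.isclose(irreversibility(x2, y2, 10), xi)
```

The last assertion illustrates the weight-zero property: both the lag-asymmetry numerator and the √det(Σ) denominator scale by det(A) under mixing, so ξ is unchanged (up to a sign when det(A) < 0).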

Third-order connectivity measures
In this section we consider the case of three signals. Let x, y, z ∈ R^n be signals recorded from three locations within the brain. The fundamental projective features of the triplet (x, y, z) are the entries of the alternating 3-tensor ∧^3(x, y, z). For given lags (m_1, m_2) with 0 ≤ m_1, m_2 ≤ n − m, where m = max{m_1, m_2}, we average the entries ∧^3(x, y, z)_{i,i+m_1,i+m_2} over time-points i to obtain the following static third-order connectivity measure: f_{m_1,m_2}(x, y, z) = (1/(n − m)) Σ_{i=1}^{n−m} ∧^3(x, y, z)_{i,i+m_1,i+m_2}. It measures the lack of symmetry of the linear interaction between the three signals at lags m_1 and m_2 under permutation of these lags and hence is a measure of temporal asymmetry. As in the bivariate case, this connectivity measure is a relative invariant (it has weight 3) but can be normalized to obtain an absolute invariant connectivity measure: ξ_{m_1,m_2}(x, y, z) = f_{m_1,m_2}(x, y, z)/√det(Σ), where Σ = X^T X is the covariance matrix of the n × 3 EEG/MEG data matrix X = [x y z] and det(Σ) is an invariant of weight six. Note that this measure is only defined if Σ is invertible. We can further combine the different lags by averaging to obtain the following connectivity measure: ξ(x, y, z) = (1/M^2) Σ_{m_1=1}^{M} Σ_{m_2=1}^{M} ξ_{m_1,m_2}(x, y, z), where M is the maximum lag to be taken into account. An appropriate value for M can be based on the characteristic time-scales of the covariance function γ_{m_1,m_2}(x, y, z).
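A minimal sketch of the third-order measure (assuming the minors-of-submatrices form of ∧^3 and the normalization by √det(Σ) described above; the data and mixing matrix are illustrative):

```python
import numpy as np

def third_order(x, y, z, m1, m2):
    """Normalized third-order measure xi_{m1,m2}(x, y, z)."""
    n, m = len(x), max(m1, m2)
    X = np.column_stack([x, y, z])
    # f_{m1,m2}: average of the 3x3 minors det X_{i, i+m1, i+m2} over i.
    f = np.mean([np.linalg.det(X[[i, i + m1, i + m2], :])
                 for i in range(n - m)])
    # Normalize by sqrt(det Sigma), an invariant of weight 3.
    return f / np.sqrt(np.linalg.det(X.T @ X))

rng = np.random.default_rng(4)
n = 400
x, y, z = rng.standard_normal((3, n))

val = third_order(x, y, z, 2, 5)
# Invariance under mixing with det(A) > 0:
A = np.array([[1.0, 0.2, 0.1],
              [0.0, 1.5, 0.3],
              [0.1, 0.0, 2.0]])
xm, ym, zm = (np.column_stack([x, y, z]) @ A).T
assert np.isclose(third_order(xm, ym, zm, 2, 5), val, atol=1e-12)
```

As in the bivariate case, both the averaged minors and √det(Σ) scale by det(A) under mixing, so the normalized measure is unchanged for orientation-preserving mixing.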

Multidimensional connectivity measures
In this section we describe how to construct invariant multidimensional connectivity measures between two data frames X_1 ∈ F_{n,k_1} and X_2 ∈ F_{n,k_2} obtained from two brain regions comprising k_1 and k_2 voxels. Invariance in this context pertains to mixing within each of the two regions (i.e., intra-regional mixing). Thus for a connectivity measure f(X_1, X_2) we require that f(X_1 A_1, X_2 A_2) = f(X_1, X_2) for all invertible matrices A_1 and A_2, which describe the mixing within the regions. To avoid introducing Cartesian products of Grassmann manifolds of different dimensions, we assume that the regions have the same number of voxels (i.e., k_1 = k_2 = k), but the formulas apply to the general situation as well.
Features that are invariant under intra-regional mixing characterize how pairs of k-planes are embedded in R^n. Such features can be divided into two classes, namely those that depend only on the relative positions of the planes and those that also depend on their absolute positions. We focus on features that characterize relative positions. They correspond to symmetric (i.e., permutation-invariant) functions of the canonical angles between the k-planes V_1 = [X_1] and V_2 = [X_2] (see Section 2.2) or, equivalently, to symmetric functions of the singular values of the spatial cross-covariance matrix Σ = Y_1^T Y_2 between the whitened frames Y_1 and Y_2. Note that such features are also invariant under reordering of the two regions (i.e., they are undirected).
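A sketch of how such features can be computed (whitening each frame by QR orthonormalization, one of several valid choices; the singular values of Y_1^T Y_2 are the cosines of the canonical angles):

```python
import numpy as np

def canonical_cosines(X1, X2):
    """Cosines of the canonical angles between the column spans of X1 and X2."""
    Y1, _ = np.linalg.qr(X1)            # whitened (orthonormal) frames
    Y2, _ = np.linalg.qr(X2)
    return np.linalg.svd(Y1.T @ Y2, compute_uv=False)

rng = np.random.default_rng(5)
n, k = 60, 4
X1 = rng.standard_normal((n, k))        # region 1, k voxels
X2 = rng.standard_normal((n, k))        # region 2, k voxels

c = canonical_cosines(X1, X2)
A1 = rng.standard_normal((k, k))        # intra-regional mixing matrices
A2 = rng.standard_normal((k, k))

# Invariance under intra-regional mixing and under reordering of the regions:
assert np.allclose(canonical_cosines(X1 @ A1, X2 @ A2), c)
assert np.allclose(canonical_cosines(X2, X1), c)

# Any symmetric function of c is an invariant multidimensional measure,
# e.g. the mean squared canonical cosine (a hypothetical illustrative choice):
conn = np.mean(c ** 2)
assert 0.0 <= conn <= 1.0
```

Mixing replaces each orthonormal frame Y_i by Y_i O_i for some orthogonal O_i, which leaves the singular values of Y_1^T Y_2 unchanged; hence any symmetric function of them is invariant.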
Therefore, any such function f that satisfies E[f(0)] = 0 is a functional connectivity measure between the multivariate activity in the two brain regions. This type of connectivity is referred to as multidimensional [17, 22, 23]. There exist infinitely many invariant multidimensional connectivity measures.
A large body of literature has shown that EEG monitoring allows early and reliable prediction of good or poor recovery of comatose patients after cardiac arrest [45, 46, 47, 48]. We included EEG data from one hundred patients, as described in our previous work [49]. All patients underwent continuous EEG monitoring (standard 10-20 system) as part of routine clinical care. Details of data acquisition and inclusion criteria are reported in [49]. In brief, we included fifty patients with a poor neurological outcome and fifty patients with a good neurological outcome, defined by the cerebral performance category (CPC) score assessed six months after cardiac arrest. The CPC score quantifies the neurological outcome on a 5-point scale. A CPC of 1 (no cerebral damage) or 2 (moderate disability) is generally considered a favorable outcome, while scores of 3 (severe disability), 4 (minimally conscious state or unresponsive wakefulness syndrome), or 5 (death) reflect a poor neurological outcome.
We selected five minutes of EEG data around 48 hours after cardiac arrest in patients who had developed a discontinuous or continuous EEG at this interval. Artefacts were excluded using an automated custom algorithm that rejected flat channels, sensor noise, and muscle artefacts [50].
In addition, we used independent component analysis (ICA) to identify electrocardiogram artefacts [51]. Every ICA component with a strong correlation with the electrocardiogram signal (defined as a Pearson correlation greater than 0.6) was rejected. This resulted in 0-2 rejected ICA components per subject. Artefact rejection was followed by source reconstruction of the EEG data using exact low-resolution brain electromagnetic tomography (eLORETA) [52] as implemented in FieldTrip [53]. A template T1-weighted image was used to compute a boundary element method (BEM) head model for every subject [54]. Source-reconstructed data were parcellated using the automated anatomical labelling (AAL) atlas [55], and time-series within a parcel were averaged to obtain a single time-series per parcel. Only cortical parcels were included for further analysis [56], resulting in 78 virtual electrodes.
Source-reconstructed data were band-pass filtered using a finite impulse response (FIR) filter into the canonical frequency bands: delta (1-4 Hz), theta (4-8 Hz), and alpha (8-13 Hz). For every frequency band, we computed phase synchrony using an existing measure that is invariant to signal leakage, namely the phase lag index (PLI) [14]. In addition, we applied the temporal irreversibility metric (Eq. (27), with a maximal lag of one second) to the band-pass filtered signals recorded from the virtual electrodes.
Statistical testing between groups was performed using t-tests, with correction for multiple comparisons by the false discovery rate [57].
Figures 8A and 8B show the group-averaged temporal irreversibility matrices and corresponding connectivity maps in the delta band for patients with good and poor outcomes, respectively. In the poor outcome group, there is overall stronger temporal irreversibility in the delta band across many brain areas. This is confirmed by formal statistical testing, as shown in Figure 8C, with group differences in the strength of temporal irreversibility especially between left and right temporal and motor areas. These statistically significant group differences were more pronounced than for the phase lag index (PLI) on the same data (Figure 8D). In fact, the regions for which the PLI showed significant effects are a subset of the regions that differ significantly for temporal irreversibility. This suggests that our invariant temporal irreversibility measure is more sensitive in detecting disease-induced effects. There were no significant differences in temporal irreversibility for the other frequency bands.

Conclusions and discussion
In this study we have described two sets of fundamental features of EEG/MEG data matrices that are invariant under linear and instantaneous mixing and from which all other invariant features can be constructed by combination, namely the affine and the projective features. These two sets of features can hence be viewed as building blocks of invariant features. We have illustrated their usefulness by constructing several new invariant features of EEG/MEG data matrices, including bivariate, higher-order, and multidimensional connectivity measures. They were derived by exploiting the correspondence between invariant features and functions on the Grassmann manifold and subsequently using the topological and geometric properties of the Grassmann manifold. The main advantage of invariant features over general (non-invariant) features is that they allow for a functional interpretation because they are not subject to signal leakage. We expect that our results will be useful in general analyses of EEG/MEG data, and in classification and prediction problems using machine learning in particular. For instance, the fundamental features can be used as features in classification or prediction tasks.
In our study, invariance of the derived features was guaranteed by working directly with planes (i.e., points on the Grassmann manifold) instead of with frames. This approach can be generalized to other analyses of EEG/MEG data and hence allows designing signal processing pipelines whose results are invariant by construction. For example, averaging of EEG/MEG data matrices (over subjects, conditions, etc.) can be done on the Grassmann manifold [58], which ensures that the average matrices have full column-rank. This approach is in the same spirit as using the manifold of symmetric positive-definite (SPD) matrices for averaging covariance matrices, which preserves positive-definiteness [59].
The manifold of SPD matrices has been proposed as a general framework for carrying out functional connectivity analyses of fMRI data [60, 61, 59]. Other methods that are available for Grassmann-valued data are linear regression, clustering, and kernel methods [62, 58, 63].
In discussing multidimensional connectivity measures, we assumed both regions to have the same number of voxels. This assumption, however, is not essential, and the results are applicable to regions with different numbers of voxels. We made this assumption to keep the mathematical formalism to a minimum: if the regions have the same number of voxels, the corresponding data frames correspond to points on the Grassmann manifold, whereas if they have different numbers of voxels, we would have to discuss products of Grassmann manifolds of different dimensions, which requires additional concepts from algebraic geometry. A general theory for products of Grassmann manifolds is worked out in [43]. Since the theory applies to products of arbitrary numbers of Grassmann manifolds, it can be used to generalize multidimensional connectivity measures to more than two brain regions with possibly different numbers of voxels. This would lead to connectivity measures that are both multidimensional and higher-order.
In discussing multidimensional connectivity measures, the data frames were thought of as corresponding to different brain regions. However, they can also be thought of as corresponding to different trials, subjects, groups, or conditions. In those applications, it is evident that there is no mixing between the frames, since they are not obtained from simultaneous recordings of single subjects. If the frames are obtained from simultaneous recordings of single subjects, there might be mixing between the frames, depending on the exact brain regions. If the reconstructed source signals are obtained from a linear inverse operator, such as the minimum norm solution [10] or a linearly constrained minimum variance beamformer [26, 27], the mixing of sources is described by the resolution operator, which is the concatenation of the forward and inverse operators [28, 29]. Inspection of the resolution operator then allows assessing the severity of mixing between the regions [23]. Although this does not solve the mixing problem, analyses can at least be restricted to regions with minimal mixing.
We have applied one of the Grassmann-derived invariant measures, the temporal irreversibility metric, to EEG data of comatose survivors of cardiac arrest. We demonstrated that this metric is more sensitive in identifying affected brain regions between patients with a good and poor outcome than a conventional invariant phase synchrony measure, the phase lag index [64]. Remarkably, the regions identified with the phase lag index form a subset of the regions identified with our metric. The usage of asymmetry in connectivity may not only be justified by its insensitivity to signal leakage; it may also be more sensitive to long-distance communication between neuronal populations. Genuine connectivity between populations is associated with a conduction delay and a presumed preferred direction of connectivity. The temporal irreversibility metric may be more sensitive to this characteristic than conventional phase synchrony, which does not distinguish consistent intervals of phase lag versus phase lead. We note that the temporal irreversibility metric is strongly related to the irreversibility metric of the so-called "inside out framework" introduced by Deco and coworkers [65]. In line with previous work [66], we have demonstrated that at 48 hours after cardiac arrest there were only significant differences in temporal irreversibility in the delta band between patients with a good and poor neurological outcome. Presumably, an increased functional connectivity of delta oscillations at this time point after arrest is associated with poor neurological recovery. Further clinical research in larger clinical cohorts is warranted to study the additional value of irreversibility in the delta band alongside existing prognostic EEG features in comatose patients after cardiac arrest.

Whitening. A whitening transformation of a data frame X ∈ F_{n,k} is a map ϕ from the non-compact Stiefel manifold to the compact Stiefel manifold of the form ϕ(X) = XW for some invertible k × k matrix W, and which restricts to the identity map on F^o_{n,k}. The transformation W will depend on X, but we suppress this from the notation. In the context of EEG/MEG signals, whitening decorrelates the signals in X and standardizes their variances to 1. A whitening transformation always exists and can be constructed, for instance, by taking a square root of the spatial covariance matrix X^T X of X.

Group. A group is a set G together with a product operation. The product operation G × G → G maps a pair (a, b) of group elements to another group element ab, called the product of a and b. The product is assumed to have the following three properties: (i) (ab)c = a(bc); (ii) there is an element e ∈ G such that ea = ae = a for all a ∈ G (identity element); and (iii) for every a ∈ G there is an a′ ∈ G such that aa′ = a′a = e. The element a′ is called the inverse of a and is denoted by a^{−1}. Examples: The group R∖{0} consists of the non-zero real numbers with ordinary multiplication (a, b) → ab (see Section 2.1); the identity is 1. Another example is the group of invertible k × k matrices with real entries, together with matrix multiplication (A, B) → AB. This group is called the general linear group and is denoted by GL_k. Yet another example is the set of k × k orthogonal matrices with real entries, again with matrix multiplication. It is called the orthogonal group and we denote it by O_k.

Figure 3 :
Figure 3: Canonical angles. Shown are two 2-dimensional subspaces V_1 and V_2 of R^3 (red and blue rectangles) together with their canonical angles θ_1 and θ_2. The second canonical angle is the angle between the vectors p_2 and q_2 and equals zero, because p_2 = q_2. This reflects the fact that V_1 and V_2 intersect in a line.

Figure 4 :
Figure 4: Manifold and local coordinates. Schematic illustration of an n-dimensional manifold M, together with a point p ∈ M and a surrounding neighborhood U ⊆ M. The function ϕ (red arrow) maps U to a subset V of n-dimensional Euclidean space R^n. The inverse function ϕ^{−1} : V → U is a local parametrization of M. Every point on the manifold has such a local parametrization, and these endow the manifold with local coordinates.

Figure 5 :Figure 6 :
Figure 5: Projectivation of R^3. Shown is the unit 2-sphere S^2, together with a line through the origin of R^3 and the intersection of the line with a pair of antipodal points on S^2 (red and blue dots). Points on the projectivation P(R^3) of R^3 correspond to pairs of antipodal points.

Figure 7 :
Figure 7: Embedding of the Grassmann manifold into a vector space. Shown is the Grassmann manifold G_{n,k} of k-planes in R^n (curved surface) embedded into a vector space. Any point [X] on the Grassmann manifold (black dot) can be identified with a unique vector in the vector space (red arrow), and the coordinates of the vector can therefore be used as coordinates of [X]. In particular, any function on the Grassmann manifold (i.e., any invariant feature) can be expressed in terms of these coordinates. In this sense, these coordinates constitute building blocks for invariant features.

Figure 8 :
Figure 8: Temporal irreversibility (Eq. (27)) applied to EEG data of comatose survivors of cardiac arrest. Panels A and B show the average temporal irreversibility matrix across patients in each group. The column average of each matrix is plotted next to it as a brain plot. There is much stronger temporal irreversibility in patients with a poor outcome. Panels C and D show the outcome of statistical testing for temporal irreversibility and the phase lag index, respectively. Colours represent t-values for regions showing significant differences between groups (p < 0.05, FDR corrected).
Informally, a feature of an n × k EEG data frame X ∈ F_{n,k} is any property of X that can be expressed by a single real number. Recall that the columns of X are the observed EEG signals from k different brain regions. Examples of features of individual signals are amplitude, power, peak frequency, entropy, and LZ complexity, among many others. Examples of features that depend on multiple signals of X are the Kuramoto order parameter, multivariate information measures, and all functional connectivity measures. Formally, a feature of an n