Abstract
Developing sensitive and reliable methods to distinguish normal and abnormal brain states is a key neuroscientific challenge. Topological Data Analysis, despite its relative novelty, already generated many promising applications, including in neuroscience. We conjecture its prominent tool of persistent homology may benefit from going beyond analysing structural and functional connectivity to effective connectivity graphs capturing the direct causal interactions or information flows. Therefore, we assess the potential of persistent homology to directed brain network analysis by testing its discriminatory power in two enigmatic examples of disease-related brain connectivity alterations: epilepsy and schizophrenia. We estimate connectivity from functional magnetic resonance imaging and electrophysiology data, employ Persistent Homology and quantify its ability to distinguish healthy from diseased brain states by applying a support vector machine to features quantifying persistent homology structure.
We show how this novel approach compares to classification using standard undirected approaches and original connectivity matrices. In the schizophrenia classification, topological data analysis generally performs close to random, while classifications from raw connectivity perform substantially better; likely due to topographical, rather than topological, specificity of the differences. In seizure discrimination from scalp electroencephalography data, classification based on directed persistent homology features provided comparable results to other methods, reaching 89 percent accuracy. Specific niche for topological data analysis opens when direct comparison of connectivity matrices is unsuitable - such as for intracranial electrophysiology with individual number and location of measurements. While standard homology performed overall better than directed homology, this could be due to notorious technical problems of accurate effective connectivity estimation.
1. Introduction
The general idea that mathematical approaches from geometry and algebraic topology can provide powerful tools and methods for the understanding of complex data is fairly recent (see, e.g., (Frosini, 1990; Edelsbrunner et al., 2002; Zomorodian and Carlsson, 2005; Carlsson, 2009)) and has lead to the birth and advancement of a completely new research branch, Topological Data Analysis (TDA). This is a fast growing and promising field at the intersection of Data Analysis, Algebraic Topology, Computational Geometry, Machine Learning and Statistics. Its main advantage is the possibility to analyse data by looking at its shape and underlying geometric structures arising from high-dimensional relations; this goal is achieved by providing a strong mathematical framework and manageable quantitative tools such as homology theory (Munkres, 1984; Hatcher, 2000). The structure of the data is then qualitatively and quantitatively assessed in the form of topological features (e.g. connected components, voids, tunnels or loops). This approach has been already widely applied and finds nowa-days applications in many science domains dealing with various complex systems (Wasserman, 2018; Chazal and Michel, 2017).
The human brain and its dynamical activity represents one of the archetypal examples of complex systems. Indeed, due to both the intricate nature of the structure of connections of its constituent units, as well as the highly nonlinear and potentially chaotic dynamics within each of these units, it is rightly considered a suitable target example system for application of such complex system methodology. In particular, the study of the complex patterns of brain network topology has become a flourishing areas of research since the 2000’s (Bullmore and Sporns, 2009).
There are three main approaches to assess the interactions between brain regions: structural, functional, and effective connectivity (see Friston (1994, 2011); Park and Friston (2013)). Structural connectivity is the pattern of physical connections between the considered brain regions, while functional and effective are based on neuronal activity signals. Functional connectivity is defined as the stochastic dependence between remote neurophysiological events and provides thus a non-directional (symmetric) measure of brain interactions. On the contrary, the effective connectivity is defined as the direct influences that one region exerts over another.
The most frequently used measure of functional connectivity is the (linear) Pearson correlation, which provides an accurate functional connectivity estimate for functional magnetic resonance imaging data under common processing settings (Hlinka et al., 2011), (Hartman et al., 2011). One of the applications in clinical neuroscience is the characterization of functional brain alterations in various diseases. The classification power of FC has been reported for schizophrenia (Lynall et al., 2010), multiple sclerosis (Zurita et al.), epilepsy (Liao et al., 2010), and many others brain diseases and states. On the other hand, effective connectivity, while in principle more informative with respect to the underlying dynamics since it specifies the direct influences among the brain regions, is more scarcely applied in practice, presumably since it is also much harder to estimate and the methods thus typically can handle only smaller networks (Smith, 2012). While there are many competing methods for estimation of effective connectivity offering different trade-off between advantages and drawbacks (Valdes-Sosa et al., 2011), there is a general agreement that if estimated well, they would provide valuable extra information on top of functional connectivity (Smith, 2012).
While a connectivity matrix characterizes only the strengths of all pair-wise interactions, more complex characterizations of the whole network as well as specific nodes thereof can be provided. This can be done by application of graph-theoretical tools to the graph constructed by using the connectivity matrix as an adjacency matrix (Telesford et al., 2011). However, the graph-theoretical analysis of brain connectivity suffers from multiple technical difficulties. One of them is that the classical approach requires applying a threshold to select which connections are relevant, i.e., strong enough to constitute an edge in the graph. There are multiple heuristics to deal with the threshold selection or even generalization of many graph metrics to weighted graphs, however no single optimal solution is available. A very attractive way to side-step the problem and consider all possible thresholds at once, without losing any information, is thus offered by already established methods of TDA, such as Persistent Homology.
Persistent Homology (PH) is one of the main tools adopted in TDA and is a multi-scale adaptation of the classical homology theories studied in Algebraic Topology. It allows a computation of the persistent topological features of a space at all different resolutions (here: densities of the thresholded connectivity graph), while revealing the most essential ones. In the PH-approach to complex networks, instead of focusing on a single thresholded graph, one analyses a family of simplicial complexes indexed by a (filtration) parameter, by considering all the possible thresholds at once.
Based on a robust mathematical theory and being stable with respect to small noise perturbations (Cohen-Steiner et al., 2007), PH has provided a novel qualitative and quantitative tool to study complex data (in the form of either point clouds, time series or connectivity networks). Its use has already been reported in many different fields: in neuroscience and neurological disorders (Lee et al., 2011), in endoscopy analysis (Dunaeva et al., 2016), angiography (Bendich et al., 2014), pulmonary diseases (Brodzki et al., 2018), finance (Gidea, 2017), fingerprint classification (Giansiracusa et al., 2019), image classification (Dey et al., 2017), to name few applications. The pipeline involving Persistent Homology in network analysis usually starts by considering (undirected, weighted) graphs as input. However, in many real world examples as in the study of effective connectivity, the input graphs are also equipped with a richer structure on the edges, namely the direction from the source node to the target. As directed graphs theoretically hold more information than their undirected counterparts (see e.g., Sec. 3.4), an extension of the PH pipeline to asymmetric networks might provide a more refined family of invariants. Some novel approaches have already been proposed in this direction by Masulli and Villa (2016); Reimann et al. (2017); Chowd-hury and Mémoli (2018); Turner (2019); Aktas et al. (2019); Lütgehetmann et al. (2020), although not yet widely adopted; notably the early applications include the analysis of brain structural connectivity networks (Reimann et al., 2017).
The developments in the fields of brain connectivity analysis and TDA call for their combined utilization. Many attempts have been done in this direction, for reviews see (Solo et al., 2018; Phinyomark et al., 2017; Lee, 2019). In this work, with the aim of investigating the promise of applications of topological analysis to effective connectivity networks, we provide a pioneering example of such endeavour for the task of classifying altered brain states in several settings, while we assess the improvement provided by these methodologies in contrast to dealing with the connectivity directly, or applying the TDA approach to functional connectivity data only.
To this end, we construct functional and effective connectivity matrices (in short, FC and EC) derived from brain activity time series; these are obtained by two most common functional neuroimaging modalities – (intracranial) electroencephalography ((i)EEG) and functional magnetic resonance imaging (fMRI) – using two prominent examples of disconnectivity-related brain diseases: epilepsy and schizophrenia. Following previous approaches Lee et al. (2011, 2012); Merelli et al. (2016); Stolz et al. (2018), we calculate the persistent homology of the (directed) filtrations associated to the brain functional (or effective) connectivity networks, and compute the associated PH-features (further referred to as directed persistent homology). In order to quantify their informative power, we apply a state-of-art classifier (a support vector machine, SVM), with the task to distinguish between healthy and diseased brain states. We use a range of most common quantitative PH features, namely Persistence Landscapes (PL) Bubenik and Dłotko (2017), Persistence Images (PI) Adams et al. (2017), Persistent Entropy (PE) Rucco et al. (2014) and Carlsson Coordinates (CC) Adcock et al. (2013). We compare the results with those obtained, under the same tasks, by a naive (SVM) classification applied to the FC and EC matrices directly.
The directed homology pipeline followed in this work was introduced in (Reimann et al., 2017) with applications in the study of the structural and functional connectivity in rat brain slices. This was also the first, and the only one known to the authors, application of this approach neuroscience.
Note that while the application of directed persistent homology analysis to human brain effective connectivity is novel, it builds upon the tools and ideas developed and applied recently in the field. For example, the undirected variants of the TDA features have been shown to be promising in applications to Alzheimer (Kuang et al., 2019b; Pachauri et al., 2011), ASD (Wong et al., 2016), depression (Khalid et al., 2014), Parkinson (Garg et al., 2016), or in the study of childhood stress (Chung et al., 2014). The first work, applying PH-methods to brain networks constructed by using Pearson correlation is due to Lee (see (Lee et al., 2011, 2012)) in the case of patients affected by ADHD. Among previous approaches in epilepsy, a persistent homology analysis has been already reported in (Merelli et al., 2016), where (weighted) PE is employed in order to detect the transition between the preictal and ictal states, or in (Piangerelli et al., 2018; Wang et al., 2015), where a classification analysis in epilepsy, based on PLs and PE, has used different filtrations (sublevel-based or lower star filtrations). In (Choi et al., 2014), a study of the epileptic abnormality is given in rat models. Within schizophrenia applications, the only analysis applying methods from TDA is, to the authors knowledge, the work (Stolz et al., 2018). To provide more insight into the promises and pitfalls of the application of directed persistent homology analysis of brain (effective) connectivity, the current study goes beyond single method application, and provides comparison with the application of (standard) persistent homology of undirected networks, as well as application of machine learning directly to the input connectivity matrices. This, together with tackling three types of neuroimaging data across two brain different dysconnectivity conditions, should provide a richer picture concerning the challenges of TDA application to brain connectivity data.
Our key observation is that standard TDA applied to effective connectivity is in principle able to discriminate diseased from healthy brain states, comparably to (previously used combinations of) TDA applied to functional connectivity. However, in many practical situations such as we demonstrate in the results section for the application to schizophrenia classification, the use of raw functional connectivity features is advantageous. This may be due to topographical, rather than topological, specificity of the particular changes related to schizophrenia – these topographically (regionally/spatially) specific changes are naturally picked up by classifiers using the raw functional connectivity matrices; while the topological methods effectively ignore the spatial information, as they are invariant with respect to permutation of nodes – a fact that may be commonly overlooked when considering the applications. However, in yet another scenarios, the direct connectivity comparison is not straightforwardly applicable, such as for inter-subject analysis of intracranial electrophysiology recordings in epilepsy, not only due to potentially different spatial location of the brain alterations, but also due to the technical challenge of inter-individual differences in the number and location of the measurement sites.
With the idea that the topological pipeline may suffer from heterogeneous subject biases, a probable cause of lower general performances, we have also introduced an alternative methodological pipeline reducing the underlying heterogeneity. We observe that this variation of the pipeline shows a remarkable increment of the performances in the TDA-approach, when applied to the EEG dataset. Most notably, good performances are also achieved when applied to the iEEG data, suggesting that TDA can be promising in fields where classical methods, like naive FC classification, can not be employed.
The paper is structured as follows. In Sec. 2 we recall the definition of directed Persistent Homology. In Sec. 3, we describe our pipeline and the common featurizations adopted in the classification tasks; we report our results in Sec. 4. In Sec. 5, we discuss our contributions. In the Appendix A, we review some mathematical concepts in more detail, including simplicial complexes, filtrations and (persistent) homology groups. Appendix B and Appendix C provide additional details concerning the data selection and processing. We finally report some additional results (Appendix D).
2. Persistent Homology of directed networks
In this section, following Reimann et al. (2017) and Lütgehetmann et al. (2020), we review the classical notions of simplicial complexes and persistent homology groups adapted to the directed frame-work, keeping always in mind the directed networks as the application example. We assume that the reader is familiar with the classical general concepts of simplicial complexes, filtrations and persistent homology groups (reviewed for convenience in Appendix A).
A graph G = (V, E), where V denotes the set of its vertices and E the set of its edges, is a directed graph if every edge e = (v0, v1) is given by an ordered pair of distinct vertices v0 (the source of the edge) and v1 (the target). For a given pair of vertices v0 and v1, both edges (v0, v1) and (v1, v0) are allowed, but loops, i.e., edges of type (v, v), are not. A higher dimensional generalization of directed graphs is given by the notion of directed simplicial complexes (see Def. A.1 for the classical definition of simplicial complexes):
An ordered simplicial complex Σ = {σα}α on a vertex set V is a non-empty family of finite ordered subsets σα ⊆ V with the property that, if σ belongs to Σ then every ordered subset τ of σ (ordered with the natural order induced by σ) belongs to Σ.
In analogy with the notation used for simplicial complexes, the elements of an ordered simplicial complex of cardinality d + 1 are called d-simplices. We say that an ordered simplicial complex Σ has dimension d if all its simplices have cardinality at most d + 1. For example, every directed graph (V, E) is a 1-dimensional ordered simplicial complex on V and, if the set of vertices V is 0{, …, n}, then the collection of all the ordered subsets of V is an n-dimensional ordered simplicial complex. When n = 1, the corresponding ordered simplicial complex is the directed graph with vertices 0 and 1 and ordered edges (0, 1) and (1, 0).
Recall that a k-clique in an unordered graph G is a complete subgraph on k vertices. Likewise, if G = (V, E) is a directed graph, then an ordered k-clique of G is a k-tuple (v1, …, vk) of vertices of G with the property that the pair (vi, vj) is an ordered edge of G for every i < j. For a given directed graph, we can construct an ordered simplicial complex by considering the collection of all the directed cliques (see the analogous construction of Ex. A.3):
The directed flag complex dFl(G) associated to a directed graph G is the ordered simplicial complex whose k-simplices are all the ordered (k + 1)-cliques of G.
Observe that the vertices and edges of a directed graph G are exactly the 1 − and 2-cliques of G, and that every directed graph is contained in its associated directed flag complex. For example, if G is the directed graph on V = {0, 1, 2}, with edges (0, 1), (1, 2) and (0, 2), then there is only one directed 3-clique, (0, 1, 2), and the associated directed flag complex dFl(G) consists of the same vertices and edges of G, together with the 2-simplex (0, 1, 2) corresponding to the unique directed clique.
A filtration of ordered simplicial complexes consists of a nested sequence of increasing ordered simplicial complexes (see also Def. A.4). For example, if Ø ⊆ G0⊆ … ⊆ Gn = G is a family of directed subgraphs of a directed graph G, then the sequence of directed flag complexes dFl(Gi) is a filtration of dFl(G).
In our applications, we first consider a weighted directed graph G, i.e., a directed graph for which every edge is given a numerical weight. For each weight wi we define the directed graph Gi as the (directed) subgraph of G with same vertices of G and only those edges corresponding to the weights w of G with w ≤ wi (note that at this level the weights correspond already to a distance, rather than similarity matrix, see also Ex. A.5). The weights wi of G, sorted in the increasing order, induce a sequence of directed subgraphs of G, hence a filtration of directed flag complexes dFl(Gi). Generally speaking, in order to extend it to a filtration of directed flag complexes one has to choose a weighting function also on the directed simplices (hence on the ordered cliques) extending the weights given on the edges. We use the intuitive maximum weighting function (the default in the software Flagser Lütgehetmann et al. (2020)), associating to an ordered simplex the maximal weight across all its edges. See Lütgehetmann et al. (2020) or the documentation at https://github.com/luetge/flagser for details on other possible weighting functions.
Once a filtration of (ordered) simplicial complexes is given, one can apply the classical homology methods for obtaining Persistent Homology groups, barcodes and Persistence Diagrams (see App. A.2 for a review). We will briefly refer to the Persistent Homology of the filtration of directed flag complexes associated to a weighted directed graph as its directed Persistent Homology (shortly denoted by DPH).
3. Methods
In the analysis of brain connectivity networks, we use Persistent Homology features as input for classification by an SVM classifier. The steps of the pipeline include (see also Figure 1):
Data
we use scalp and intracranial EEG (of epileptic patients) and fMRI (of schizophrenia patients and healthy controls) data in the form of multi-variate time series;
Graphs
for each dataset we construct (functional and effective) connectivity graphs by calculating Pearson correlation and Granger Causality respectively; see Sec. 3.2.
Topology
from the graphs we construct appropriate filtrations of simplicial complexes (see Sec. A.1 and Sec. 2). In particular, we have constructed the Vietoris-Rips filtration (see Ex. A.6) associated to the FC graph and the filtration of directed flag complexes associated to the directed EC graph (see Eq. (2.1)).
Homology
we compute the Persistent Homology of the given filtrations (PH for undirected and DPH for directed graph), getting their persistence Betti numbers (corresponding to the number of connected components for dimension 0 and to the number of loops for dimension 1, the two most common settings in the literature); see also App. A.2.
Machine Learning
from persistence Betti numbers, we get the features (vectors of real numbers) and apply a suitable classifier (SVM) to both the FC or EC graphs, and to the features (in particular, the Persistence Landscapes, Persistent Images, Carlsson Coordinates and Persistent Entropy (Sec. 3.5).
As a variation of this pipeline, we add an additional cleaning step in the case of EEG and iEEG datasets as described in Sec. 3.6.
3.1. Dataset
We use two prominent types of neuroimaging data in combination with two examples of pathological brain dynamics:
3.1.1. fMRI data (Schizophrenia)
We analyse the dataset that consists of fMRI recordings of 90 patients with a schizophrenia diagnosis and 90 healthy controls. Functional MRI data were collected with a 3T MR scanner (Siemens; Magnetom Trio) at Institute of Clinical and Experimental Medicine in Prague. T 2*-weighted images with BOLD contrast were acquired with voxel size 3×3×3 mm3 and TR/TE parameters of 2000/30ms. As anatomical reference the T1-weighted high-resolution image was used (TR/TE/TI = 2300/4.6/900 ms, voxel size = 1×1×1mm3). Initial data preprocessing was done using FSL routines (FMRIB Software Library v5.0, Analysis Group, FMRIB, Oxford, UK) and CONN toolbox (McGovern Institute for Brain Research, MIT, USA). Functional realignment and unwarping, slice-timing correction, structural segmentation into white matter and cerebrospinal fluid and structural normalization to the MNI space were done using CONN’s default preprocessing pipeline (defaultMNI), as well as functional normalization to the MNI space, outlier detection, and smoothing with 8mm kernel size. Six head-motion parameters with their first order temporal derivatives and five principal components of white-matter and cerebrospinal fluid time-series were regressed out to suppress measurement artifacts. Furthermore, to remove possible signal drift, time-series were linearly detrended and filtered by band-pass filter [0.009-0.08Hz]. See Kopal et al. (2020) and Oliver et al. (2019) for detailed prepocessing description. To extract the time series for further analysis, the brain’s spatial domain was divided into 90 non-overlapping regions of interest (ROIs) according to the AAL atlas; from each ROI we extract one BOLD time series by averaging the time series of all voxels in the ROI.
3.1.2. EEG data (Epilepsy)
We analyse both scalp and intracranial electroencephalography EEG data:
Scalp EEG data
We use EEG recordings from pediatric subjects with intractable seizures, collected at the Children’s Hospital Boston. The data are available at https://physionet.org/content/chbmit/1.0.0/, from the publicly open dataset Physionet. We analyse EEGs belonging to 18 subjects, with 23 scalp electrodes. In particular, we included patients with typical epileptic event of length between 20 and 120 seconds. The set of selected patients is provided in Appendix B. After isolating the epileptic event according to labels in the patient’s summary file, we also consider non-epileptic records, chosen randomly from the database (we extract segments of the same length as the seizure segments). In total, we use 102 ictal and 102 interictal data segments, the same amount of ictal and preictal parts for every particular subject. Sampling frequency is 256Hz, no special preprocessing is applied except for a band-pass filter [1-70Hz], notch filter at 50Hz, and removal of global signal for every data segment separately.
For each subject, we (separately) average the connectivity matrices for ictal and interictal periods, respectively. For the additional analysis using a modified pipeline (Section 3.6) we used a slightly modified data selection, which is described in Appendix B.
Intracranial EEG data
We use the anonymized dataset of 16 patients of the epilepsy surgery program of the Inselspital Bern, freely available at http://ieeg-swez.ethz.ch/, with no further preprocessing implemented. The intracranial electroencephalography (iEEG) recordings belong to 16 patients with pharmacoresistant epilepsies collected at the Sleep-Wake-Epilepsy-Center (SWEC) of the University Department of Neurology at the Inselspital Bern and the Integrated Systems Laboratory of the ETH Zurich. Each recording consists of 3 minutes of preictal segments, the ictal segment (ranging from 10 s to 1002 s), and 3 minutes of postictal time. In this work we only consider the recordings corresponding to the first 30 seconds of the first seizure as the ‘ictal’ data, and the 30 seconds segment starting 1 min before the first seizure start as our ‘preictal’ data.
For every patient we use the first minute of interictal data for estimation of baseline connectivity, which is later subtracted from both ictal and preictal connectivities.
3.2. Connectivity Networks
In order to estimate the brain networks we use both a symmetric (Functional Connectivity, Sec. 3.2.1) and an asymmetric approach (Effective Connectivity, Sec. 3.2.2):
3.2.1. Functional Connectivity
To obtain the functional connectivity network, we cross-correlate the time-series described in Sec. 3.1.2 and Sec. 3.1.1. Let N be the number of brain regions in the AAL atlas or number of electrodes in scalp EEG, then the N × N matrix A = (Ai,j) is defined using Pearson correlation between the time series corresponding to nodes i and j. In line with previous literature (Lee et al., 2011, Section 3), (Merelli et al., 2016, Section 3), (Stolz et al., 2018, Section 2.2), all entries of A that are not significant at α = 0.05 are set to 0, along with all negative values of functional connectivity. Below, we refer to this method as masked Functional Connectivity (masked FC).
3.2.2. Effective connectivity
The effective connectivity is quantified by the pairwise Granger causality (Barnett and Seth, 2013); furthermore, edges corresponding to links that do not show statistically significant causal effect (by means of the FACDA algorithm (Hlinka and Kořenek, 2018)) have their weight set to zero. In the following we refer to this method as masked Effective Connectivity, or more concisely, as masked EC. We use pairwise Granger causality instead of multivariate Granger Causality (which controls for all the other variables) to reach more robust estimates. Indeed, in the multivariate approach a greater number of parameters has to be estimated, potentially leading to less accurate estimates (Hlinka et al., 2013).
3.3. Persistent Homology computation
We analyse persistent homology groups of both the directed and undirected networks. We refer to Appendix A for a review of PH. When not otherwise specified, the persistent homology computations shown in the article refer to calculations implemented in dimension 1 (we refer to Appendix D for the corresponding results in dimension 0); homology groups are computed over the commonly used base field 𝔽2 (see Def. A.7).
3.3.1. PH of undirected networks
The undirected network constructed in Sec. 3.2 is a weighted graph. As explained in Ex. A.5, by sorting the weights in an increasing or decreasing order, we get a filtration of simplicial complexes, the WRCF. Equivalently, one can construct the associated Vietoris-Rips filtrations (see Ex. A.6). As the implementation of the WRCF can be computationally demanding, in this work we follow the second approach: we construct the quasi-metric space (X, d) associated to the matrix A (the matrix A being constructed as described in the previous subsection) with underlying space X given by the vertices of the weighted graph and quasidistance d(i, j) := 1 − A(i, j). We then construct the Vietoris-Rips filtration associated to this metric space and compute the associated PH and persistence diagrams. The computation of the associated Persistent Homology has then been implemented with the python library Ripser (Tralie et al., 2018).
3.3.2. PH of directed networks
The directed network presented in Sec. 3.2.2 provides a weighted directed graph, hence a directed flag complex as explained in Sec. 2. For consistency with the undirected case, we first rescale the matrix A via a linear map A ↦ Ã to span the interval [0, 1] and then apply the transformation x ↦ 1 − x to use the matrix 1 − Ã, getting a positive matrix where the lowest values represent the strongest connections. We consider the sequence of directed graphs sorted with the increasing order of the weights (of 1 − Ã). We extend the weights to the simplices of the induced flag complexes by giving to a simplex σ the maximum value across the weights of its edges. We then compute the associated persistent homology groups.
3.4. Why DPH? A motivational example with VAR processes
In this subsection we elucidate why an extension of the persistent homology techniques to direct networks is theoretically more fruitful than the classical PH approach on undirected networks. We assume that the reader is familiar with the concept of VAR(p) (vector autoregressive) processes (see Lutkepohl (2007) for a general overview).
Let X be a VAR(1) process of the form where Xt is an n-dimensional random vector, A is a n× n matrix of (real) coefficients and et is a white noise with 0 mean and covariance Σ; we say that the process X is generated by the matrix A.
Without loss of generality, we assume that X is stable (meaning that the matrix A has eigenvalues smaller in module than 1), has 0 mean and the variance Σ is the identity matrix Σ = 1. Then, the matrix sequence {Ai}i∈ℕ is absolutely summable and the covariance matrix of X (that is a stable process, hence stationary) can be written as where At denotes the transpose of the matrix A.
The correlation matrix of the process X depends only on its covariance matrix Var(X) = Cov(X, X), whereas the associated effective connectivity depends upon the generating matrix A (for simplicity, we use the matrix A directly here as the EC).
In order to document the extra richness of the directed version of PH, we consider two different VAR(1) processes X and Y generated by two matrices A and B with different topological structure but the same covariance matrix Cov(X, X) = Cov(Y, Y). These two processes share the same correlation matrix (used in the definition of FC), hence the associated (standard/undirected) PH will not distinguish them. However, as the associated EC matrices differ, the directed PH might be able to distinguish the aforementioned processes. We now document this using a simple low dimensional example.
The problem reduces to finding topologically different matrices A, B, such that AAt = BBt (this can be achieved by choosing an suitable orthogonal matrix Q, and by setting B := AQ. As a specific example, we set:
Let φ = π/3 and B = AQφ. The eigenvalues of A and B are smaller in module than 1 and the generated VAR(1) processes are stable, moreover, clearly AAt = BBt, hence, they have the same covariance matrix by Eq. (3.1), and as a consequence, the correlation matrices of X and Y have same persistent homology groups (see Fig. 2). On the other hand, when applied to the generating matrices A and B, DPH is able to distinguish the two processes (see Fig. 3), proving the advantage of DPH in distinguishing a larger class of processes.
We point out here that we cannot apply PH directly on B, as B is an asymmetric matrix. However, PH could indeed be in principle applied to the matrix A that is symmetric; however, it provides a different homology structure than the application of DPH. In particular, the persistent homology groups associated to A are concentrated in dimension 0, whereas DPH(A) has non-trivial homology classes in dimension 1 and 2 as well as in dimension 0 (see Fig. 4).
3.5. Featurizations of Persistent Homology
Persistent homology groups can be visualized by using Persistence Diagrams as birth-death planar representations (Sec. A.2). In order to apply a SVM classifier, we derive four most commonly used vectorizations, or featurizations, from the persistence diagrams: Persistence Landscapes, Persistence Images, Carlsson Coordinates and Persistent Entropy. In this way we get input real-valued vectors amenable to the classifier. We proceed by illustrating the definitions of these topological features.
3.5.1. Persistence Landscapes (PL)
Persistence landscapes have been introduced in order to perform PH-based statistical analysis by Bubenik and Dłotko (2017). For a birth-death interval (b, d), where d is a finite value, the function represents a triangle-shaped function over (b; d). Let {(bi, di)}i∈I be a (finite) collection of birth-death intervals. Let k-max({ai i}) denote the k-th maximum value of a finite family of real numbers {ai} i.
The persistence landscape of the set of birth-death coordinates {(bi, di)}i∈I is the family of functions λ = (λk : ℝ → ℝ)k∈N, where .
Observe that the functions λk(t) are piece-wise linear and that they are identically 0 when k is bigger than the cardinality |I| of the birth-death pairs. See Fig. 5 for an example of persistence landscape associated to a persistence diagram.
We follow the same method as Yesilli et al. (2019) to perform ML analysis on the set of persistence landscapes. Let 𝒟 = (Dp)p∈P be a set of persistence diagrams (e.g, each diagram Dp is the PD associated to a single brain network, at a fixed dimension). For every such persistence diagram Dp, let be the associated PL. Let K ⊆ ℕ be a fixed subset of indices of ℕ; for every k ∈ K consider the collection of functions varying with the index p (and fixed k).
By definition, for every k and p, the map is a piecewise linear function. Then, it has a set of critical points, i.e., those points at which the function changes its direction. Let z(k) be the vector of ℝn whose entries consist of all the critical points associated to the functions at k fixed; the entries of z(k) are sorted in increasing order, without repetitions.
For every chosen k∈ K, we have constructed a vector z(k) depending only on the critical points of the k-th functions of the PLs. The feature vector, input of the classifier, associated to the diagram Dp in degree k, is then the evaluation vector . See the work of Chazal and Michel (2017) for further details. In our work, we choose the set K ⊆ N to be K = {1} to capture the landscape containing information about the most persistent holes; hence, the analysis reported in Sec. 4 uses the first landscapes .
3.5.2. Persistence Images (PI)
An efficient and stable vector representation of persistent homology is given by the so called Persistence Images introduced by Adams et al. (2017). We start with a persistence diagram D = {(bi, di)} i∈I, a subset of ℝ2 as depicted in the example Fig. A.8 and apply the transformation T : ℝ2 → ℝ2 sending a birth-death pair (bi, di) to the birth-persistence pair (bi, pi) := (bi, di −bi). Let Dk : ℝ2 → ℝ be the normalized gaussian centered at (bk, pk) and with standard deviation σ, i.e.,
As set by default in line with Adams et al. (2017), we choose a specific non-negative weighting function W : ℝ2→ ℝ that is zero along the horizontal axis, continuous, and piece-wise differentiable, in particular a function depending only on the second coordinate: if p is the maximal possible persistence, W (x, y) is set to 0 if y ≤ 0, 1 if y ≥ p and y/p between 0 and 1.
We associate to the persistent diagram D a function ρ : ℝ2 → ℝ defined as weighted combination of gaussians:s
Let bmax and pmax be the maximum birth and maximum persistence across the births and persistences appearing in a diagram, i.e., bmax := maxi(bi) a pmax := maxi(di − bi). After choosing a grid of the domain [0, bmax] × [0, pmax], we compute the integral ∫ ρ(x, y)dxdy at each pixel-box of the grid-domain. This gives a discretization of the surface image of ρ(ℝ2), hence a matrix of values which is our persistence image. See Fig. 6 for an example of the output.
For a family of persistence diagrams, one can define a mean PI of the persistence images associated to the PDs in the family. As each single PI is a matrix of real values, the mean PI of the family is defined as the matrix given by the element-wise average of the matrices. However, this brings a technical issue, as we describe:
Remark 3.2. To construct the PIs, we have used the MATLAB code that accompanies the paper by Adams et al. (2017). Given a Persistence Diagram, one has to set parameters such as the resolution, the probability density and the weighting function. However, as already observed in (Stolz et al., 2018, Section 3.3), in order to compute the mean PI for a family of persistence diagrams, there is an additional pair of values to be chosen: the pair of maximum birth and maximum persistence. We set these values to the maximum across all subjects to achieve unbiased classification results. Note that an alternative group-specific normalization was reported to provide better classification results (Stolz et al., 2018), but in fact provides artificially inflated accuracy due to introducing group-specific bias; a topic that would be covered in more detail in a separate methodological report.
3.5.3. Carlsson coordinates (CC)
Another useful featurization method is given by the so-called Carlsson Coordinates (Adcock et al., 2013, Sec. 4.1.2 & 4.2.2). The method is based on an approach from algebraic geometry. For every persistence diagram D = {(bi, di)} i∈I in the form of birth-death coordinates, we construct five vector features as follows. Let pi = di − bi be the life associated to (bi, di) and let pM be the maximal death time dM := maxi(di). Following Adcock et al. (2013), we consider the following coordinates:
Morover, following (Yesilli et al., 2019, Sec. 4.4), we add a fifth coordinate f5(D) := maxi(pi) taking into account the maximal reached persistence. As suggested in (Adcock et al., 2013, Sec. 4.2.2), we normalize the coordinates f1, …, f4 by a scale factor 1/N, where N is the number of the birth-death pairs in the persistence diagram.
3.5.4. Persistent Entropy (PE)
Our last feature is Persistent Entropy, a topological adaptation of Shannon entropy in the context of persistent homology (Rucco et al., 2014), i.e., a number that measures the information encoded in the persistent homology groups (at a given dimension).
Let ℱ be a filtration of a simplicial complex Σ (see Def. A.4) and D = {(bi, di)} i∈I the set of birth-death coordinates associated to PH in dimension k. For every I ∈ I let pi := di − bi be the i-th persistence.
The persistent entropy E(B) associated to D is the number E(D) := — ∑i∈I si log(si) where si = pi/SD and SD = ∑i∈I pi.
3.6. Controlling for inter-individual variability in electrode placement
In real-world scenarios, a substantial variability in the estimated functional or effective connectivity networks between subjects can be due to confounding purely anatomical differences, or spatial mis-alignment of the observed locations. In an experimental situation with repeated measurements from the same subject, the design may potentially allow suppressing this inter-individual variability by data preprocessing.
In particular in iEEG data, a problem of data inconsistency across subjects arises, as every patient has his/her own unique position of electrodes (the implantation) that depends on the particular clinical requirement. Such implantation defines the basic structure of the connections between the recorded brain regions and, as a consequence, iEEG connectivity matrices can not even be directly compared between subjects because of the different data structure and the different sizes.
While this problem is much alleviated in scalp EEG with standardized electrode number and placement, inter-individual differences in brain and head anatomy may still cause substantial interindividual variability in EEG connectivity that may obscure intra-individual variability reflecting the brain state dynamics of interest.
Working with the basic topological features rather than with the matrix directly should help to sidestep this problem. However this may not perfectly resolve the issue, as the geometry of the patient’s data may still be systematically biased by the individual structure of electrode implantation and anatomy. Therefore, before performing the classification task, we add two steps of data cleaning to the general pipeline.
The first step - used to decrease the influence of implantation on the topology - is directly applied to the connectivity matrices computed by the procedure described in Sec. 3.2. The EEG and iEEG data contain epileptic (ictal) and non-epileptic (preictal and interictal) segments as described in Sec. 3.1.2. In both cases, apart from the ictal and immediately preictal segments, we estimate the connectivity within the clearly interictal segments, which in the following we refer to as the baseline connectivity. This baseline connectivity is subtracted from both the preictal and ictal connectivities to provide correction for the confounding inter-individual variability.
The second step in the data cleaning aimed to further control for inter-subject variability is performed after the feature computation and before applying the classification tasks. For each topological feature and patient, we consider the vector consisting of all the realizations of the topological feature across conditions (i.e. all the classified segments for that subject). We compute the mean and remove it from the whole vector of feature values: this gives a final vector with null mean for every patient and feature that has inter-individual biases accounted for, and can be used in the classification tasks.
3.7. Classification
To perform the classification (based on a set of features for each subject), we apply a linear support vector machine (SVM) that constructs a hyperplane in high-dimensional space on a training set of points. This hyperplane is then used for the classification of testing points. We train a SVM classifier based on the topological features of Sec. 3.5. We have chosen the SVM as it is the state-of-art method when deep learning can not be applied due to insufficient sample size. Validation of the classifier is made with the leave-two-out procedure on paired samples. For every subject, we train the classifier on the set consisting of all the points excluding the tested sample and its matching counterpart from the same subject from the other dataset. A similar approach has been used, for instance, in the studies by Castro et al. (2011), Mikolas et al. (2018) and Rubin-Falcone et al. (2018).
3.8. Software
In the construction of effective connectivity networks we have used the software companion of Barnett and Seth (2013) for the computation of Granger Causality and of Hlinka and Kořenek (2018) for the construction of the mask (see Sec. 3.2.2). For the calculations of persistent homology groups of undirected networks we have used the python library Ripser Tralie et al. (2018). For the computation of persistent homology of the directed graphs, we have used the library Flagser by Lütgehetmann et al. (2020) (via the python API reference pyflagser of the Giotto-tda Tauzin et al. (2020)), with the default settings. For computing the PIs, we have used the software that accompanies the paper of Adams et al. (2017); we set the variance of the Gaussians to 0.0001 and keep all the other parameters as by default. The pair maximum birth/maximum persistence is computed across all the instances (see Remark 3.2). For the calculation of PLs, we use our own MATLAB implementation of (Bubenik and Dłotko, 2017, Algorithm 1). For the implementation of support vector machine classifier, we use the MATLAB function fitcsvm. When otherwise specified, we use the default settings of the declared functions.
4. Results
The aim of the paper is to assess the applicability of (directed) persistent homology methods to brain functional/effective connectivity analysis. We have focused on two prominent neurological applications: epileptic seizure detection (using EEG data, both scalp and intracranial) and classification of schizophrenia/healthy brain states (using fMRI data). We follow the pipeline described in Sec. 3: for each subject, we first calculate the functional connectivity graph (Sec. 3.2.1) and the effective connectivity directed graph (Sec. 3.2.2), hence the associated Vietoris-Rips filtration (Ex. A.6) and the filtration of directed Flag complex (Def. 2.2), respectively; we then compute the persistent homology groups of the obtained filtrations as described in Sec. 3.3. We encode the PH information in the form of Persistence Diagrams, that we use to compute the associated topological features PL, PI, CC and PE. Here we potentially substract individual feature mean to control for electrode placement bias as described in Sec. 3.6 (this is not applicable to the fMRI data where the classification is only between subjects). We finally apply the SVM classifier (see Sec. 3.7). We compare the obtained classification results with those achieved by using the same SVM classification method applied to Functional Connectivity (FC) (in the undirected case) and to (masked) Effective Connectivity (EC) (in the directed approach). The results are shown in the subsections below. Note that for the results obtained on the iEEG data, a comparison with the naive FC/EC approach is not possible, because of different size and position of patient’s implantations.
We proceed by illustrating our results in the group analysis of the fMRI (schizophrenia) data first, of the group analysis of scalp EEG (epilepsy) data afterwards (following the classical pipeline first, and the alternative pipeline later), and we conclude with the analysis of the iEEG (epilepsy) dataset.
4.1. First dataset: fMRI - Schizophrenia
In this first analysis, we present the results of a classification using the fMRI time series of 90 healthy subjects and of 90 schizophrenia patients (see Sec. 3.1.1). We report the results in Table 1. We observe that the topologically-based classifications are close to random (with accuracy around 0.5) in both the directed and undirected approach. Hence, the schizophrenia patients are not topologically discriminated from the healthy subjects, although the FC-classification is relatively successful (with accuracy 0.79). In this case, the classification task based on the EC matrix itself did not achieve good results (accuracy 0.57, close to the performances of the directed topological features).
4.2. Second Dataset: scalp EEG - Epilepsy
Following the same pipeline as in the previous example, we present now a classification analysis concerning a different disease (epilepsy) and a different type of data (EEG). We use for this analysis the EEG recordings described in Sec. 3.1.
Our analysis of epileptic brain states is divided in the two cases reported below: we first assess a classification by the basic approach as applied to the fMRI data, as described in Sec. 4.2.1. Subsequently we report (in Sec. 4.2.2) the results of the analysis carried on the same dataset but involving the additional cleaning steps described in Sec. 3.6.
4.2.1. EEG analysis - default approach
In this analysis, we compare the functional and effective connectivity networks associated to the ictal and to the inter-ictal periods. We consider 18 subjects; each subject has both ictal and interictal data segments. For details, see Sec. 3.1.2. We apply the SVM classifier to the masked EC directed graphs, to the masked FC undirected graphs and to the topological features PL, PI, CC and PE obtained by the PH computation on both the obtained networks. We report the achieved classification values in Table 2.
The results show that the classifications based on the topological features perform again close to random, even though both the raw EC-based and raw FC-based classification are quite effective (with accuracy of 0.83 and 0.78). Therefore, when applied to both the functional and the effective connectivity networks and when compared with the results achieved by the raw matrices, we do not see any advantage in using the topological features.
However, we conjectured that the low performance of the persistent homology approach may be due to its high sensitivity to baseline inter-subject variability in the connectivity structure. Indeed, when applied to the discrimination of ictal and interictal states within a single subject, TDA provides good results, comparable in some cases to those achieved by EC or FC itself, see Appendix C. Therefore, further supported by this successful within-subject application, in the following group analysis we first attempt to correct for the inter-subject variability, before applying the classifier.
4.2.2. EEG analysis: alternative approach
In order to reduce the inter-subject variability of the EEG dataset, in the following group analysis we apply further cleaning steps to the main pipeline, following the methodological details described in Sec. 3.6. In this analysis we use the same dataset as in Sec. 4.2.1. The obtained results, in both dimension 0 and 1, are shown in Table 3.
We now see, in Table 3, that both DPH and PH achieve on average good performances, with around 70 percent accuracy on average. The directed and undirected method are comparable to FC, whereas EC performs close to random.
The results shown in Table 2 and in Table 3 suggest that the cleaning steps help at increasing the performance of the classification tasks from persistent homology features; on the other hand, the better performances could also be due to some specific properties of the considered data. In a last analysis, we have applied the alternative pipeline to a different dataset, namely to intracranial EEG data in the problem of epileptic seizure detection.
4.3. Third Dataset: iEEG - Epilepsy
In our last analysis, we use the iEEG dataset described in Sec. 3.1.2. As in the previous section, in order to reduce the inter-subject variability, we apply the cleaning steps of Sec. 3.6 before computing the homology features and the classification. The obtained results, in both dimension 0 and 1, are shown in Table 4.
The results show that good classification performances in the detection of the brain alterations have been achieved, e.g., with values of 0.94 for PE (in dimension 0) and 0.87 obtained by PE in dimension 1. These are comparable to the results obtained in the alternative pipeline for scalp EEG data analysis of Sec. 4.2.2 and the single subject analysis presented in Sec. C and consolidate the idea that the cleaning step procedures may help in the analysis of brain state detection via TDA. We point out here that, in this case, a direct comparison with the performances of the connectivity matrices across subjects via FC/EC is not applicable due to interindividual differences in the electrodes number and placement.
For the sake of completeness, we report in Tab. E.10 in the appendix also other detailed results of this last analysis, such as employing the full (not-masked) matrices as well and further 0-dimensional features.
5. Discussion
Topological Data Analysis methods, and specifically Persistent Homology, have provided new geometric tools and insights in data analysis investigations, already reported to be successful in several fields of science, computer science and engineering. In this work we assess the applicability, in the specific field of neuroimaging, of a possible extension of Persistent Homology to directed networks. To this end we use the example of two prominent neurological diseases (epilepsy and schizophrenia) with three different types of data (fMRI and scalp/intracranial EEG data). We compute common persistence featurizations (Persistence Images, Persistence Landscapes, Persistence Images, Carlsson Coordinates and Persistent Entropy) of (un)directed networks associated to these datasets and evaluate their discriminative power in distinguishing healthy and diseased brain states by employing an SVM classifier. To coherently evaluate the significance of the obtained classifying performances, we compare the results with those achieved by the same classification method applied directly to the effective (EC) and functional (FC) connectivity graphs. The methodological novelty of the paper consists in employing a directed Persistent Homology approach in the study of brain effective connectivity. While one of the advantages of Persistent Homology over other graph methods is its non-dependence on choices of opportune thresholds and this allows one to capture the information brought by all the thresholds at once, the extension of PH to directed graphs can further benefit from the information contained in the directions of the edges; a theoretical motivation of this idea is shown in Sec. 3.4. Furthermore, our choice is driven by the idea that effective connectivity is closer to a factual, “true”, brain connectivity and, as a consequence, to a refined understanding of the functional brain dynamics Smith (2012); Friston (2011).
In order to study effective connectivity networks from a topological point of view, there are two approaches: one can either construct an undirected graph from the directed effective connectivity network (hence losing some information) and then apply the usual PH pipeline or one can use a suitable directed version of PH. To the best of our knowledge, this is the first attempt at employing the latter, directed PH approach to analyse effective connectivity. However, this approach has been foreseen in a recent perspective paper by Lee (2019), suggesting that causal connectivity methods are to be examined and applied for classification of diseases.
Due to richness of the available methodologies, we necessarily had to limit our investigation by making some specific choices. For instance, many more featurizations of persistent homology are available in the literature, and we thus decided to focus only on some of the more common. Among other examples, kernel methods are also prominent, and we report some kernel-based classifications in Appendix D.
Moreover, there is the choice of potentially discarding the weak (potentially noise-related) links from consideration. Among prior studies analysing functional connectivity networks through persistent homology methods, some set statistically insignificant correlations (with p-values greater than 0.05) to 0 (see Merelli et al. (2016); Stolz et al. (2018)) and others do not (see Lee et al. (2011, 2012); Rathore et al. (2019)). The results reported in this work refer to connectivity networks with links surviving significance threshold of α = 0.05, but either approach can be equivalently undertaken; for sake of completeness, we report in Table D.7 the results obtained by using the full EC or FC matrix (from which we compute the distance matrix by the transformation 1 – | · |) and a comparison of the performances, showing that the performance does not substantially depend on the choice of these masks. For consistency with the undirected approach, in our study of the effective connectivity networks we introduce a similar masking approach: we first build the (pair-wise) Granger Causality graph associated to the brain data and then we set to 0 the connections in the network marked as dependent by applying the FACDA algorithm (Hlinka and Kořenek, 2018). Table D.7 shows that also in the case of the effective network, masking does not change the results substantially.
From our computations, we observe that the PH- and the DPH-based classification performances are generally comparable (see Table D.7 for a full comparison); however, we see that the naive classifications based on functional or effective connectivity generally outperform the topologically-based ones. The better performances of FC/EC on the top of TDA might be due to various reasons, among others to a less sensitivity of Persistent Homology to small localized variations of the absolute values of the input adjacency matrices: small variations of the metric properties of relatively small subgraphs, like the distances associated to a subset of nodes and represented by the entries of the adjacency matrix, are loosely distinguished by the associated topological features, but the same variations are better captured by classification methods based on the full raw matrices. This is amplified in the heterogeneous group-studies; in Sec. C, we report an analysis of the EEG epileptic data restricted to the single subjects showing better performances than the group-studies.
Driven by the idea that TDA methods would be more effective when tested on more homogeneous data, in the second part of our analysis on the EEG datasets, we have introduced further steps in the data processing and cleaning (see Sec. 3.6). A theoretically preferable procedure would require to use the covariance matrices instead of the correlations, because satisfying linearity properties and carrying more biological information, but this would involve numerical complications. We observe that, when following this variation of the pipeline, both our computations on the EEG and iEEG data achieve good performances, suggesting that the cleaning procedures may in fact be of importance in the analysis of these data.
Our second observation concerns the field of applicability: for some datasets (as for iEEG data), a classification analysis based on the raw FC/EC matrices can not be undertaken. This is because the analysis would depend on the intracranial electrodes’ positions used for the iEEG recording, and, consequently, on the different subjects. Investigations like those shown on the iEEG data might be the natural field of application of TDA within neuroimaging.
To summarise, we have shown that the (directed) persistent homology approach can be applied to brain functional/effective connectivity, obtaining in some cases good performances. However we suggest that a due caution in the use of TDA and of these featurizations should be taken, in line with very recent critical results on TDA based on functional connectivity in the literature Rathore et al. (2019); Ellis et al. (2019).
6. Conclusion
In this work we have shown that the directed extension of persistent homology can be successfully applied to classification of disease-related alterations of brain connectivity from neuroimaging data. However, cautious direct comparison to the performance of standard (undirected) persistent homology features as well as to the use of the raw effective (or functional) connectivity matrices for the same classification task suggests that there are important challenges to be overcome to fully utilize the theoretical promise of the topological data analysis.
We conclude that tackling the identified real-world neuroimaging challenges such as topographical versus topological specificity of the brain alterations, inter-individual biases and homogeneous versus heterogeneous sampling situations, would be necessary in order to firmly establish TDA also in the particular field of neuroimaging as a pragmatic tool.
A. Background: Topological Data Analysis
Topological Data Analysis is a recent and active research field of applied algebraic topology, whose main goal is to study the structure of datasets by means of topological frameworks. Among the tools of TDA maybe the most common is nowadays Persistent Homology, whose definition is based on the mathematical ideas of simplicial complexes, filtrations and homology. In this section we present a brief overview on this subject, we describe these concepts in the sections Sec. A.1 and Sec. A.2 below (referring to Zomorodian and Carlsson (2005); Otter et al. (2017) for a mathematical introduction and to Petri et al. (2013); Horak et al. (2009) for further studies of Persistent Homology with applications to complex networks).
A.1. Simplicial complexes
Simplicial complexes are combinatorial mathematical objects that generalize (in higher dimensions) the classical definition of graphs. Roughly speaking, a simplicial complex is a space built by assembling together points, edges, triangles, tetra-hedra and, more generally, high dimensional polytopes (in topology called simplices). We refer to Munkres (1984) for more details.
Let V be a set. An (abstract) simplicial complex on V is a collection Σ = {σα} of finite non-empty subsets σα of V such that:
for each v ∈ V, {v} belongs to Σ;
if σ ∈ Σ and τ ⊆ σ, then τ ∈ Σ.
The elements σα of a simplicial complex Σ are called simplices of Σ and if σβ is a subset of a simplex σα, then we say that σβ is a face of σα. The dimension of a simplex σ is one less its cardinality, i.e., if σ = {v0, …, vk} then dim σ = k, and we say that σ is a k-simplex. The 0-simplices {v}∈ Σ are called the vertices of the simplicial complex Σ. We will write V (Σ) for the set of vertices of Σ; with abuse of notation, we will make no distinction between the element v ∈ V and the vertex {v}∈ V (Σ).
In this work we consider only finite simplicial complexes, so that the family Σ = {σα} is finite. The dimension of the simplicial complex Σ is the maximal dimension across all its simplices.
Observation
Definition A.1 is rather abstract. However, one can always geometrically realize an (abstract) simplicial complex as a geometric sub-space of Rn. A k-simplex σ = {p0, …, pk} can be geometrically depicted as the convex hull of k+1 geometrically independent points p0, …, pk in Rk+1. For example, a 0-simplex is a point, a 1-simplex is depicted as a segment (a convex hull of its two end-points, which are again 0-simplices). Likewise, a 2-simplex is geometrically understood as a triangle, and so on. Roughly, for geometrically depicting a simplicial complex, one realizes each of its simplex as a subspace of Rn and then glues them to each other along common faces. See Fig. A.7 for an example of geometric realizations of different simplicial complexes (there denoted by Σ0, …, Σ5).
Example A.2. Recall that a graph G is a pair G = (V, E) given by a set V of vertices and a set E⊆ V × V of edges, where an edge is identified by the couple (v, w) of its distinct endpoints (so we do not allow multiple edges or loops). Hence, a graph is a simplicial complex whose vertex set is V and the 1-simplices are the edges of G.
If Σ and Σ′ are simplicial complexes, a simplicial map Σ → Σ′ between them is a function f : V (Σ) → V (Σ/) sending vertices of Σ to vertices of Σ ′, with the property that, if the vertices {v0, …, vn} span a simplex σ = v0, …, vn of Σ, then also the vertices f (v0), …, f (vn) span a simplex of Σ/; observe that the image of σ can be a simplex of Σ′ of smaller dimension. Examples of simplicial maps arise from inclusion. For example, if Σ is contained in Σ′ (as subfamily) then the inclusion Σ⊆ Σ′ defines a map of simplicial complexes (sending the simplices of Σ to the same simplices seen in Σ/. In this case we say that Σ is a subcomplex of Σ/. Having maps between simplicial complexes preserving the structures is an important ingredient in developing the theory of filtrations and persistent homology (see the sections below).
Example A.3 (Clique complex). A graph G is a 1-dimensional simplicial complex (as its simplices are at most of dimension 1); recall that an n-clique is a complete subgraph of G on n vertices. The clique complex Ĝ associated to G is the simplicial vertex {v} ∈ V (Σ). complex constructed as follows:
its 0- and 1-simplices are the vertices and edges of G;
for every (n + 1)-clique with vertices {v0, …, vn} in G we add the n-simplex σα = {v0, …, vn} on the same vertices.
For example, if G is a complete graph on 3 vertices, then its clique complex is nothing but a simplicial complex consisting of a 2-simplex together with all its faces. Observe that the dimension of the clique complex depends on the dimension of the maximal cliques of G. The embedding of G into Ĝ sending the vertices and edges of G to the respective 0 and 1-simplices of Ĝ is a map of simplicial complexes.
We are ready to recall the definition of filtration of a simplicial complex Σ:
A filtration ℱ (of a simplicial complex Σ) is an indexed sequence of subcomplexes of Σ such that for every i ≤ j, the simplicial complex Σi is contained in Σj.
We will only consider finite filtrations of simplicial complexes (so that the chain of simplicial complexes in the definition is finite). The set of indices in the definition of a filtration is usually given by positive natural numbers, but it can also be a more general set, like the positive real numbers (see, e.g., Example A.6). A figurative illustration of a filtration (with clique complexes) is given in Fig. A.7.
We conclude the subsection with two prominent examples of filtrations: the weight rank clique filtration (WRCF) Petri et al. (2013) associated to a weighted graph and the Vietoris–Rips filtration associated to any finite subset of a metric space.
Example A.5 (WRCF). Let G be a weighted graph and let (wi)0≤i≤N be the increasing sequence of all the weights of G (i.e., w0 is the minimal weight, wN the maximal and wi < wi+1). For the i-th weight wi construct the simplicial complex Σi as follows:
consider the subgraph Gi⊆ G with the same vertices of G and only those edges of G with corresponding weight w such that w ≤ wi;
Σi := Ĝi is the clique complex of Gi (see Ex. A.3).
It is easy to see that Σi is contained in Σi+1 and that the sequence ∅ ⊆ Σ0 ⊆ · · ·⊆ ΣN = Σ is a filtration. Analogously, by sorting the weights in the decreasing order, one obtains another filtration, where the graph Gi is defined by keeping only the edges of G with weight w ≥ wi.
Example A.6 (Vietoris–Rips filtration). Let (X, d) be a metric space and S be a finite subset of X, endowed with the induced metric. In applications, the set S can be a set of data-points in Rn or a set of measurements with a notion of distances between them. The Vietoris–Rips complex of S at scale t ≥ 0 is the simplicial complex: where the diam(σ) := sup d(x, y) x, y σ is the diameter of σ (we point out here that different definitions, considering a maximal distance of 2t instead of t can be found in the literature; we follow the definition used in Tralie et al. (2018) to be consistent with Sec. 3.8). Observe that every such σ represents a simplex of dimension |σ| − 1. If t ≤ t′ then VRt(S) ⊆ VR(S)t1 and the resulting filtration VR(S) of the simplicial complexes VRt(S), with t varying in ℝ≥0, is called the Vietoris–Rips filtration of S.
Observe that these two constructions are equivalent, if one considers metric spaces (X, d) associated to weighted graphs G (with respect to an increasing ordering of the weights), with underlying space X given by the nodes of the graph G and distance d(x, y) between x and y defined as the weight corresponding to the edge (x, y).
A.2. (Persistent) Homology and Persistence diagrams
Persistent Homology (PH) Edelsbrunner et al. (2002); Zomorodian and Carlsson (2005) is one of the main tools in TDA and one of the first applications of Algebraic Topology to Data Analysis, introduced in Carlsson (2009). Persistent Homology is a topological invariant whose definition is based on the mathematical concept of homology groups. It is robust to noise and stable with respect to small perturbations of the inputs (Cohen-Steiner et al., 2007, Section 3.1); moreover, calculations can be easily achieved by using many software options (in MATLAB, Python, C++, R, Java) Otter et al. (2017). Its constituents, the homology groups, are classical algebraic invariants of topological spaces. The number of connected components, loops (1-dimensional holes), voids (2-dimensional holes), tunnels and higher dimensional holes are examples of such invariants. A technical comprehensive definition of homology groups is outside of the scope of the present work. In the following, we will only sketch some ideas, but the interested reader can find more details in Hatcher (2000); Munkres (1984). We conclude the section with the definition of persistent homology groups and persistence diagrams.
Let k be a natural number. The k-th homology group Hk(Σ) associated to a simplicial complex Σ (see Def. A.1) can be thought of as the set (group, module or vector space) of k-dimensional holes in Σ; the elements of H∗(Σ) are called homology classes. To be more precise, let F2 be the field with two elements (i.e., 0 and 1, with sum and product inherited from the usual sum and product of real numbers, reduced mod 2) and let Cp(Σ) be the free F2-vector space whose basis consists of the set of p-simplices of Σ: elements of Cp(Σ) are formal combinations a0σ0+ …+anσn of p-simplices σ0, …, σn of Σ with coefficients ai in F2. For every p≥ 1 we define the map by sending a p-simplex σ to a formal sum of all its (p − 1)-faces. The map d0 is defined as the zero map, As the composition dp ○ dp+1 = 0 is the zero map, the image of dp+1 in Cp(Σ) is contained in the kernel of dp; these are both F2-vector spaces and the quotient ker(dk)/Im(dk+1) of vector spaces is well-defined. We can now formally define the homology groups of the simplicial complex Σ (with F2-coefficients):
Let Σ be a simplicial complex and k ≥ 0 a natural number. The k-th homology group Hk(Σ) of Σ is defined as the quotient of the kernel of the map dk with the image of dk+1.
The group Hk(Σ) is in fact a F2-vector space and its dimension (over F2) called the k-th Betti number βk(Σ) of Σ. More general definitions are possible, with arbitrary fields F or with the integers Z as coefficients. We refer to Hatcher (2000); Munkres (1984) for more detailed descriptions of homology groups.
The number of k-holes of (the geometric realization of) a simplicial complex Σ corresponds to the k-th Betti number βk(Σ) associated this way to Σ. In fact, Betti numbers give (a measure of) the number of “independent” holes or loops in Σ, i.e., holes that can not be shrunk continuously to a point. The 0-th Betti number β0 represents the number of connected components, the 1-st Betti number β1 is the number of independent loops and the 2-nd Betti number β2 gives the number of 2-dimensional voids.
Example A.8. If X is a point, then X has a single connected component and no loops or higher dimensional holes. Its Betti numbers βn(X) are 0 for every n > 0, except for β0(X) which is 1. The 0-th Betti number of two points is 2 (2 connected components) and 0 in higher dimensions. On the other hand, if X is a 2-dimensional sphere, then X has a 2-dimensional hole, represented by the void internal to the sphere. We get β0(X) = β2(X) = 1 and βn(X) = 0 otherwise. Observe that β1(X) is also 0 as every loop on the sphere can be continuously shrunk to one point.
One of the most useful theoretical properties of homology is given by its functoriality: for every simplicial map f : X → Y of simplicial complexes and for every k in N, we also get a linear map Hk(f) : Hk(X) → Hk(Y): satisfying some additional compatibility properties; we avoid to spell them out here and we refer to Hatcher (2000) for the details. This turns out to be of great importance when dealing with more than one single simplicial complex.
As we have seen, the k-th homology group Hk(Σ) is associated to a single simplicial complex Σ. Likewise, persistent homology groups are defined for a filtration of simplicial complexes, as we now explain. By the functoriality described in Eq. A.1, if ℱ: ∅ ⊆ Σ0 ⊆ …⊆ ΣN = Σ is a filtration of simplicial complexes (see Def. A.4), then, for every k, we get a sequence 0 → Hk(Σ0) → …→ Hk(ΣN) = Hk(Σ) of homology groups and morphisms between them. Let k be fixed; then, for every pair of indices i and j between 0 and N, the composition of maps of simplicial complexes Σi → …→ Σj induces a map between the homology groups of Σi and Σj.
Edelsbrunner et al. (2002) For i < j, the k-th persistent homology group of the filtration F : ∅ ⊆ Σ0 ⊆i,j…⊆ ΣN = Σ is the image of the induced map . The k-th persistent Betti numbers are the ranks of the persistent homology groups .
The intuition behind the definition of persistent homology groups is that homology classes (the k-dimensional holes) can appear and disappear at different stages of the filtration steps, due to changes in the topology of the simplicial complexes across the filtration. A persistence homology class (i.e.,an element) of represents a k-dimensional hole of Σi surviving across all the filtration steps between the index i and index j. The filtration indices at which a k-dimensional hole first appears and first disappears are called its birth and death. Persistent homology makes (mathematically) clear the idea of appearing and disappearing of high dimensional holes; it keeps track of all these topological changes with the persistence Betti numbers.
We can visualize the information provided by persistent homology groups and persistent Betti numbers by constructing a planar diagram called Persistence Diagram (PD), stably with respect to the input data Cohen-Steiner et al. (2007). A persistent homology class born at time b and that died at time d is represented in the diagram by the pair (b, d) (see Fig. A.8). Classes that persist until the end are usually depicted in the persistent diagram as dying at infinity (or at any other index bigger than the maximum filtration one). Note that there are no points below the diagonal because the death value of a hole is always bigger than its birth value.
In a persistence diagram it is then visually clear the measure of how long a homology class persists: the farther from the diagonal, the higher the persistence. In the standard interpretations, short-living classes (so, with representing dots close to the diagonal) are interpreted as noise, and only long-persistence classes represent important features of the filtration.
B. EEG data: Subject selection
In our analysis of epileptic brain states we have used the data available at https://physionet.org/content/chbmit/1.0.0/, from the publicly open dataset Physionet. We have analysed EEG recordings belonging to 18 selected subjects: chb01-05, chb07, chb09, chb10, chb12, chb14, chb17-24 except chb18 01, chb19 01.
For the variation of the main pipeline, we have used the following files: chb01 03, chb02 16+, chb03 01, chb04 05, chb05 06, chb07 12, chb09 06, chb10 12, chb12 06, chb14 11, chb17a 03, chb18 29, chb19 28, chb20 13, chb21 19, chb22 20, chb23 06, chb24 04. For those, we always use the first seizure (the first 30 seconds of it) and two 30-second segments preceding it: [-3m:-2m30sec] for baseline connectivity, and [-1min:-30sec] for preictal connectivity estimation. Detailed procedure is described in Sec. 3.1.2, in Intracranial EEG data description.
C. EEG data: single subject analysis
In this supplementary, we assess the efficiency of PH and DPH on single subjects brain state alteration classification, in contrast to the group analysis reported in the main paper in Sec. 4.2.1.
In the single-subject analysis, we consider the data collected from the patients CHB06, CHB15 and CHB24 (from the same database, we have chosen the patients with higher number of seizures).
For every patient, we consider segments with epileptic seizures and interictal data of corresponding length: 10 ictal plus 10 interictal recordings for the patient CHB06, 20 plus 20 for the patient CHB15 and 16 plus 10 for the patient CHB24. We then apply the same pipeline as in the group analysis.
From the accuracy results, shown in Table C.5, we observe that the classification performances associated to single patients, based on topological features out of the directed and undirected approach, achieve better results (on average about 0.75, 0.60 and 0.70 for CHB06, CHB15 and CHB24, respectively) than in the inter-subject analysis. Topological features may not achieve great performances in heterogenous group-wise classification settings (compare the results with those in Table 2), but they are more informative when applied to more specific datasets.
D. Supplementary results
In a parallel exploratory analysis to that presented in the main text, we have applied small variations of the pipeline described in Sec. 3 and/or computed different features. In this supplementary we show the results obtained following the main pipeline applied on the fMRI and EEG datasets, using either kernel methods (see App. D.1), the full matrices (as apposed to the masked ones, see App. D.2, or the 0-dimensional features (as opposed to the 1-dimensional features, see App. D.3).
D.1 Kernel methods
Our main analysis has been based on Persistence Images, Persistence Landscapes, Persistence Images, Carlsson Coordinates and Persistent Entropy, but many other topological features to analyse datasets are available. Among others, kernel-based methods provide some flexible choices. As there is not a standard persistence kernel approach, we have decided to analyse this feature separately. We follow the kernel-based method introduced by Rein-inghaus et al. (2015) and apply the SVM classification on the (masked) FC and EC networks, following the pipeline in Sec. 3 . We report the results in Table D.6, from which we can see that they are comparable to those achieved by the other features reported in the main text of the work.
D.2 Full versus masked connectivity networks
An additional analysis has been driven by the question whether classification tasks, based on masked matrices associated to (un-)directed networks (see Sec. 3.2), achieve different results when applied to the full matrices. We compare in Table D.7 full and masked-based classification results; the results suggest that the masking does not have a systematic detrimental effect on the analysis outcomes.
D.3 0-Dimensional features
In the last analysis carried out on both EEG and fMRI data, we focus on 0-dimensional topological features. We use the features Carlsson Co-ordinates (CC), Persistent Entropy (PE) and the kernel feature of Reininghaus et al. (2015). Additionally, we investigate the 0-dimensional features Area Under the Curve (AUC) Gracia-Tabuenca et al. (2019) and Integrated Persistence Feature (IPF) Kuang et al. (2019a), as both have shown to provide good differentiations when applied to attention-deficit/hyperactivity disorder (ADHD) and Alzheimer’s disease, respectively. We show the results in Table D.8.
E. Variation on EEG and iEEG dataset: the complete report
We report here the full analysis based on the variation of the pipeline (as described in Sec. 3.6) in the case of the EEG/iEEG recordings. In the results of Sec. 4.2.2 and Sec. 4.3, we have only shown the classification results based on the masked FC and on the masked EC matrices (see Sec. 3.2). For the sake of completeness, we show below in Tab. E.9 and Tab. E.10 the results achieved by the topological features PL, PI, PE, CC (see Sec. 3.5), kernel (introduced in App. D.1), and by the features AUC and IPF (as in the previous section), in both dimension 0 and dimension 1.
Acknowledgment
The authors thank Jakub Kopal for sharing the preprocessed fMRI time series and Barbora Bučková for sharing scripts for classification pipeline.
Footnotes
✩ The study was supported by the long-term strategic development financing of the Institute of Computer Science (RVO:67985807) of the Czech Academy of Sciences, by project Nr. LO1611 with a financial support from the MEYS under the NPU I program, the Czech Health Research Council Project No. NV17-28427A and the Czech Science Foundation project No. 19-11753S.
Email addresses: caputi{at}cs.cas.cz (Luigi Caputi), pidnebesna{at}cs.cas.cz (Anna Pidnebesna)