## Abstract

Neuroimaging techniques are increasingly used to study brain cognition in humans. Beyond their individual activation, the functional associations between brain areas have become a standard proxy to describe how information is distributed across the brain network. Among the many analysis tools available, dynamic models of brain activity have been developed to overcome the limitations of original connectivity measures such as functional connectivity. In particular, much effort has been devoted to the assessment of directional interactions between brain areas from their observed activity. This opinion article summarizes our recent approach to analyzing fMRI data based on whole-brain effective connectivity, which relies on the multivariate Ornstein-Uhlenbeck (MOU) process and is thus referred to as MOU-EC. Along the way, we discuss the pros and cons of its underlying assumptions with respect to other established approaches. Once tuned, the model provides a connectivity measure that reflects the dynamical state of the BOLD activity obtained using fMRI, which can be used to explore brain cognition. We focus on two important applications. First, as a connectivity measure, MOU-EC can be used to extract biomarkers for task-specific brain coordination, understood as the patterns of areas exchanging information. The multivariate nature of connectivity measures raises several challenges for whole-brain analysis, for which machine-learning tools present some advantages over statistical testing. Second, we show how to interpret changes in MOU-EC connections in a collective and model-based manner, bridging with network analysis. To illustrate our framework, we use a dataset where subjects were recorded in two conditions, watching a movie and a black screen (referred to as rest). Our framework provides a comprehensive set of tools that open exciting perspectives for the study of distributed cognition, as well as neuropathologies.

## 1 Connectivity measures for neuroimaging data

The study of brain cognition has flourished in recent decades due to the abundance of neuroimaging data that give access to neural activity in human subjects. Among non-invasive techniques, functional magnetic resonance imaging (fMRI) has become a tool of choice to investigate how brain activity is shaped when performing tasks [108, 97, 74, 31]. The blood-oxygen-level dependent (BOLD) signals recorded in fMRI measure the energy consumption of brain cells, reflecting modulations in neural activity [9, 48, 10, 105]. Since early fMRI analyses, a main focus has been on identifying with high spatial precision regions of interest (ROIs) in the brain that significantly activate or deactivate for specific tasks [32, 96, 151]. Because the measure of BOLD activity during tasks requires the quantification of a baseline, the brain activity of idle subjects became an object of study and developed into a line of research in its own right [17, 135]. This revealed stereotypical patterns of correlated activity between brain regions, leading to the definition of functional connectivity (FC) [22, 64]. Together with studies of the anatomical connectivity using structural MRI [138, 80], fMRI studies progressively shifted from focusing on specific ROIs to whole-brain analyses [39]. For example, high-order cognition involves changes in BOLD activity that are distributed across the whole brain [120, 29], which cannot be restrained to a small set of preselected ROIs. The typical description of the whole-brain activity in these methods is a ROI-ROI matrix, which we refer to as a connectivity measure.

Recently, fMRI studies for both rest and tasks have also evolved to incorporate the temporal structure of BOLD signals in their analysis [83, 90, 30, 107, 69, 74, 25, 150], in addition to the spatial structure. Models have also been developed to formalize the link between the observed BOLD and the neuronal activity [61, 97, 41, 102, 69, 56, 150]. This has led to many definitions for connectivity measures and variations thereof. In this introductory section we present fundamental concepts about connectivity measures that set our model-based approach in context (see also the Glossary for definitions).

### 1.1 Model-based versus model-free approaches

A recurrent discussion opposes model-free and model-based approaches in neuroscience, which also applies to connectivity measures. For instance, FC calculated using the Pearson correlation between time series (blue matrix in Fig. 1A) can be directly computed from data —see Eq. (2) in Appendix D— and may thus be seen as model-free. However, the interpretation of the FC values for the pairs of ROIs in the matrix is based on the theory of linear Gaussian models (LGMs), which assumes multivariate normal distributions for the data. In this view, the Pearson correlation is a connectivity measure that can equally be calculated on the activity generated by the model. The underlying connectivity in the LGM, called the precision matrix, is closely related to the partial correlations (PC) represented by the green matrix in Fig. 1A. The symmetric PC matrix describes the undirected linear interactions between Gaussian variables that result in the observed correlations. Our viewpoint is thus that connectivity measures are always related to a model, so it is crucial to bear in mind their assumptions when interpreting their values. Note that other classifications of methods have been proposed and discussed in previous work, like directed functional connectivity measures versus effective connectivity measures [60], state-space models versus structural models for BOLD dynamics [143] and network-wise methods versus pairwise inference methods [16].
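The relation between the two measures can be illustrated numerically. Below is a minimal sketch in Python (the synthetic data and variable names are ours, for illustration only): the Pearson correlations are computed directly from the time series, while the partial correlations are obtained from the precision matrix, i.e. the inverse covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 10000  # number of ROIs and time points

# synthetic multivariate Gaussian "BOLD" samples (i.i.d. in time, as in an LGM)
A = rng.normal(size=(n, n))
cov_true = A @ A.T + n * np.eye(n)  # a valid covariance matrix
X = rng.multivariate_normal(np.zeros(n), cov_true, size=T)

# FC as the Pearson correlation matrix
FC = np.corrcoef(X.T)

# partial correlations from the precision matrix (inverse covariance)
P = np.linalg.inv(np.cov(X.T))
d = np.sqrt(np.diag(P))
PC = -P / np.outer(d, d)
np.fill_diagonal(PC, 1.0)
```

Both matrices are symmetric, reflecting the undirected nature of the interactions in the LGM.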

In the literature of whole-brain modeling, three families of connectivity measures have emerged:

Structural connectivity (SC): It measures the density or probability of anatomical pathways that connect two ROIs, mostly via the white matter [128]. This led to the definition of the human connectome at the whole-brain level [138, 80].

Functional connectivity (FC): Pairwise statistical dependencies between the observed activity of ROIs [22, 64]. Apart from the Pearson correlation of the BOLD signals [17, 135], other common FC measures include mutual information [86] and the synchrony of instantaneous phases [24, 25]. Conceptually, FC corresponds to a measure that can be applied to multiple time series, either the data or the model activity.

Effective connectivity (EC): In this article we define EC as a measure of the directional interactions in a dynamic model. The original concept comes from electrophysiology [3], where EC determines how the stimulation of a neuron affects a target neuron (e.g. combining the source-to-target synaptic weight and the excitability of the target). It was then brought to neuroimaging in the 2000s when building models of the BOLD response and further developed in the context of the dynamic causal model (DCM), usually interpreted as neuronal coupling [61, 58, 143]. Note that it is also still used close to its original formulation when measuring stimulation-driven responses in neuroimaging [95].

To go beyond statistical relationships, the combination of several types of data requires a model, as with the SC and FC that are combined in the EC model (see Fig. 1A). Consequently, the choice of the model ingredients, especially their dynamics, has important implications that we detail in the following.

#### Glossary

Generative model: Model of (dynamic) equations that generates a signal to be fit to empirical data. This definition is different from the definition in statistics where a generative model describes the joint probability distribution of observed and predicted variables (here the network activity), as opposed to a discriminative model that describes the conditional probability of the predicted variables with respect to the observed variables.

Linear Gaussian model (LGM): Generative model of Gaussian variables with linear interactions. Its output variables have a flat autocovariance (apart from zero time lag) and it is used to model noisy data without temporal structure.

Multivariate Ornstein-Uhlenbeck (MOU) process: Dynamic generative model in continuous time with linear interactions. Its output variables have both spatial and temporal correlations.

Multivariate autoregressive (MAR) process: Dynamic generative model in discrete time with linear interactions. It is the equivalent of the MOU process in discrete time.

Lyapunov optimization or natural gradient descent: Tuning procedure for the EC weights in the MOU network that fits the model FC covariance matrices to their empirical counterparts.

Classification pipeline: Succession of machine-learning algorithms that aims to learn the mapping from input vectors of ‘features’ to output ‘labels’ (or categories). Here we use neuroimaging connectivity measures to predict cognitive conditions (like the task performed by a subject).

Multinomial logistic regression (MLR): Linear multicategory classifier that assigns a coefficient to each input feature to predict the labels. It reduces to logistic regression for binary classification.

k-nearest neighbor (kNN): Non-linear classifier that predicts the category of each new sample based on the most represented category over the k closest samples from the train set, given a distance metric or similarity measure.

Biomarker: Subset of observed features (here EC/FC links) that enable a robust classification, often paired with weights as in the case of a linear classifier.

Dynamic communicability: Measure of pairwise interactions between nodes (ROIs) in a network that takes indirect paths into account. In the present article, it corresponds to interactions over time for the MOU model.

Dynamic flow: Extension of dynamic communicability that incorporates the effect of input properties in the MOU model.

### 1.2 Choice of model and interpretability

In our framework we borrow the EC terminology, which can be understood in a broad sense as the directed connectivity in a generative model, here for BOLD activity. When using the LGM related to the Pearson correlations taken as FC, one can take the partial correlations (PC, green matrix in Fig. 1A) as “LGM-EC”. As illustrated by the flat green autocorrelogram for non-zero time lags in Fig. 1B, the signals generated by the linear Gaussian model are independent and identically distributed (i.i.d.) variables and do not have temporal correlations. In contrast, the multivariate autoregressive process (MAR, in orange in Fig. 1A-B) has a directed connectivity (asymmetric matrix) and produces temporally correlated signals. When these models are fitted to data, they do not capture the same part of the data structure (here spatial versus spatio-temporal). When the MAR model is considered, the estimation results in directed connectivity that depends on the spatio-temporal correlation structure of the observed activity (or spatio-temporal FC). When fitted to the spatial FC, the linear Gaussian model does not “naturally” give directed connectivity, although techniques based on optimization with regularization have been developed for that purpose [127].

EC usually describes causal and directional interactions that interplay with other dynamic parameters in the model to determine the global network pattern of activity. When optimizing a model to reproduce BOLD signals, the parameters can be seen as representations or “projections” of the BOLD signals, in a top-down or data-driven approach [143]. Importantly, the estimated EC depends on the choice of the model dynamics. For example, the DCM was developed to formalize the link between neural and BOLD activities by explicitly modeling the hemodynamics [61, 139, 58]. The DCM-EC is thus the directed connectivity between the ROIs that determines their activities, from which the BOLD signals are generated via a hemodynamic response function.

We keep in mind that all interpretations of fMRI signals rest on the hypothesis that correlated BOLD activity between brain areas reflects the activity of their neuronal populations [73, 36, 99, 84], which in turn mediates the transmission of neuronal information [57]. However, many metabolic mechanisms like breathing (which are usually ignored in whole-brain models) alter the BOLD signals [115], and the adequate procedure to minimize the corruption of data and obtain a satisfactory signal-to-noise ratio is still under debate [114].

A distinct line of research [24, 41, 124, 117] has focused on the development of whole-brain models in a more bottom-up fashion (model-driven), combining various datasets and biologically-inspired mechanisms, such as oscillators to produce rhythms observed in the brain. Such approaches allow for the study of the influence of specific parameters (as ingredients in the model) in collectively shaping the global network activity. Compared to connectivity estimation, a major distinction of many of those models is that SC is taken as the intracortical connectivity to focus on the influence of the local dynamics in shaping network activity patterns; see [102] for a review.

### 1.3 Consistency of pipeline from preprocessed data to modeling and analysis

The incentive for novelty has fostered the development of many methods in computational neuroscience. This makes results difficult to compare, especially for resting-state fMRI studies where no ground truth is available. Another caveat concerns network analysis when it is applied to connectivity estimates from time-varying signals (e.g. BOLD) with metrics (e.g. to detect communities) that do not genuinely relate to the physical values of the data. Our practical answer to this point is thus to develop a set of analysis tools for the same dynamic model, ranging from parameter inference [69] to machine learning and network analysis. Although the mathematical tractability of the model limits the richness of its dynamic repertoire compared to more elaborate bottom-up models [41, 124], it provides an intuitive understanding of the roles of parameters as well as analytical derivations for the parameter estimation and model interpretation. We believe that such an effort toward formalization and consistency is beneficial to the field, both for didactic purposes and for designing tailored methodology for complex models.

The present article follows the pipeline displayed in Fig. 2A. The second section presents the dynamic model and our recent optimization procedure [69, 66] to estimate MOU-EC for whole-brain fMRI data (Fig. 2B). It also discusses technical aspects in relation with other models used with neuroimaging data. The third section shows how connectivity measures can be used to predict cognitive states (Fig. 2C) and uncover their underlying structure. Importantly, the use of datasets with multiple tasks allows for an in-depth benchmarking of the connectivity measures. It also highlights some advantages of machine learning over statistical testing in making use of the multivariate nature of connectivity measures, transposing and abstracting concepts presented in a recent study for subject identification [111]. The fourth section bridges with network analysis [68, 67], interpreting changes in MOU-EC connections in a collective and model-based manner. In particular, it adapts community detection to our model (Fig. 2D).

## 2 Capturing the whole-brain BOLD dynamics with MOU-EC

Using a generative model to reproduce the BOLD statistics, the optimized MOU-EC can be seen as an estimate that extracts spatio-temporal information about the BOLD dynamics. In fact, the main innovation of the modeling is the optimization procedure in Fig. 2B [69]. This section reviews important points about the underlying multivariate Ornstein-Uhlenbeck (MOU) dynamic model and its tuning. Further mathematical details can be found in the Annex (see Fig. S1). Our approach aims to combine several key aspects:

- **Whole-brain** connectivity estimates are necessary to properly take into account the experimentally observed distributed information across distant regions [39, 29], without a-priori selection of ROIs.

- From the overall **spatio-temporal structure** of BOLD signals, the dynamic model only reproduces their covariances (without and with time lag, see Fig. 2B). This concerns the same fast timescale as recent studies [106, 107], which also demonstrated the influence of behavioral states such as sleep versus wake. The choice of BOLD covariances to be reproduced by the model is supported by previous results showing that most of the information was in the second-order statistics of BOLD signals [86].

- **Causality** is inherent to the concept of EC [61, 58, 143] and is represented by the directed connections in the dynamic model (red arrows in Fig. 2B). From the estimated MOU-EC that best reproduces the data, we interpret strong MOU-EC weights as causal relationships and can examine their asymmetry to evaluate, for a pair of ROIs, which one drives the other. The optimization can deal with common inputs to ROIs, explaining the observed correlated activity by an interplay between the directed connectivity and correlated inputs [66].

- The MOU-EC **topology** is the inter-regional infrastructure, namely which connections exist and which do not in the brain network. When SC is available, its binarization (black matrix in Fig. 1A) can be used to constrain the MOU-EC topology in order to reduce the number of parameters to estimate. This forces the model to “explain” changes in the FC by existing connections only, in contrast to PC and MAR (Fig. 1A). It is thus important to remember that MOU-EC estimates are not related to SC values.

- The optimization procedure tunes all MOU-EC weights while taking into account network effects. For each fMRI session, we obtain a **multivariate** estimate of more than 1000 parameters that represent the dynamical “state” of the brain activity. This contrasts with previous models that used the symmetric SC as connectivity matrix in dynamic models and focused on choosing or tuning the nodal dynamics with a few parameters only [41, 102, 124].

- A limitation of the model estimation procedure up to now is the assumption of stationarity of the BOLD signals over each fMRI session, which limits our approach to ongoing non-switching tasks.

Another limitation of MOU-EC for the interpretation in terms of neuronal coupling is the absence of explicit hemodynamics in the model. This choice comes from the priority given so far to the estimation robustness (with simpler dynamics) over the biological interpretability, as will be discussed later.

### 2.1 Multivariate Ornstein-Uhlenbeck dynamics as generative model for BOLD signals

Formally, our model-based analysis relies on the multivariate Ornstein-Uhlenbeck (MOU) process described by Eq. (3) in Appendix C. It corresponds to a network with linear feedback and is the continuous-time equivalent of the discrete-time multivariate autoregressive (MAR) process. Such dynamic systems with linear feedback have been widely used in neuroscience to model the propagation of fluctuating activity, mixing “noise” and structure, for example in modeling single neuronal dynamics [23], relating the connectivity and activity in a network [63] and defining neural complexity [142, 7]. The MOU process also corresponds to the linearization of nonlinear population models like Wilson-Cowan neurons [153].

The choice of the MOU dynamics is motivated by the balance between simplicity, which ensures tractable analytical calculations, and the richness of the generated activity when modulating the parameters (especially the MOU-EC weights). It is thus well adapted to whole-brain data with parcellations involving many ROIs (≥ 100). In addition, the MOU dynamics implies exponentially decaying autocovariances in the model, which have similar profiles to the empirical data (see the left plot in Fig. 1B, corresponding to straight lines in the log-plot in Fig. S1).

Intuitively, the model can be understood as a network where fluctuating activity (akin to noise) is generated at each ROI and propagates via the recurrent EC. In other words, MOU-EC (red matrix in the bottom box) acts as a “transition matrix” and quantifies the propagation of fluctuating BOLD activity across ROIs. The MOU-EC matrix is usually sparse when its topology is constrained by SC (black matrix) to match anatomical white-matter connections between ROIs. The fluctuating activity of all ROIs is described by their (co)variance matrix Σ, which is diagonal in the present case (see the purple vector of variances). In a previous study, cross-correlations in Σ were also used to model common inputs to homotopic sensory ROIs [66].
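As a sketch of this intuition, the MOU dynamics can be simulated with an Euler-Maruyama scheme; the EC matrix and noise amplitude below are illustrative, not estimates from data:

```python
import numpy as np

rng = np.random.default_rng(2)
n, tau, dt, T = 4, 1.0, 0.05, 100000

# hypothetical sparse MOU-EC matrix (directed, non-negative weights)
C = np.zeros((n, n))
C[0, 1], C[1, 2], C[2, 3], C[3, 0] = 0.4, 0.3, 0.3, 0.2

sigma = 0.5                 # amplitude of the input noise (diagonal Sigma)
J = -np.eye(n) / tau + C    # Jacobian of the linear dynamics

# Euler-Maruyama integration: fluctuations propagate via the EC weights
x = np.zeros(n)
samples = np.empty((T, n))
for t in range(T):
    x = x + dt * (J @ x) + np.sqrt(dt) * sigma * rng.normal(size=n)
    samples[t] = x

FC0 = np.cov(samples[T // 10:].T)  # zero-lag covariance after the transient
```

The simulated covariance `FC0` reflects both the direct connections and the network effects along indirect pathways.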

### 2.2 Parameter estimation capturing network effects in BOLD propagation

To capture the BOLD dynamics (i.e. the propagation of fluctuating activity), we use the two BOLD covariance matrices FC0 and FC1 in Fig. 2B, without and with time lag, respectively. This also ensures a one-to-one mapping between an FC configuration (a pair of FC0 and FC1) and a MOU-EC configuration. This choice is in line with previous adjustments of the DCM to model the resting state that relied on the cross-spectrum, namely the Fourier transform of covariances with all possible time lags [58]. It is richer than fitting only an FC matrix without time lag [41, 102] and complies with a recent study of the task-dependent modification of the BOLD temporal structure at short time scales [107].
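As an illustration, the two empirical matrices can be computed from the BOLD time series as follows (a sketch assuming a time × ROI array; the normalization conventions may differ from those in the appendices):

```python
import numpy as np

def lagged_covariances(ts):
    """Zero-lag (FC0) and one-TR-lag (FC1) covariance matrices
    from a (time x ROI) array of BOLD signals."""
    ts = ts - ts.mean(axis=0)           # center each ROI time series
    T = ts.shape[0]
    FC0 = ts.T @ ts / (T - 1)           # FC0[i, j] = cov(x_i[t], x_j[t])
    FC1 = ts[:-1].T @ ts[1:] / (T - 2)  # FC1[i, j] = cov(x_i[t], x_j[t+1])
    return FC0, FC1

# usage with surrogate data (300 time points, 66 ROIs as in the article)
rng = np.random.default_rng(3)
ts = rng.normal(size=(300, 66))
FC0, FC1 = lagged_covariances(ts)
```

Note that FC0 is symmetric while FC1 is generally not, which is what carries the directional information used by the optimization.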

The MOU-EC estimate is obtained for the minimum model error in reproducing the empirical FC0 and FC1 of each session (see the fit plot between the boxes in Fig. S1). The optimization is akin to a ‘natural’ gradient descent [4] in that it takes into account the nonlinearity of the mapping from MOU-EC to the covariances FC0 and FC1. This nonlinearity arises because of the network feedback (even though the dynamics are linear) and may result in strong correlations (in FC) for disconnected ROIs (EC = 0) provided strong indirect pathways connect them (via other ROIs). Appendix D provides the mathematical details of the optimization, which applies the gradient descent to both MOU-EC and Σ [66], extending the initial formulation with a heuristic optimization for Σ [69].
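For the linear MOU dynamics, the model counterparts of FC0 and FC1 can be computed analytically, which is what makes the gradient descent tractable. The sketch below uses SciPy to solve the continuous-time Lyapunov equation for a hypothetical EC matrix (the notation follows the article, but the exact equations used in the optimization are in the appendices):

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

rng = np.random.default_rng(4)
n, tau = 6, 1.0

# hypothetical EC matrix, weak enough for the dynamics to be stable
C = 0.1 * rng.random((n, n))
np.fill_diagonal(C, 0.0)
J = -np.eye(n) / tau + C              # Jacobian of the MOU dynamics
Sigma = np.diag(rng.random(n) + 0.5)  # diagonal input (co)variance matrix

# model zero-lag covariance Q0, solving J Q0 + Q0 J^T + Sigma = 0
Q0 = solve_continuous_lyapunov(J, -Sigma)
# model lagged covariance for a lag of 1 time unit (1 TR)
Q1 = Q0 @ expm(J.T)
```

In the optimization, such model covariances are compared to the empirical FC0 and FC1 and the error determines the updates of the MOU-EC weights and Σ.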

For illustration purposes, this article uses a whole-brain parcellation consisting of 66 ROIs. The SC density is 28%, giving 1180 MOU-EC weights to estimate (see Appendix A). Each fMRI session has 300 time points separated by TR = 2 seconds, so the number of data points is 66 × 300 ≃ 2 × 10^4, about 16 times larger than the number of model parameters. MOU-EC should act as a summary of the BOLD signals: informative (not too short), while extracting and compressing the relevant information (not too long). Our method was also successfully applied to the AAL parcellation with 116 ROIs and sessions of 5 minutes [111]. Typically, the MOU-EC estimation for a session and about 100 ROIs takes less than a minute of computation time on a desktop computer. For more refined parcellations or shorter sessions, the FC matrices may become quasi-singular and the model estimates are expected to be noisier. The reduction of the number of parameters to estimate by using SC is crucial to work at the level of individual fMRI sessions and avoid overfitting. Here, overfitting would correspond to the situation where many distinct MOU-EC configurations in the model give very similar pairs of FC0 and FC1 matrices, resulting in a larger overlap for MOU-EC than in Fig. 2C while keeping the same small overlap for FC. The comparison between FC and MOU-EC as multivariate representations of cognitive states will be the focus of the next section. An important conceptual difference between MOU-EC and SC is that MOU-EC accounts for the dynamical properties of the brain activity, which are modulated when engaging a task. In other words, MOU-EC is hypothesized to account for the concentration of receptors or neurotransmitters, local excitability, etc., not only the density of the synaptic fibers.

### 2.3 Comparison with other approaches to extract information from BOLD signals

Other approaches have been proposed to characterize activity “states” based on the temporal structure of BOLD signals. For example, ‘dynamic FC’ relies on sliding windows of several tens of TRs [90, 112, 116, 74], thereby focusing on changes in correlation patterns over minutes. Shorter timescales have been explored using instantaneous phases obtained via the Hilbert transform of the BOLD signals [25] or hidden Markov models (HMMs) [150, 18]. In contrast, MOU-EC describes the BOLD propagation averaged over a session while assuming stationarity, as captured in the corresponding statistics (covariances without lag and with a lag of 1 TR); note that the “transition matrix” analogy for EC is at the level of the BOLD activity, not of hidden states as in the case of HMMs. Moreover, it does not involve a dynamic modulation of EC interactions as used in the DCM [58, 97, 112].
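For comparison, the sliding-window ‘dynamic FC’ mentioned above can be sketched as follows (the window and step sizes are illustrative):

```python
import numpy as np

def sliding_window_fc(ts, width, step):
    """Sequence of correlation matrices over sliding windows
    (ts: time x ROI array; width and step in TRs)."""
    windows = range(0, ts.shape[0] - width + 1, step)
    return np.array([np.corrcoef(ts[t:t + width].T) for t in windows])

# usage with surrogate data (300 TRs, 10 ROIs)
rng = np.random.default_rng(5)
ts = rng.normal(size=(300, 10))
dFC = sliding_window_fc(ts, width=60, step=10)  # 60-TR windows, 10-TR step
```

Each entry of `dFC` is one FC snapshot, so the method tracks slow changes of correlation patterns rather than the propagation structure captured by MOU-EC.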

A key innovation to tackle whole-brain fMRI data is the optimization constrained by SC that determines the network topology. This means the model has to explain the spatio-temporal FC structure using the existing connections only. The “prior” information related to SC avoids the preselection of ROIs and can be seen as an alternative to model comparison in choosing the best topology using structural equations [101, 91], Granger causality analysis [72] and early versions of the DCM [61, 143] —note that a recent DCM study incorporates SC for the network topology [136]. Likewise, the MOU-EC density usually depends on the threshold applied to SC and the best choice can be decided using model comparison, although the formulation may not be as natural as in the Bayesian framework.

Several fundamental properties of the defining concepts of EC and the DCM were discussed a few years ago [143]. Beyond technical details, the three main points are that DCM-EC corresponds to the connectivity weights in a dynamic model, that the model incorporates the hemodynamic response function (HRF) and that the estimation captures the BOLD dynamics, including the subsampling related to the low time resolution of BOLD signals. The EC terminology was borrowed from the DCM literature [58] because of the model-based aspect. The MOU-EC estimation was developed to solve the trade-off between robust estimation and application to large brain networks (70+ ROIs) by using linear dynamics [69]. Since then, the DCM has also been applied to whole-brain fMRI data [119, 56].

The FC0 and FC1 matrices in Fig. 2B embody the spatio-temporal BOLD structure in the range of “high” frequencies close to the Nyquist frequency (half of the sampling rate set by the TR). The recent extension of the DCM to analyze resting-state fMRI data reproduces the BOLD statistics via the cross-spectrum [62, 119, 56], which is in line with our approach. Recall that this contrasts with earlier versions of the DCM that reproduced the BOLD time series themselves for stimulation protocols [61, 97]. Because it works in continuous time, the MOU model deals with the BOLD subsampling [69], unlike estimation methods relying on the discrete-time multivariate autoregressive process, which may be sensitive to the subsampling of the BOLD signals [132]. Moreover, the observed lags between ROIs in FC —similar to cross-correlograms [106, 107]— are explained by the combination of the estimated inputs Σ and MOU-EC [65].

The time constant *τ _{x}* in the MOU model is identical for all ROIs in Eq. (3) in Appendix C, which corresponds to an abstraction of the HRF waveform response that was reported with a decay of 5 to 10 seconds [37]. An important difference of our model compared with the underlying dynamic model behind DCM [61, 143], as well as other bio-inspired network models [41], is the absence of an explicit HRF to link the neural activity to the measured BOLD signals [19, 59, 139]. Note that applications of the DCM to resting-state data involve a linearization of the HRF [119, 56]. Therefore, a foreseen extension of our approach is that of a state-space model with the MOU process generating the neuronal activity that is convolved with a linear HRF filtering [126]. The challenge is to keep the gradient-descent optimization tractable, as was previously done with Granger causality analysis [8, 52]. Incorporating the HRF may improve the reliability of the estimated EC [70, 110]. Note that an alternative consists in performing the deconvolution of the BOLD signals with respect to a HRF before analyzing the obtained neuronal signals, as was done with Granger causality analysis [36, 125, 152, 76] or other connectivity analysis [123].
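As a sketch of the foreseen state-space extension, the observation stage would convolve the neuronal activity generated by the MOU process with a linear HRF filter. The gamma-shaped kernel below is purely illustrative (its parameters are not taken from any cited study):

```python
import numpy as np

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Illustrative gamma-shaped hemodynamic response kernel."""
    h = (t / scale) ** (shape - 1) * np.exp(-t / scale)
    return h / h.sum()  # normalize the kernel to unit sum

dt = 0.5                       # time step in seconds
t_hrf = np.arange(0, 25, dt)   # kernel support of 25 seconds
hrf = gamma_hrf(t_hrf)

rng = np.random.default_rng(6)
neural = rng.normal(size=600)  # surrogate neuronal signal (e.g. MOU output)
bold = np.convolve(neural, hrf)[:neural.size]  # linear HRF filtering
```

In such a state-space model, the gradient descent would have to be propagated through this convolution, which is the tractability challenge mentioned above.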

## 3 Machine learning for extracting multivariate biomarker specific to cognitive states

Beyond their goodness of fit and however many “realistic” biological mechanisms they might incorporate, models can be used to extract information from data in a top-down approach. Once tuned, the model parameters can indeed be used to discriminate cognitive conditions, as a representation of the observed data (here the BOLD signals). This conceptually differs from finding the generative model of activity with the best fit. Nonetheless, it is expected that a model with a poor fit hardly extracts any relevant information. Many recent studies have used connectivity measures to predict which tasks are performed by subjects in the scanner [75], the pathological conditions of patients [94, 118] and individual identity [104, 54, 26, 5]. Machine learning is the standard for identifying biomarkers of neuropathologies [146] and is also widely used in voxel-based analysis for cognitive tasks [108, 151]. However, it is less frequently applied to cognition studies using connectivity measures [145]. For a given connectivity measure, a biomarker is a subset of links that enables the robust identification of a category of fMRI sessions, e.g. a weighted sum of the matrix elements that exceeds a threshold for the MLR. In practice, changes in BOLD activity across cognitive conditions are mixed with individual traits and session-to-session variability. Disentangling these contributions is the key to obtaining efficient biomarkers [111].
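As a sketch of biomarker extraction with an MLR-type classifier, the example below trains scikit-learn's logistic regression on surrogate connectivity vectors and ranks the links by the magnitude of their classifier coefficients (the data and the informative links are synthetic):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n_sessions, n_links = 40, 100                 # sessions x vectorized EC/FC links
labels = np.repeat([0, 1], n_sessions // 2)   # two cognitive conditions

# surrogate data: condition 1 shifts the first 10 links only
X = rng.normal(size=(n_sessions, n_links))
X[labels == 1, :10] += 3.0

clf = LogisticRegression(max_iter=1000).fit(X, labels)

# candidate biomarker: links with the largest absolute coefficients
ranking = np.argsort(-np.abs(clf.coef_[0]))
biomarker = ranking[:10]
```

Here the classifier coefficients concentrate on the artificially informative links, mimicking how a biomarker is read out from a trained linear classifier.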

The present section addresses the following questions:

How to cope with the multivariate nature of connectivity measures (e.g. > 1000 MOU-EC links)? We illustrate some advantages of machine learning compared to statistical testing, about multiple comparisons and beyond.

What does classification tell about the underlying model? Linear and nonlinear classifiers apply different metrics on the estimated parameters, revealing the distribution of task-specific changes across brain regions.

In a complex environment (e.g. many tasks), can we uncover a hierarchy in cognitive states (subgroups of tasks)?

Adopting the language from machine learning, ‘samples’ refer to fMRI sessions and the links of the connectivity measures are ‘features’ whose values are used to discriminate ‘labels’, which are the types of fMRI sessions here. To illustrate our framework, we use data that were previously analyzed [86, 98, 66, 44], whose details are summarized in Appendix A. Subjects in the scanner were recorded fixating a point on a black screen (two rest sessions R1 and R2) or watching a movie. The entire movie session was divided into three sessions (M1, M2 and M3), each corresponding to a different part of the movie. The vectorized MOU-EC estimates are displayed for all subjects and sessions in Fig. 3A. The question is then whether connectivity measures can capture information to identify the movie sessions individually. For example, the MOU-EC links indicated by the gray arrow exhibit strong changes between rest and movie (with small and large values, respectively).
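In practice, each session yields one sample: the estimated connectivity matrix is vectorized by keeping only the links that exist in the topology, and these values form the feature vector. A minimal sketch with a surrogate EC matrix and SC mask (both synthetic):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 66  # number of ROIs in the parcellation

# hypothetical binary SC mask (topology) and EC estimate for one session
sc_mask = rng.random((n, n)) < 0.28
np.fill_diagonal(sc_mask, False)   # no self-connections
ec = rng.random((n, n)) * sc_mask

# feature vector for this session: values of the existing links only
features = ec[sc_mask]
```

Stacking these vectors over sessions gives the samples-by-features matrix used by the classifiers, with one label per session.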

### 3.1 Variability of connectivity measures with respect to cognitive labels

Before classification, the data structure can be explored to evaluate the similarity or distance between the connectivity measures across conditions, as representations of the BOLD signals. For this type of (passive) task, the BOLD synchronization between brain areas (e.g. visual and auditory) during movie viewing, as reflected by inter-subject correlations [82], is expected to also be captured by the connectivity measures [66]. The gray arrow in Fig. 3A indicates some MOU-EC links with strong changes between rest and movie.

A usual technique for multivariate data is dimensionality reduction, for example using principal component analysis (PCA) [75] or independent component analysis (ICA) [27]. Here we compare two linear transformations, the unsupervised PCA in Fig. 3B and the supervised linear discriminant analysis (LDA) in Fig. 3C. The first principal component is the direction in the high-dimensional space with the largest variability across samples, and successive components are ranked in decreasing order of their corresponding variability. PCA is called unsupervised because it does not use the task labels to compute the components; accordingly, the first PCA component for FC does not reflect the separation between movie and rest (the second component does). In comparison, the components for LDA are ranked according to their variability *with respect to* the task labels.
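The contrast between the two transformations can be reproduced on synthetic data with scikit-learn. The toy data below (not from the fMRI dataset) are constructed so that the label information lies along a low-variance direction, which PCA's first component misses while LDA captures it:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 20))
X[:, 0] *= 10.0                  # dominant variance, unrelated to the labels
y = np.repeat([0, 1], 50)
X[y == 1, 1] += 1.5              # label information in a low-variance direction

pca = PCA(n_components=2).fit(X)                         # unsupervised
lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)  # supervised

z_pca = pca.transform(X)[:, 0]   # first principal component
z_lda = lda.transform(X)[:, 0]   # first (and only) discriminant component

def sep(z, y):
    # Separation of the two classes along axis z (difference of class means
    # in units of the overall standard deviation).
    return abs(z[y == 1].mean() - z[y == 0].mean()) / z.std()

print(sep(z_pca, y), sep(z_lda, y))
```

The first PCA component aligns with the high-variance (label-independent) feature, so the class separation along it is small, whereas the LDA component separates the classes well.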

The separation of the tasks as viewed through the connectivity measures can be evaluated using silhouette coefficients that measure the clustering quality [121], ranging from 1 for well-separated dot clouds to −1 for overlapping dot clouds. Applied to the LDA coordinates (Fig. 3D), we see a slow increase in the silhouette coefficient when incorporating more components to separate the 4 clouds. This reflects the difficulty of the 4-task discrimination, compared to the 2-task one for which a single component is sufficient [44]. The viewpoint here is that of clustering, assuming that the reduced space allows for isolating the clouds to characterize the cognitive states [75].
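A minimal sketch of the silhouette coefficient using `silhouette_score` from scikit-learn; the two-dimensional toy clouds below are placeholders for the LDA coordinates:

```python
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)

# Two well-separated toy clouds...
separated = rng.standard_normal((100, 2))
separated[labels == 1] += 8.0      # shift the second cloud far away

# ...versus two overlapping ones.
overlapping = rng.standard_normal((100, 2))
overlapping[labels == 1] += 0.3    # nearly coincident clouds

s_sep = silhouette_score(separated, labels)
s_ovl = silhouette_score(overlapping, labels)
print(round(s_sep, 2), round(s_ovl, 2))
```

The score approaches 1 for the separated clouds and stays near 0 for the overlapping ones, matching the interpretation used for Fig. 3D.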

### 3.2 Cross-validation for assessing the generalization capability of connectivity-based classification

Although machine-learning techniques are now increasingly used to analyze neuroimaging data, their generalization capabilities are not always properly verified [146]. For example, clustering algorithms applied to reduced-dimension components (see Fig. 3B-C) give a measure of the separation of the tasks as viewed through the connectivity measures (Fig. 3D). Notice, however, that all sessions are used for the dimensionality reduction in Fig. 3. This is a problem since the tuning of the model parameters can be influenced by the specific noise of the data samples and, in turn, the results will not generalize to new samples, a phenomenon called overfitting. To evaluate the robustness of a classifier, cross-validation is the standard procedure for voxel-wise studies of activation maps [108, 151] and for clinical applications [118, 87]. It consists in splitting the data samples into train and test sets that are respectively used to fit the classifier and to assess its performance, as described in Fig. 4A. One may thus have to include preprocessing steps in the cross-validation scheme to properly evaluate the generalization capability of the classification, for example with functional parcellations derived from the data, which should be calculated on the train set only [21]. For LDA in Fig. 3C, this corresponds to setting a linear boundary using the dimensionality reduction on the train set and evaluating the performance on the test set. The data splitting can be repeated several times in order to assess the impact of different train and test sets on the classification performance. The resulting accuracy distribution provides an estimate of the generalization capability of the classification scheme to new samples. Different splitting strategies have been proposed; here we use the one recommended for neuroimaging applications, based on an 80%-20% ratio for train and test sets, respectively [146].
We repeat the classification for 20 random splittings of the data. This is lower than the recommended 100 repetitions and is justified by the rather low number of 22 subjects in our dataset, as explained in the box below together with further technical considerations about choosing a cross-validation scheme.
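The 80%-20% shuffle-split procedure can be sketched with scikit-learn as follows; the data are a synthetic stand-in for the vectorized connectivity measures, and `LogisticRegression` plays the role of the MLR classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score

# Synthetic stand-in: 110 sessions x 50 features with a class difference
# confined to a few features.
rng = np.random.default_rng(1)
X = rng.standard_normal((110, 50))
y = np.repeat([0, 1], [44, 66])     # e.g. 2 rest + 3 movie sessions per subject
X[y == 1, :5] += 0.8                # task information in the first 5 features

# 20 random 80%-20% train/test splits, as in the main text.
cv = StratifiedShuffleSplit(n_splits=20, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000)   # stand-in for the MLR classifier
acc = cross_val_score(clf, X, y, cv=cv)   # one test accuracy per split
print(acc.mean().round(2), acc.std().round(2))
```

The distribution of the 20 test accuracies, rather than a single number, is what estimates the generalization capability.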

### Remark on cross-validation schemes:

The recommended scheme for neuroimaging applications uses a ratio of 80%-20% for train and test sets, respectively, chosen by random draws and repeated about 100 times [146]. While we agree that this is good general practice, we highlight that a slightly different scheme may be better suited when the number of available samples is rather low, as is often the case in cognitive studies like here with 22 subjects. Another issue concerns the independence (or lack thereof) of the measurements used as inputs for classification. In the example of Fig. 3, each subject is scanned in both resting-state and movie conditions, meaning that the fMRI measurements are paired. Random resampling based on fMRI sessions without separating subjects between train and test sets would then lead to more similar samples in the two sets than expected by chance, which would likely inflate the evaluated generalization performance. A better solution is to apply the cross-validation to ‘groups’ (subjects here) instead of single samples (fMRI sessions). While it is possible to repeat random resampling of the groups several times, the chance of drawing the exact same test set twice becomes non-negligible when the number of subjects is not sufficiently large. For our dataset, using 4 subjects for testing and the remaining 18 for training, there is an almost 50% chance of getting the same split twice over 100 repetitions, which would underestimate the performance variability. In such cases, a valid alternative is the leave-one-group-out strategy, using one subject for testing and the remaining subjects for training, yielding a distribution of 22 accuracies. This procedure also has the advantage of highlighting individual differences in the dataset, showing for example whether some subjects are easier or harder to predict.
With our data, we find that repeating 80%-20% shuffle splits 20 times and performing leave-one-group-out for each of the 22 subjects lead to very similar accuracy distributions.
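A sketch of the subject-aware scheme discussed in the box, using scikit-learn's `LeaveOneGroupOut` on synthetic paired data (the group labels encode a hypothetical subject identity):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Paired design: each of 22 subjects contributes 5 sessions, so train/test
# splits must keep all sessions of a given subject on the same side.
rng = np.random.default_rng(2)
n_subjects, n_sessions = 22, 5
groups = np.repeat(np.arange(n_subjects), n_sessions)
y = np.tile([0, 0, 1, 1, 1], n_subjects)      # 2 rest + 3 movie per subject
X = rng.standard_normal((n_subjects * n_sessions, 30))
X[y == 1, :3] += 1.0                          # synthetic task information

logo = LeaveOneGroupOut()                     # one subject held out per fold
acc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                      cv=logo, groups=groups)
print(len(acc))                               # one accuracy per subject
```

The per-subject accuracies directly expose individual differences, i.e. which subjects are easier or harder to predict.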

We consider task discrimination in two flavors: movie versus rest (2 tasks) in Fig. 4B and the movie sessions taken individually plus rest (4 tasks) in Fig. 4C. In each case, the performance evaluated on the test set should be considered relative to perfect accuracy and chance level (dashed line). As expected from Fig. 3, the 4-task discrimination is more difficult than the 2-task one: the performance relative to perfect accuracy decreases by about a half. Three interesting points come from the comparison of the performances between the connectivity measures.

First, the MLR and LDA perform much better than the 1NN for the 4-task discrimination, both with MOU-EC and FC; this does not change when using kNN instead of 1NN. This agrees with previous results on subject identification [111], which we transpose to the present dataset in Fig. 4E. For task identification, the Pearson-correlation similarity for FC or MOU-EC does not seem to be the best choice, especially when the environment is complex. A technical point here is that LDA takes a long time to train for a large number of features or many labels (subjects in Fig. 4E). Over all these results, the MLR seems the best practical choice for a classifier, as was also found for clinical data [34].

Second, and more importantly, the masked FC performs worse than FC for the 4 tasks in Fig. 4C, demonstrating that the SC mask (with 25% density here) cannot be directly used to preselect the most informative FC links that contribute to the good performance. In contrast, MOU-EC has the same number of features as the masked FC and performs as well as FC (up to a few %), even though the silhouette coefficients in Fig. 3D were clearly lower for MOU-EC than FC. Moreover, the MAR connectivity estimate, which extracts temporal information from BOLD signals as MOU-EC does, gives very poor accuracy, even for the 2-task classification. Partial correlations are somewhat in between. For subject identification, MOU-EC performs better than FC in Fig. 4E, in line with previous results [111]. Taken together, these results show that the dynamic model used for MOU-EC estimation is well suited to extract relevant information from the BOLD signals. In a way, MOU-EC can be seen as a more compressed version of the BOLD information than FC (i.e. with fewer parameters) without loss of information.

Last, we consider the symmetric connectivity obtained by the heuristic optimization in Appendix D.2, which tunes the MOU model to reproduce the zero-lag covariance FC. In Fig. 4G the goodness of fit is slightly worse for the heuristic method (larger model error as measured by the matrix distance) compared to EC. It is worth noting that the heuristic estimate leads to perfect accuracy for the “simple” rest-versus-movie classification (Fig. 4B). However, its accuracy decreases by at least 10% compared to MOU-EC for the 4-task and subject classifications (Fig. 4C-E). This shows that the choice of the connectivity estimation method (with the corresponding measure on the BOLD activity) is crucial to efficiently extract information from the BOLD signals.

### 3.3 Capturing the hierarchy of cognitive states

The high dimensionality of connectivity measures allows for representing complex environments, such as the 4 tasks considered here. Beyond their ability to classify, the important question is whether MOU-EC or FC can capture the structure of the task categories: rest on one hand and the group of three movie sessions on the other. A similar goal was sought using clustering on FC to identify subtypes of depression [46], although concerns have been raised about the reproducibility of the results [45]. Clustering corresponds to unsupervised learning (without cognitive labels) and seeks a structure in the data for a given metric, as with the first two components of PCA in Fig. 3B.

Instead, we rely on supervised learning and propose labels as an informed guess to a classifier, then assess whether they are meaningful using cross-validation. Beyond the performances of the 2-task and 4-task classifications in Fig. 4B-C, each classifier thus defines a similarity metric for the connectivity measures. This is reflected in the confusion matrices in Fig. 4D, where the classification errors (dark off-diagonal green pixels) indicate which cognitive states are difficult to separate. To the naked eye, the structure of the left confusion matrix for the MLR determines a hierarchy that corresponds to the desired one, similar to the clustering in Fig. 3B and C. When asked to discriminate between the two rest sessions, the classifier gives a negative answer. The three movie sessions are identifiable even though the MLR classifier makes errors between them. In contrast, the right confusion matrix for the 1NN does not provide useful information. Similar matrices without a clear hierarchy are obtained for FC (not shown). This suggests that the Pearson correlation used as a similarity measure in the 1NN is not well suited to describe the cognitive states with sufficient flexibility. In other words, it is less the global EC/FC profile than specific links that significantly vary across tasks, and these links should be the basis for building the task hierarchy. We do not enter into further details here, leaving a deeper comparison between unsupervised and supervised learning for estimating label hierarchies to future work. As a last comparison, the confusion matrix for subjects in Fig. 4F shows that subjects are well discriminated and that no particular structure in the errors is observed, indicating an absence of grouping among subjects.
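The hierarchical structure read off a confusion matrix can be illustrated on synthetic data. The four toy classes below (not the fMRI data) are constructed so that three of them, standing in for the movie sessions, are closer to each other than to the fourth (rest), concentrating the classification errors among them:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict, StratifiedKFold

rng = np.random.default_rng(3)
n = 60
y = np.repeat([0, 1, 2, 3], n)          # 0: 'rest', 1-3: 'movie' sessions
X = rng.standard_normal((4 * n, 20))
X[y == 0, 0] -= 3.0                     # 'rest' far from the movie classes
X[y == 1, 1] += 0.7                     # the movie classes differ only slightly
X[y == 2, 1] -= 0.7

pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                         cv=StratifiedKFold(5, shuffle=True, random_state=0))
cm = confusion_matrix(y, pred)

# Errors among the movie classes (rows/cols 1-3) versus errors involving rest:
movie_errors = cm[1:, 1:].sum() - np.trace(cm[1:, 1:])
rest_errors = cm[0, 1:].sum() + cm[1:, 0].sum()
print(movie_errors, rest_errors)
```

The off-diagonal errors cluster within the movie block while the rest class stays cleanly separated, which is the kind of structure that reveals a hierarchy of labels.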

### 3.4 Biomarker formed of informative MOU-EC links: machine learning versus statistical testing

A biomarker can be extracted for the 2-task and 4-task classifications in Fig. 4B-C by identifying the most informative links that contribute to the correct classification. To do so, we employ recursive feature elimination (RFE), which works efficiently with the MLR classifier [78]. It provides a ranking of the features in order of importance, of which we select the first ones that give maximal performance. A previous study compared these support networks of informative links for subjects and tasks to check whether the two classifications were biased by each other [111]. Here we make another comparison with the two support networks in Fig. 5A, movie versus rest (7 black and 3 gray links) and the 4 tasks (7 black and 21 red links). The 7 common links are also a hallmark of the hierarchy mentioned in the previous section, as the 4-task biomarker extends the 2-task one. Furthermore, the two RFE rankings are robustly aligned (Pearson correlation equal to 0.48 with p-value ≪ 10^{-10}). If we compare with the same results for the FC links in Fig. 5B, we observe less overlap between the two support networks. Importantly, 6 FC links out of the 16 for the 2-task classification do not appear in SC (15 out of 25 for the 4-task classification). Nonetheless, we find that the rankings for EC and FC are correlated.
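A minimal sketch of recursive feature elimination with scikit-learn's `RFE`; the synthetic data place the informative 'links' in the first five features, which the procedure should recover:

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Toy data where only the first 5 of 50 'links' carry task information.
rng = np.random.default_rng(4)
X = rng.standard_normal((120, 50))
y = np.repeat([0, 1], 60)
X[y == 1, :5] += 1.5

# RFE iteratively drops the feature with the smallest classifier weight,
# yielding a ranking; here we keep the 5 top-ranked features.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
rfe.fit(X, y)
support = np.where(rfe.support_)[0]
print(support)          # indices of the retained features
```

The full ranking is available as `rfe.ranking_`, which is the quantity compared between the 2-task and 4-task biomarkers in the main text.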

Now we discuss the pros and cons of machine learning compared to statistical testing for finding correlates (i.e. link changes) of cognitive tasks. Formally, the two methods address distinct questions that correspond to opposite mappings between measured variables and conditions: statistical testing asks how the variables change across conditions, whereas machine learning predicts the conditions from the variables. However, they can be used to address the same practical question, especially when interpreting data in two conditions (as here with rest versus movie). For neuroimaging data a crucial point relates to the high dimensionality of connectivity measures, which leads to repeated hypothesis testing (are MOU-EC values different across the two conditions?) over the 1180 connections. This calls for correction for multiple comparisons, which is problematic as the number of connections in general scales with the square of the number of ROIs. We illustrate this point by calculating the p-values for the non-parametric Mann-Whitney test between rest and movie MOU-EC weights for each connection (left plot in Fig. 5D). To identify links that strongly differ across conditions, we use the significance threshold with Bonferroni correction (dashed-dotted line), here −log_{10}(0.05/1180) for the 1180 MOU-EC links, or better with the Benjamini-Hochberg correction [11] (dashed line). The same can be done for the 4 tasks (right plot), which involves 6 pairwise tests between the 4 cognitive states. For each connection we retain the smallest p-value among the 6 (as it is informative for at least one pairwise discrimination). With the Bonferroni correction, 47 connections pass the significance threshold for the 4-task classification, compared with 75 for the 2-task one. Using the Benjamini-Hochberg correction, we find 346 and 412 links for the two classifications, respectively.
This does not satisfactorily reflect the increase in complexity of the cognitive labels (more labels should require more connections in the biomarker), whereas the support network obtained using RFE in Fig. 5A provides a satisfactory description. More importantly, the comparison between the p-values and the RFE ranking in Fig. 5D also illustrates that efficient classification may rely on links with poor p-values, a phenomenon that may become stronger as the number of categories (labels here) increases. This conclusion also holds when using the parametric Welch t-test instead of the non-parametric Mann-Whitney test. Beyond the details of the present example, machine learning thus seems better suited than statistical testing to deal with connectivity measures in a context of multiple labels.
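The two corrections can be sketched on synthetic per-link weights. The Benjamini-Hochberg step-up procedure is implemented by hand to keep the logic explicit (recent SciPy versions also provide `scipy.stats.false_discovery_control`):

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical per-link weights for two conditions (20 subjects, 200 links),
# with a true shift in the first 10 links.
rng = np.random.default_rng(5)
n_subj, n_links = 20, 200
rest = rng.standard_normal((n_subj, n_links))
movie = rng.standard_normal((n_subj, n_links))
movie[:, :10] += 1.5

p = np.array([mannwhitneyu(rest[:, i], movie[:, i]).pvalue
              for i in range(n_links)])

# Bonferroni: divide the significance level by the number of tests.
bonf = p < 0.05 / n_links

# Benjamini-Hochberg: reject up to the largest k with p_(k) <= (k/m) * alpha.
order = np.argsort(p)
thresh = 0.05 * np.arange(1, n_links + 1) / n_links
below = p[order] <= thresh
bh = np.zeros(n_links, dtype=bool)
if below.any():
    k = np.max(np.where(below)[0])
    bh[order[:k + 1]] = True

print(bonf.sum(), bh.sum())
```

Benjamini-Hochberg always retains at least as many links as Bonferroni, illustrating the difference in stringency between family-wise error and false-discovery-rate control.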

Another conceptual difference between the two approaches is the Gaussian assumption for the parameter distribution. For example, DCM relies in many cases on a single generative model with a distribution for each parameter (typically determined by a mean and a variance for each connection weight) and selects the model that provides the best evidence for all observed FC matrices. Instead, we estimate a single MOU-EC matrix per session, which builds a distribution of point samples for each connection. The implications of this difference in the nature of the estimate distributions, and in the detection of significant changes in MOU-EC (using the Bayesian machinery for DCM versus standard machine learning in our case), remain to be explored.

## 4 Network-oriented analysis for interpreting collective BOLD dynamics across conditions

So far, we have shown how MOU-EC estimates can be used for the classification of cognitive conditions using machine learning. Now we turn to the collective interpretation of MOU-EC links as a network. Tools from graph theory have been extensively applied to explore the brain network and understand how cognitive functions are distributed over subnetworks of ROIs. The study of both FC and SC has revealed a hierarchical organization of modular networks that are interconnected by hubs [85, 144, 156], which can be modulated depending on the subject’s condition [103, 13, 130].

A limitation in many SC and FC studies comes from the direct application of graph measures defined for binary data. In such cases SC or FC matrices are often binarized using an arbitrary threshold to obtain a graph, discarding the information conveyed by the graded values. For SC, this gives the skeleton of strong connections. Similar methods have also been used on FC [53, 38], but the use of a threshold to binarize FC seems more arbitrary, especially as many pairs of brain regions exhibit strong correlation. Another important aspect is often overlooked: brain connectivity measures such as FC are inferred from signals that have a temporal structure. This means that network analysis should take time into account. A possibility is to analyze the evolution of snapshot networks over time using dynamic FC based on sliding windows applied to the observed BOLD signals [93, 140, 6].

Going a step further, a particular focus has been on studying the relationship between SC and FC [137, 15], in particular using dynamic models at the mesoscopic or macroscopic level [88, 63, 39, 112, 102, 20, 129, 56]. Building on this, graph analysis can be performed on the modeled BOLD activity [40, 55]. Nonetheless, it can be argued that such “integration” measures rely on the observed phenomena (in empirical or model activity) rather than their causes. Instead, dynamic models enable the application of graph measures to the underlying connectivity that generates the dynamics, as was recently proposed [119, 67].

Here we follow this novel direction with our adaptation of graph-like measures for the network dynamics themselves [68, 67], whose main details are presented in the following; see also Appendix E for mathematical details. This section illustrates the following aspects of network-oriented analysis for MOU-EC:

- How to interpret collective changes in MOU-EC in their contribution to shape the network dynamics? We introduce dynamic communicability with the notion of ‘integration time’ that discriminates early and late interactions between ROIs in response to a (local) perturbation in the network via MOU-EC.
- Are network measures relevant for discriminating across conditions, thereby implementing another type of biomarker for BOLD dynamics?
- Between the level of individual connections and the global network, is there an intermediate scale to interpret the MOU-EC estimates? Can we detect functional communities of ROIs and are they task-specific?

### 4.1 Dynamic flow as a graph-like measure for ROI-ROI interactions across ‘integration time’

Our definition of the dynamic flow takes advantage of the information provided by the dynamic model, here applied to the MOU process [68]. As illustrated in Fig. 6A, MOU-EC quantifies the instantaneous causal interactions between pairs of ROIs and the Σ matrix measures the spontaneous or input activity that each ROI receives (here represented only for ROI *i*). Building on those estimates, the dynamic flow measures the influence of a perturbation at a source ROI *i* on a target ROI *k* after a given delay, which we call ‘integration time’. Importantly, it takes into account the network effect (or network response), meaning the contribution of indirect paths in the network (from ROI *i* to ROI *k* via ROI *j* in the figure). See Appendix E for the mathematical formulation based on the network impulse response (or Green function), which is analytically tractable for the MOU process.

The dynamic flow thus aims to capture the dynamic response of the brain to local spontaneous activity (or external inputs) as it propagates from region to region in the network. A special case of the dynamic flow —termed dynamic communicability— that only depends on the MOU-EC (and not Σ) can also be defined assuming that all ROIs have independent spontaneous activity with the same intensity; it corresponds to a diagonal and homogeneous matrix Σ. Like the flow, dynamic communicability describes directional interactions, which differs from the symmetric interactions obtained from the direct application of graph theory to FC or SC [67]. It is worth stressing that the dynamic flow and communicability are graph-like measures that genuinely capture the properties of weighted networks, similar to the previously-proposed graph communicability [51, 50]. In the following we focus on the dynamic flow to interpret the BOLD dynamics.
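A sketch of how such a communicability-like measure can be computed for a stable MOU network, in the spirit of the definitions above; the small connectivity matrix `C` and leakage time constant `tau` are hypothetical, and the exact normalization used in the cited work is omitted:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical MOU network: n ROIs, leakage time constant tau, and a small
# non-negative effective connectivity C (kept weak so the network is stable).
rng = np.random.default_rng(6)
n, tau = 5, 1.0
C = rng.uniform(0.0, 0.2, size=(n, n))
np.fill_diagonal(C, 0.0)               # no self-connections

J = -np.eye(n) / tau + C               # MOU Jacobian
J0 = -np.eye(n) / tau                  # leakage-only baseline

def communicability(t):
    # Network response minus the baseline response at integration time t.
    return expm(J * t) - expm(J0 * t)

# Total communicability is zero at t = 0, builds up through direct and then
# indirect paths, and decays back to zero at long times for a stable network.
totals = [communicability(t).sum() for t in (1.0, 5.0, 40.0)]
print([round(v, 4) for v in totals])
```

The element `communicability(t)[i, k]` can then be read as the interaction from source ROI *i* to target ROI *k* at integration time *t*, including the contribution of indirect paths.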

As a first example, the flow changes in Fig. 6C indicate differentiated effects among the ROIs, with increased or decreased listening for the input flow and broadcasting for the output flow. A previous analysis based on MOU-EC [131] examined the task-dependent changes for the rich club of ROIs that have dense anatomical connectivity and are involved in the default-mode network, namely the precuneus (PCUN), superior parietal cortex (SP) and superior frontal cortex (SF). In line with those previous results, we find an increase of output flow to the rest of the brain (albeit small for PCUN). Among the ROIs with strong increases in output flow, we also find the posterior cingulate gyrus (PC) in addition to SP, which agrees with previous results about the default-mode network [148, 149].

### 4.2 Integration in the brain network

Following experiments on the propagation of stimulus information from sensory areas to “high-level” brain areas in relation to conscious perception of the stimuli [43, 28], the concept of integration was defined in models by quantifying the influence between regions embedded in a broader network. Considering again the fMRI dataset in Appendix A, we evaluate the dynamic flow for the two conditions, movie and rest. Our previous analysis of the same dataset showed that the modulations of MOU-EC between rest and movie implemented a selection of pathways in the cortex [66]. In addition, sensory inputs were reflected by changes in Σ in the model. Here we examine the flow at various scales to test how it captures differences in the BOLD propagation within the brain network. Note that the ‘effective drive’ defined in our previous study [66] corresponds to the flow at early integration times.

It is sometimes interesting to evaluate changes for specific brain subsystems, which in our case concerns the occipital and temporal ROIs that are related to vision and audition. As an example, we step aside from the whole-brain approach and consider the ROI groups in Fig. 7A, where “sensory inputs” are expected to be reflected as increased Σ in OCC and TMP. Fig. 7B displays the summed flow between the ROI groups in Fig. 7A, comparing the two conditions: black for rest and brown for movie. The total flow over all ROIs shows a weak, but significant, increase for movie, as indicated by the light brown squares in Fig. 7B. From the significance viewpoint, the preferred pathway in terms of estimated summed flow follows a staged hierarchy from the sensory (occipital and temporal) ROIs to the parietal ROIs and then to the frontal ROIs, see the thick arrows in Fig. 6B. In this case the parietal ROIs thus play the role of integration and relay.
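Summing a flow matrix over ROI groups, as done for Fig. 7B, reduces to block sums of the matrix; a sketch with placeholder values (the flow matrix, group sizes and group order are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 12
flow = rng.uniform(0.0, 1.0, size=(n, n))          # flow[i, k]: source i -> target k
groups = np.array([0]*3 + [1]*3 + [2]*3 + [3]*3)   # e.g. OCC, TMP, PAR, FRN
n_groups = 4

# Summed flow from group a (sources) to group b (targets).
summed = np.zeros((n_groups, n_groups))
for a in range(n_groups):
    for b in range(n_groups):
        summed[a, b] = flow[np.ix_(groups == a, groups == b)].sum()

assert np.isclose(summed.sum(), flow.sum())        # block sums partition the total flow
print(summed.round(2))
```

Because the flow is directional, `summed[a, b]` and `summed[b, a]` generally differ, which is what allows the staged sensory-to-parietal-to-frontal hierarchy to be read off.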

The machine-learning equivalent of the statistical testing for the summed flow between subnetworks in Fig. 7A corresponds to the blue violin plots in Fig. 7C. Here a good accuracy means that the pathway is strongly modulated between rest and movie, but the blue violin plots are close to chance level, indicating weak generalization across subjects. A different question is to what extent the detailed patterns of flow vary between movie and rest, which is referred to as the multivariate flow (in orange). The striking difference between the univariate and multivariate flows indicates that the selection of pathways is diverse within the ROI groups. Moreover, the discrepancies with the changes in MOU-EC for the same matrix elements (green) show the importance of network effects, for instance from parietal to frontal ROIs. It is also worth noting the directionality of the changes, for example strong changes from parietal to frontal ROIs, but not the converse. These results speak to a global balance of overall flow in the cortex, with a detailed reorganization of cortical pathways [66]. Together with the distributed signature found in Fig. 5A, they also highlight the need for a whole-brain approach to understand cortical reconfiguration [14], contrasting with previous studies relying on hypothesis testing for a-priori selected ROIs [72, 35, 30, 49].

### 4.3 Detection of functional communities determined by dynamic flow

In parallel to integration, the concept of complexity has often been used to differentiate brain states [142, 141, 7, 155]. Intuitively, complexity measures quantify the diversity of organizational motifs in a network, in the range of interest between the trivial extremes of full disconnection and full connectivity. A proxy for complexity is the detection of ROI communities, because modular networks are a stereotypical model for complex networks, as was observed with SC [156]. Here we use the dynamic flow as the basis for defining functional ROI communities in a data-driven manner, which contrasts with the anatomical grouping in Fig. 7A. Such methods were initially developed for graphs, with a metric called modularity that is based on the weights [109]. Here we consider communities determined by the flow, namely ROIs with strong exchange of activity in the estimated dynamics [68].

Fig. 8A displays the evolution of the communities over integration time for rest and movie. Strong co-participation values indicate the robustness of the community structure across subjects. It shows a merging of almost all ROIs, which eventually exchange flow in an even manner. This agrees with our previous results for resting state only and a different parcellation [67], and speaks to a transition from segregated processing (within each diagonal block of the left matrices) to global integration. The number of communities can be taken as another proxy for the complexity of the brain network, about which our approach provides a dynamical perspective. The community merging goes hand in hand with the homogenization of the flow matrix in Fig. 8B, as quantified by the diversity of the elements in the flow matrix, which decreases and eventually stabilizes [68]. In our data the diversity is slightly higher for movie compared to rest.
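The homogenization of the flow matrix can be quantified, for instance, by the coefficient of variation of its elements over integration time; the sketch below uses a hypothetical connectivity matrix and an unnormalized communicability-like response, so only the qualitative trend is meaningful:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical weak connectivity for a stable MOU network of n ROIs.
rng = np.random.default_rng(8)
n, tau = 8, 1.0
C = rng.uniform(0.0, 0.1, size=(n, n))
np.fill_diagonal(C, 0.0)
J = -np.eye(n) / tau + C

def diversity(t):
    # Coefficient of variation of the network-response elements at time t,
    # used as a proxy for the heterogeneity of the flow matrix.
    R = expm(J * t) - expm(-np.eye(n) * t / tau)
    return R.std() / R.mean()

# The diversity typically starts high (dominated by the direct connections)
# and drops as indirect paths homogenize the responses.
d = [diversity(t) for t in (0.5, 2.0, 8.0)]
print([round(v, 2) for v in d])
```

Comparing such diversity curves between conditions is one way to make the rest-versus-movie difference reported in Fig. 8B quantitative.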

Going a step further, we can examine the difference between the community structures of the two conditions, see the last row of Fig. 8A. This shows that the changes in dynamic flow for specific pairs of ROIs between rest and movie (Fig. 7C) have an important collective impact. The strongest differences appear when the network effects are very strong at t = 4 TRs. This is confirmed by Fig. 8C, where the dynamic flow is most different between rest and movie at the same integration time (see arrow). These results speak to an intermediate organization of groups of ROIs between the local and global scales, which can be captured using our network-oriented analysis of MOU-EC.

## 5 Conclusions

This article illustrates important concepts related to the use of whole-brain dynamic models for quantifying “states” related to cognition (or neuropathologies) and for interpreting BOLD signals in a collective manner. It contains 5 important results obtained from the application of MOU-EC on the movie dataset, in line with our previous analyses:

- The MOU process provides a practically useful balance between the richness of the generated BOLD dynamics, the robustness of the model optimization and the intuition needed for interpretability.
- Interactions between brain regions evaluated using MOU-EC give robust predictions of cognitive states.
- MOU-EC (and other connectivity measures like FC or other types of EC) can be used to uncover the hierarchy of cognitive states.
- The dynamic flow calculated from the estimated dynamics indicates several timescales for whole-brain integration.
- Biomarkers can be extracted from the estimated dynamics (including the use of network theory), with similar predictive power to biomarkers from connectivity measures.

We also highlight 5 general key points for the analysis of multivariate neuroimaging data:

- It is important to maintain consistency in the analysis pipeline from estimation to interpretation (same assumptions, same dynamic system with canonical measures).
- Fitting a dynamic system to time series enables the characterization of network properties with time explicitly involved in the analysis.
- Adequate machine-learning tools (with cross-validation and feature selection) are the tools of choice to relate fMRI or other multivariate neuroimaging data to cognitive states or neuropathologies.
- One should aim for methods that span multiple scales, like graph-like measures that can be analyzed from the connection/node level to communities and the global level.
- Versatile tools that can address different kinds of questions facilitate comparisons between results (e.g. classification accuracy of connectivity versus communicability).

The main aim of this article was to bring together the most important aspects of our MOU-EC framework for fMRI/BOLD data, which were developed in previous separate works. It links dynamic systems, machine learning and network theory in a comprehensive and consistent fashion, as detailed in the successive sections. We stress that this type of formalized unification between several fields is not so common in neuroscience and should be promoted. Our model-based approach allows for a flexible analysis of whole-brain dynamics at various scales, from individual connections to ROI-centric and global measures. By fitting the dynamic model to the spatio-temporal covariance structure of the BOLD signals, our approach captures the BOLD dynamics averaged over each fMRI session. The estimated MOU-EC can be seen as a “projection” of the BOLD signals onto the anatomically-constrained connectivity, thereby characterizing the dynamical state of the brain. In contrast, analyses based on static FC can be related to a linear Gaussian model, and thus do not take the temporal dimension into account (i.e. BOLD values are considered as i.i.d. variables). Dynamic FC methods have been developed to incorporate time in the analysis [112, 116, 74, 25], but they do not genuinely capture the propagating nature of BOLD signals. Dynamic causal models have recently been adapted to deal with whole-brain fMRI data [119, 113, 56] and it will be interesting to compare the estimated connectivities.

The basic tools for the presented analyses are available in the open-source language Python with example data on http://bitbucket.org/MatthieuGilson/mou-ec_tutorial. The classification uses the library scikit-learn [2].

### 5.1 Biomarkers for cognition and neuropathologies

The robust estimation of the (multivariate) parameters allows for a powerful representation of cognitive states by the MOU-EC estimates (Fig. 4), as well as the extraction of task-specific biomarkers (Fig. 5). An important point when considering connectivity measures is that machine learning circumvents the problem of multiple comparisons: instead of determining p-values and then a threshold for connections with significant changes, we can characterize which MOU-EC links (as features) contribute to the correct classification. It has recently been advocated that machine learning can be used to avoid “excessive reductionism”, namely linking brain activity in localized regions to narrow experimental protocols in a way that cannot generalize to more complex protocols [145]. Although our focus has been on machine-learning techniques, we note that hypothesis testing of significant changes for preselected ROIs or links can also be performed in our framework (see Fig. 7B with the dynamic flow). In that case, whole-brain changes can also provide a baseline reference for the considered links or ROIs.

On the technical side, it remains to be explored how classification performance is affected by the resolution of the parcellation [33, 47, 71], as was recently done with FC-like connectivity measures [34]. The trade-off to solve is between the richness of the connectivity measure (more links can capture richer activity structure) and its robustness (more parameters may result in noisier estimates). The alignment of distinct datasets acquired with different MRI protocols also remains a difficult question for practical applications to large datasets [147].

In parallel to the study of cognition, the presented framework can be applied to neuropathologies with the aim to inform clinical diagnosis [100]. The rationale is that BOLD activity specifically reflects neuropathologies, even in the resting state [77, 87]. While SC is increasingly used for strokes or Alzheimer’s disease, which strongly impact brain anatomy, fMRI may provide additional information [133, 79]. Other diseases like depression [46] or some motor disorders [122] may be better investigated using functional imaging. Importantly, many neural diseases are likely to affect multiple brain regions and thus require adequate tools to capture their distributed nature. Another direction where machine learning is a key tool is the development of personalized medicine, adapting the models to individual patients beyond group analyses [154].

### 5.2 Network-oriented analysis of dynamics

The network analysis for our dynamic model relies on the dynamic flow, a graph-like connectivity measure that describes the integration of propagating activity at multiple scales in the network (Figs. 7 and 8). In particular, small changes in EC that are not individually significant can collectively induce large changes in the dynamic flow, especially when feedback loops are involved. In this sense, the dynamic flow quantitatively captures the interplay between parameters, so the resulting communication cannot always be understood at the level of single estimated parameters. As an example, many small changes in EC (below significance level) may collectively result in a strong dynamical effect, as seen for task A in Fig. 9. Strong coordinated changes in EC result in large changes in the dynamics (task B). Moreover, changes in input properties may only be captured by the dynamics, in particular when the connectivity does not change (task C). Building biomarkers that capture network effects is important to make use of the multivariate nature of the fMRI data. Such network effects also matter when interpreting data in terms of concepts such as integration, segregation and complexity [141, 43, 42, 155].

### 5.3 Limitations and future work

Recall that any interpretation of BOLD in terms of brain communication relies on the assumption that changes in neuronal activity are reliably reflected in the BOLD signals, which is still under debate [73, 99, 84]. In practice, the preprocessing of fMRI signals must strive to increase the signal-to-noise ratio [115, 114].

Recall also that causal interactions between brain regions are derived from the directed connectivity in the model, whose “best” parametrization is estimated from the data. Without direct experimental manipulation of brain activity, the notion of causality is subject to the initial model assumptions [92, 89]. This remark also applies to other methods for directed connectivity, such as Granger causality and DCM [143].

A key distinction of our approach compared to DCM is that our MOU-EC (or FC) estimates are calculated for each fMRI session, so we obtain a sample distribution of connection weights over all subjects for each condition (e.g. task). In contrast, DCM usually considers a single model per condition that aims to represent the variability over subjects with a parametric distribution (mean and variance) for each connection weight. The limitations and advantages of both approaches in terms of statistical power and generalizability remain to be compared in practice, especially for large datasets.

Regarding the hemodynamics and explicit HRF modeling discussed at the end of the section presenting the MOU model, a quantitative comparison between MOU-EC, DCM and Granger causality analysis to extract biomarkers or interpret fMRI data in terms of neuronal coupling is necessary in the future. A particular point concerns the observed heterogeneity of HRF across different brain regions and how it affects the estimated connectivity and its directionality [81]. Previous works also suggest that the HRF is important for the reliability of connectivity analysis [70, 110]. This may increase the number of parameters in the model, so the robustness of the estimation should be verified using both forward modeling in synthetic whole-brain networks [134] and with classification from real data with known categories (e.g. subjects with several sessions each). A first step could be to extend the MOU-EC as a state-space model with a first-order filter for HRF [126].

## Appendix

The following sections summarize details about the data and mathematical framework used in this article. For machine learning, the readers are referred to the online documentation http://scikit-learn.org/ of the scikit-learn library [2].

## A Movie dataset

The experimental procedure was approved by the Ethics Committee of the Chieti University and the participants signed a written informed consent. We use the data from 22 right-handed young and healthy volunteers (15 females, 20-31 years old) that had recordings for both a resting state with eyes opened and a natural viewing condition (30 minutes of the movie ‘The Good, the Bad and the Ugly’).

The fMRI time resolution (TR) is 2 seconds and each session of 10 minutes corresponds to 300 frames (or volumes). The preprocessing, performed using SPM8, involves the coregistration of structural and functional MRI, spatial normalization into MNI coordinates and motion correction. The data are parcellated into *N* = 66 ROIs [80]. The ROIs are listed in Table 1, indexed in the order of Figs. 5 and 6. The generic structural matrix is obtained by averaging individual matrices obtained using diffusion spectrum imaging. A threshold was then applied to retain the 27% largest connections. Inter-hemispheric connections between homotopic ROIs were added, increasing the density to 28%. The same generic SC matrix is used to determine the MOU-EC topology for all subjects.

We refer the readers to previous publications [86, 98, 66] for further details.

## B Functional connectivity measures

For each session of *T* = 300 time points (2 sessions for rest and 3 for movie), we denote the BOLD time series by $s_i^t$ for each region 1 ≤ *i* ≤ *N*, with time indexed by 1 ≤ *t* ≤ *T*. The mean signal is $\bar{s}_i = \frac{1}{T} \sum_{1 \le t \le T} s_i^t$ for all *i*. Following [69], the empirical FC consists of two BOLD covariance matrices (see the blue and green matrices in the top box of Fig. S1), without and with time lag:

$$\hat{Q}^0_{ij} = \frac{1}{T-2} \sum_{1 \le t \le T-1} (s_i^t - \bar{s}_i)(s_j^t - \bar{s}_j), \qquad \hat{Q}^1_{ij} = \frac{1}{T-2} \sum_{1 \le t \le T-1} (s_i^t - \bar{s}_i)(s_j^{t+1} - \bar{s}_j). \tag{1}$$

Pearson correlation can be calculated from the covariances in Eq. (1):

$$K_{ij} = \frac{\hat{Q}^0_{ij}}{\sqrt{\hat{Q}^0_{ii} \, \hat{Q}^0_{jj}}}. \tag{2}$$
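These covariance and correlation measures can be computed directly with numpy; the sketch below assumes a hypothetical BOLD array `ts` of shape *N* × *T* (here synthetic noise, only for illustration).

```python
import numpy as np

def fc_matrices(ts):
    """Compute FC0 (zero-lag) and FC1 (lag of 1 TR) covariance matrices
    from BOLD time series ts of shape (N, T), plus Pearson correlation."""
    N, T = ts.shape
    ts_c = ts - ts.mean(axis=1, keepdims=True)  # center each ROI signal
    # covariances over the first T-1 time points, without and with time lag
    Q0 = ts_c[:, :T-1] @ ts_c[:, :T-1].T / (T - 2)
    Q1 = ts_c[:, :T-1] @ ts_c[:, 1:].T / (T - 2)
    # Pearson correlation from the zero-lag covariance
    sd = np.sqrt(np.diag(Q0))
    K = Q0 / np.outer(sd, sd)
    return Q0, Q1, K

# usage with synthetic signals of the same dimensions as the dataset
rng = np.random.default_rng(0)
Q0, Q1, K = fc_matrices(rng.normal(size=(66, 300)))
```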

## C Multivariate Ornstein-Uhlenbeck (MOU) process to model whole-brain dynamics

The network dynamics are determined by two sets of parameters:

- the time constant *τ*_{x}, which is an abstraction of the local dynamics of each ROI, and the network effective connectivity (MOU-EC) between these ROIs embodied by the matrix *C*, whose topological skeleton is determined by structural data;
- the local variability embodied in the matrix Σ, fed individually to each of the *N* = 66 ROIs.

The activity variables are described by a multivariate Ornstein-Uhlenbeck process. Each activity variable *x*_{i} of node *i* decays exponentially according to the time constant *τ*_{x} and evolves depending on the activity of other populations:

$$dx_i = \left( -\frac{x_i}{\tau_x} + \sum_{j \neq i} C_{ij} \, x_j \right) dt + dB_i. \tag{3}$$

Here the fluctuating inputs d*B*_{i} are independent and correspond to a diagonal covariance matrix Σ, as represented in purple by the vector of variances Σ_{ii} in the bottom box of Fig. S1. In the model, all variables *x*_{i} have zero mean. Their spatiotemporal covariances are denoted by $Q^\tau_{ij} = \langle x_i(t) \, x_j(t+\tau) \rangle$, where *τ* ∈ {0, 1} is the time lag, and correspond to the blue and green matrices in the bottom box of Fig. S1. They can be calculated by solving the consistency equations:

$$\begin{aligned} & J Q^0 + Q^0 J^\dagger + \Sigma = 0, \\ & Q^1 = Q^0 e^{J^\dagger}. \end{aligned} \tag{4}$$

The first equation is the continuous Lyapunov equation, which can be solved using the Bartels-Stewart algorithm in scipy [1]. Here *J* is the Jacobian of the dynamical system and depends on the time constant *τ*_{x} and the network MOU-EC: $J_{ij} = -\frac{\delta_{ij}}{\tau_x} + C_{ij}$, where *δ*_{ij} is the Kronecker delta and the superscript † denotes the matrix transpose. In the second equation, *e* denotes the matrix exponential.

Ideally, the model should be extended to incorporate subcortical areas, and the relevance of cross-correlated inputs should be evaluated for all ROI pairs [111].
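The consistency equations can be evaluated numerically for given parameters. The sketch below uses a hypothetical small network with arbitrary non-negative weights; `scipy.linalg.solve_continuous_lyapunov` implements the Bartels-Stewart algorithm mentioned above.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, expm

rng = np.random.default_rng(0)
N, tau_x = 5, 2.0

# hypothetical parameters: small non-negative EC weights (empty diagonal)
# and unit input variances
C = rng.uniform(0.0, 0.1, size=(N, N))
np.fill_diagonal(C, 0)
Sigma = np.eye(N)

# Jacobian: J_ij = -delta_ij / tau_x + C_ij
J = -np.eye(N) / tau_x + C

# consistency equations: solve J Q0 + Q0 J^T + Sigma = 0 (Lyapunov),
# then Q1 = Q0 e^{J^T} for a time lag of 1 TR
Q0 = solve_continuous_lyapunov(J, -Sigma)
Q1 = Q0 @ expm(J.T)
```

Note the sign convention: scipy solves `a x + x a^H = q`, hence the `-Sigma` on the right-hand side.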

## D Parameter estimation procedures

### D.1 Lyapunov optimization or natural gradient descent

We tune the model such that its covariance matrices *Q*^{0} and *Q*^{1} reproduce the empirical FC0 and FC1 matrices in Fig. 2B, denoted here by $\hat{Q}^0$ and $\hat{Q}^1$, respectively. The uniqueness of this maximum-likelihood estimation follows from the bijective mapping from the model parameters *τ*_{x}, *C* and Σ to the FC pair (*Q*^{0}, *Q*^{1}). The iterative optimization procedure for *C* is similar to the original version [69] and can be related to the concept of “natural” gradient descent [4], which takes into account the non-linearity of the mapping between *J* and the matrix pair (*Q*^{0}, *Q*^{1}) in the second line of Eq. (4). Note that the parameters *τ*_{x} and Σ also follow a gradient descent now.

For each session (subject and condition), we calculate the time constant *τ*_{ac} associated with the exponential decay of the autocovariance averaged over all ROIs, using time lags from 0 to 2 TRs:

$$\tau_{\mathrm{ac}} = -\frac{1}{N} \sum_{1 \le i \le N} \frac{1}{a(v_i \mid u)},$$

where $a(v_i \mid u)$ is the slope of the linear regression of $v_i = \left[ \log \hat{Q}^0_{ii}, \log \hat{Q}^1_{ii}, \log \hat{Q}^2_{ii} \right]$ by $u = [0, 1, 2]$, with $\hat{Q}^\tau_{ii}$ the empirical autocovariance of ROI *i* at time lag *τ*. The model is initialized with *τ*_{x} = *τ*_{ac} and no connectivity (*C* = 0), as well as unit variances without covariances (Σ_{ij} = *δ*_{ij}). At each step, the Jacobian *J* is straightforwardly calculated from the current values of *τ*_{x} and *C*. Using the current Σ, the model FC matrices *Q*^{0} and *Q*^{1} are then computed from the consistency equations, using the Bartels-Stewart algorithm to solve the Lyapunov equation.
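The initialization of *τ*_{x} from the autocovariance decay can be sketched as follows. This version averages the autocovariance over ROIs before the regression (a simplifying assumption) and is checked on synthetic AR(1)-like signals with a known decay.

```python
import numpy as np

def tau_autocov(ts):
    """Estimate the decay time constant of the BOLD autocovariance,
    averaged over ROIs, from a linear regression of its logarithm
    at time lags u = 0, 1 and 2 TRs."""
    N, T = ts.shape
    ts_c = ts - ts.mean(axis=1, keepdims=True)
    # autocovariance averaged over ROIs at lags 0, 1 and 2 (in TRs)
    ac = np.array([(ts_c[:, :T-2] * ts_c[:, u:T-2+u]).sum(axis=1).mean()
                   for u in range(3)]) / (T - 2)
    u = np.arange(3)
    slope = np.polyfit(u, np.log(ac), 1)[0]  # slope of log-autocovariance
    return -1.0 / slope                      # exponential decay constant

# usage: AR(1)-like synthetic signals whose log-autocovariance has
# slope log(0.6), i.e. a decay constant around 2 TRs
rng = np.random.default_rng(0)
x = np.zeros((10, 2000))
for t in range(1, 2000):
    x[:, t] = 0.6 * x[:, t - 1] + rng.normal(size=10)
tau_est = tau_autocov(x)
```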

The difference matrices $\Delta Q^0 = \hat{Q}^0 - Q^0$ and $\Delta Q^1 = \hat{Q}^1 - Q^1$ determine the model error

$$E = \frac{1}{2} \left( \frac{\| \Delta Q^0 \|}{\| \hat{Q}^0 \|} + \frac{\| \Delta Q^1 \|}{\| \hat{Q}^1 \|} \right),$$

where each term (for FC0 and FC1) is the matrix distance between the model and the data covariances, normalized by the latter. The desired Jacobian update is the matrix

$$\Delta J^\dagger = (Q^0)^{-1} \left( \Delta Q^0 + \Delta Q^1 \, e^{J^\dagger} \right),$$

which decreases the model error *E* at each optimization step, similar to a gradient descent. The best fit corresponds to the minimum of *E*. Finally, the connectivity update is

$$\Delta C_{ij} = \epsilon_C \, \Delta J_{ij}$$

for existing connections only; other weights are forced at 0. We also impose non-negativity for the MOU-EC values during the optimization. To properly take the effect of cross-correlated inputs into account, we adjust the Σ update from the heuristic update in [69]:

$$\Delta \Sigma = -\epsilon_\Sigma \left( J \, \Delta Q^0 + \Delta Q^0 \, J^\dagger \right).$$

As with weights for non-existing connections, only diagonal elements of Σ —and possible cross-correlated inputs [66]— are tuned; others are kept equal to 0 at all times. Last, to compensate for the increase of recurrent feedback due to the updated *C*, we also tune the model time constant *τ*_{x} according to *λ*_{max}, the maximum real part of the eigenvalues of the Jacobian *J* (negative for stable dynamics). The rationale is to avoid an explosion of the network dynamics (when *λ*_{max} → 0) while letting the model connectivity *C* develop a non-trivial structure to reproduce FC.
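One iteration of this optimization can be sketched as follows. This is a toy version with hypothetical learning rates (`eps_C`, `eps_S`) and without the *τ*_{x} adjustment; the "empirical" covariances are generated from a known ground-truth model so that the error decrease can be checked.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, expm

def lyapunov_step(C, Sigma, tau_x, Q0_emp, Q1_emp, mask,
                  eps_C=0.005, eps_S=0.05):
    """One step of the Lyapunov optimization (sketch). mask is the
    boolean skeleton of allowed connections (from SC)."""
    N = C.shape[0]
    J = -np.eye(N) / tau_x + C                   # current Jacobian
    Q0 = solve_continuous_lyapunov(J, -Sigma)    # model FC0
    Q1 = Q0 @ expm(J.T)                          # model FC1 (lag of 1 TR)
    dQ0, dQ1 = Q0_emp - Q0, Q1_emp - Q1          # difference matrices
    # model error: distances normalized by the empirical covariance norms
    err = 0.5 * (np.linalg.norm(dQ0) / np.linalg.norm(Q0_emp)
                 + np.linalg.norm(dQ1) / np.linalg.norm(Q1_emp))
    # Jacobian update, then connectivity update for existing links only,
    # with the non-negativity constraint on the weights
    dJ = np.linalg.solve(Q0, dQ0 + dQ1 @ expm(J.T)).T
    C_new = np.clip(C + eps_C * dJ * mask, 0.0, None)
    # heuristic update of the input variances (diagonal of Sigma only)
    dS = -(J @ dQ0 + dQ0 @ J.T)
    Sigma_new = Sigma + eps_S * np.diag(np.diag(dS))
    return C_new, Sigma_new, err

# toy check: fit a model to covariances from a known ground truth
rng = np.random.default_rng(1)
N, tau_x = 4, 2.0
mask = ~np.eye(N, dtype=bool)
C_true = rng.uniform(0.02, 0.1, (N, N)) * mask
J_true = -np.eye(N) / tau_x + C_true
Q0_emp = solve_continuous_lyapunov(J_true, -np.eye(N))
Q1_emp = Q0_emp @ expm(J_true.T)

C, Sigma = np.zeros((N, N)), np.eye(N)
errors = []
for _ in range(100):
    C, Sigma, err = lyapunov_step(C, Sigma, tau_x, Q0_emp, Q1_emp, mask)
    errors.append(err)
```

The error should shrink over the iterations as the estimated weights approach the ground-truth ones; in practice the fit is stopped at the minimum of the error.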

## E Dynamic communicability and flow for network analysis

Following our previous study [68], we first define *dynamic communicability* to characterize the network interactions due to the MOU-EC connectivity *C*, ignoring the input properties Σ. Our definition is adapted to the study of complex networks with realistic (stable) dynamics where time has a natural and concrete meaning; in comparison, a previous version of communicability for graphs [51] relied on abstract dynamics. The basis of our framework is the network response over time, or Green function, and dynamic communicability focuses on the temporal evolution of the resulting network interactions. Although we focus on the MOU process here, our framework can easily be adapted to distinct local dynamics for which the Green function is known. In addition to the MOU-EC matrix *C*, the MOU dynamics are also determined by the input properties, so we use the dynamic flow in the present article (Figs. 6 and 8).

### E.1 Definitions

Formally, dynamic communicability is the “deformation” of the Green function *e*^{Jt} of the MOU process due to the presence of the (weighted and directed) matrix *C*, as compared to the Green function $e^{J^0 t}$ corresponding to the Jacobian with leakage only and no connectivity, $J^0_{ij} = -\frac{\delta_{ij}}{\tau_x}$. It is defined as the family of time-dependent matrices depicted in Fig. 6A:

$$\mathcal{C}(t) = \| J^0 \|^{-1} \left( e^{Jt} - e^{J^0 t} \right).$$

The scaling factor $\| J^0 \|^{-1} = \left\| \int_{t \ge 0} e^{J^0 t} \, dt \right\|$, where ‖ · ‖ is the L1-norm for matrices (i.e., the sum of elements in absolute value), is used for normalization purposes [68]. Recall that *t* ≥ 0 here is the time for the propagation of activity in the network, referred to as ‘impulse-response time’ in the figures.

To incorporate the effect of local spontaneous activity or excitability (inputs in the model), we define the dynamic flow that fully characterizes the complex network dynamics [68]. The input statistics of interest for a stable MOU process correspond to the input (co)variance matrix Σ, which are independent parameters from the MOU-EC matrix *C*. This is represented by the purple arrows in the left diagram of Fig. 6A, indicating that the fluctuation amplitude is individual for each ROI. The Σ matrix may be non-diagonal when ROIs experience cross-correlated noise [66], as represented by the purple dashed arrows. The dynamic flow describes the propagation of local fluctuating activity over time via the recurrent connectivity and is defined by

$$\mathcal{F}(t) = \| J^0 \|^{-1} \, \sqrt{\Sigma} \left( e^{Jt} - e^{J^0 t} \right), \tag{14}$$

where $\sqrt{\Sigma}$ is the real symmetric “square-root” matrix of the input covariance matrix, satisfying $\sqrt{\Sigma} \sqrt{\Sigma} = \Sigma$. Dynamic communicability is thus a particular case of the flow for homogeneous input statistics.

From the time-dependent matrices $\mathcal{F}(t)$, we define the total flow that sums all interactions:

$$\mathcal{F}^{\mathrm{tot}}(t) = \sum_{i,j} \mathcal{F}_{ij}(t).$$

Total communicability for graphs has been used to define a version of centrality [12]. Here the proximity between ROIs corresponds to how much flow they exchange. We also define the diversity (or heterogeneity) among the ROI interactions in the time-dependent matrices $\mathcal{F}(t)$, which can be seen as a proxy for their homogenization over time:

$$\mathcal{D}(t) = \frac{\sigma_{(i,j)} \left[ \mathcal{F}_{ij}(t) \right]}{\mu_{(i,j)} \left[ \mathcal{F}_{ij}(t) \right]},$$

defined as a coefficient of variation where *μ*_{(i,j)} and *σ*_{(i,j)} are the mean and standard deviation over the matrix elements indexed by (*i, j*).
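These quantities can be sketched numerically as follows, for a hypothetical small network with non-negative MOU-EC weights; the normalization follows the definition of the scaling factor given above (L1 norm of the integrated leakage-only Green function).

```python
import numpy as np
from scipy.linalg import expm, sqrtm

def dynamic_flow(C, Sigma, tau_x, times):
    """Flow matrices F(t) at the requested impulse-response times:
    sqrt(Sigma) applied to the deformation of the Green function,
    normalized by the L1 norm of the integrated leakage-only response."""
    N = C.shape[0]
    J0 = -np.eye(N) / tau_x                       # leakage-only Jacobian
    J = J0 + C
    scale = 1.0 / np.abs(np.linalg.inv(J0)).sum() # normalization factor
    sq = np.real(sqrtm(Sigma))                    # real symmetric sqrt of Sigma
    return [scale * sq @ (expm(J * t) - expm(J0 * t)) for t in times]

# usage: total flow and diversity (coefficient of variation) over time
rng = np.random.default_rng(0)
N = 5
C = rng.uniform(0.0, 0.1, (N, N))
np.fill_diagonal(C, 0)
flows = dynamic_flow(C, np.eye(N), 2.0, times=[1.0, 2.0, 5.0])
total = [F.sum() for F in flows]
diversity = [F.std() / F.mean() for F in flows]
```

With homogeneous inputs (Σ equal to the identity, as here), the flow reduces to dynamic communicability up to the square-root factor being trivial.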

### E.2 Community detection

To detect communities from $\mathcal{F}(t)$ in Fig. 8D, we rely on Newman’s greedy algorithm for modularity [109] that was originally designed for weight-based communities in a graph. Adapting it here to the flow matrix at a given time *t*, we seek communities where ROIs have strong bidirectional flow interactions. In the same manner as with weighted modularity, we calculate a null model for MOU-EC:

$$C^{\mathrm{null}}_{ij} = \frac{c^{\mathrm{in}}_i \, c^{\mathrm{out}}_j}{c^{\mathrm{tot}}} \quad \text{for } i \neq j.$$

Note that we preserve the empty diagonal. The resulting matrix contains the expected weight for each connection, given the observed input strengths **c**^{in} and output strengths **c**^{out}; *c*^{tot} is the total sum of the weights in *C*. Then we calculate $\mathcal{F}^{\mathrm{null}}(t)$ using Eq. (14) with *C*^{null} instead of *C*. Starting from a partition where each ROI is a singleton community, the algorithm iteratively aggregates ROIs to form a partition of *K* communities denoted by *S*_{k} that maximizes the following quality function:

$$\Phi = \sum_{1 \le k \le K} \; \sum_{i, j \in S_k} \left[ \mathcal{F}_{ij}(t) - \mathcal{F}^{\mathrm{null}}_{ij}(t) \right].$$

At each step of the greedy algorithm, the merging of the two current communities that most increases Φ is performed. Note that communicability-based communities can be defined similarly using $\mathcal{C}(t)$ and the corresponding null model $\mathcal{C}^{\mathrm{null}}(t)$.
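The null model and quality function can be sketched as below; the greedy merging itself is omitted, and the EC matrix is used as a stand-in for the flow matrix at a given time (an illustrative simplification). The convention assumed here is that `C[i, j]` is the weight of the connection from ROI *j* to ROI *i*.

```python
import numpy as np

def ec_null_model(C):
    """Null model preserving input/output strengths and the empty diagonal
    (assumed convention: C[i, j] is the weight from ROI j to ROI i)."""
    c_in = C.sum(axis=1)    # input strength of each target ROI
    c_out = C.sum(axis=0)   # output strength of each source ROI
    C_null = np.outer(c_in, c_out) / C.sum()
    np.fill_diagonal(C_null, 0.0)
    return C_null

def quality(F, F_null, partition):
    """Quality function Phi: sum of F - F_null over within-community pairs."""
    return sum(float((F[np.ix_(S, S)] - F_null[np.ix_(S, S)]).sum())
               for S in partition)

# usage: score a candidate partition of a toy network
rng = np.random.default_rng(0)
N = 6
C = rng.uniform(0.0, 0.1, (N, N))
np.fill_diagonal(C, 0)
C_null = ec_null_model(C)
phi = quality(C, C_null, [[0, 1, 2], [3, 4, 5]])
```

The greedy algorithm would repeatedly merge the pair of communities whose union yields the largest increase of this quality score.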

