## Abstract

Focal epilepsy is a devastating neurological disorder that affects an overwhelming number of patients world-wide, many of whom prove resistant to medication. The efficacy of current innovative technologies for the treatment of these patients has been stalled by the lack of accurate and effective methods to fuse multimodal neuroimaging data to map anatomical targets driving seizure dynamics. Here we propose a parsimonious model that explains how large-scale anatomical networks and shared genetic constraints shape inter-regional communication in focal epilepsy. In extensive ECoG recordings acquired from a group of patients with medically refractory focal-onset epilepsy, we find that ictal and preictal functional brain network dynamics can be accurately predicted from features of brain anatomy and geometry, patterns of white matter connectivity, and constraints complicit in patterns of gene coexpression, all of which are conserved across healthy adult populations. Moreover, we uncover evidence that markers of non-conserved architecture, potentially driven by idiosyncratic pathology of single subjects, are most prevalent in high frequency ictal dynamics and low frequency preictal dynamics. Finally, we find that ictal dynamics are better predicted by white matter features and more poorly predicted by geometry and genetic constraints than preictal dynamics, suggesting that the functional brain network dynamics manifest in seizures rely on – and may directly propagate along – underlying white matter structure that is largely conserved across humans. Broadly, our work offers insights into the generic architectural principles of the human brain that impact seizure dynamics, and could be extended to further our understanding, models, and predictions of subject-level pathology and response to intervention.

## Introduction

For over 60 million patients, epilepsy restricts the quality of daily life through spontaneous and recurring seizures. While seizures may be controlled in two-thirds of epilepsy patients, the remaining require more invasive treatment. Medication-resistant patients undergo continuous monitoring of intracranial electrophysiology for biomarkers generated by the epileptic network, a set of interacting brain regions that are thought to initiate and spread seizure activity in the brain [1]. Novel technologies have expanded treatment options beyond surgical resection of abnormal tissue to laser ablation and neurostimulation [2,3,4,5,6], affording greater specificity in targeting discrete nodes of a patient’s epileptic network. To optimize an intervention strategy for seizure control, practitioners are required to assimilate neuroimaging data across multiple modalities, and to map anatomical targets where intervention would most likely reduce seizure dynamics. However, a parsimonious mechanism explaining how large-scale anatomical networks shape inter-regional communication in focal epilepsy has remained elusive. Understanding how large-scale brain architecture facilitates the onset and rapid spread of seizures can yield better tools for mapping targets for therapy that broadly extend to the patient cohort.

A large body of research has shown that spatially distributed variations in anatomy can distinguish patients with epilepsy from healthy individuals [7,8,9]. Quantitative features of white-matter connectivity can also explain first-order parameters of seizure dynamics, such as duration [10] and seizure severity [11]. Astonishingly detailed seizure dynamics can be predicted using *in silico* network modelling that fuses a patient’s structural connectome – a comprehensive network map of physical connections among neural elements – with their intracranial electrophysiology recordings [12,13]. While variation in anatomy and pathophysiology can precipitate seizures through different mechanisms [14, 15], seizures also exhibit substantial similarities in their dynamics within and across patients [16, 17, 18, 19]. Indeed, these and related studies suggest that certain properties of the epileptic network may be common across patients. A critical question, then, is to what extent are seizure dynamics a product of anatomical organization that is common across individuals and more fundamental to the human brain?

Recent work has demonstrated a significant convergence of structural network features across large populations of healthy individuals [20, 21], spanning both young and old age [22, 23], and extending even to diverse patient groups with neurological disease [24] and psychiatric disorders [25, 26]. Notably, such conserved structural features appear to be reflected in patterns of gene coexpression in both the mouse [27] and human [28, 29]. The modular pattern of gene expression is consistent across independent human datasets, evolutionarily conserved in non-human primates [30] and other mammals [31], and supports synchronous activity in brain networks [32,33]. Collectively, these findings suggest that gene expression is an important marker of conserved structural features of human brain networks, and can by extension also reflect functional dynamics in distributed neuronal ensembles. It is therefore of interest to ask whether and how the patterns of structural connectivity and gene coexpression that are conserved across healthy individuals might offer principles upon which the pathology of refractory epilepsy depends.

To answer this question, we constructed dynamic functional networks using electrocorticographic (ECoG) neural recordings from human epilepsy patients undergoing routine clinical evaluation for epilepsy surgery. Recent work has addressed several challenges in studying ECoG dynamics across multiple individuals by accounting for variable electrode placement and sparse cortical coverage, developing a method to study how resting state ECoG functional networks can be predicted from underlying white matter tractography, spatial features, and gene coexpression [28], supporting related work in non-human species [34]. Here, we extend this approach with an expanded set of structural features to model and predict ictal ECoG functional networks to measure the relative impact of different communication policies over white matter networks on epileptic seizures. This method is a unique aggregation of data across multiple approaches to the study of epilepsy, including gene coexpression data from the Allen Institute [35] and structural connectivity measures from diffusion imaging combined within a principled machine learning framework. We hypothesize that models trained with structural parameters would best predict functional connectivity during seizures, thus suggesting the influence of specific communication policies and measures over white matter in explaining ictal functional connectivity.

To test this hypothesis, we recorded ECoG from 25 patients diagnosed with drug-resistant epilepsy undergoing routine pre-surgical evaluation. We separated these recordings into pre-seizure and seizure epochs and constructed ECoG functional networks for each epoch separately. The seizure epoch spanned the period between the clinically-marked earliest electrographic change and seizure termination, while the pre-seizure epoch was identical in duration to the seizure and ended immediately prior to the earliest electrographic change. In each epoch, we divided the ECoG signal into 1 s non-overlapping time-windows and estimated functional connectivity in *α/θ* (5–15 Hz), *β* (15–25 Hz), low-*γ* (30–40 Hz), and high-*γ* (95–105 Hz) frequency bands using multitaper coherence estimation. We trained multi-linear regression models with different combinations of structural, physical, and genetic parameters in different settings to determine the genetic and neuroanatomical mechanisms predictive of ictal connectivity.
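The epoch and window definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the example onset and termination times, and the choice to drop a trailing partial window are all assumptions, and the 500 Hz default is the sampling rate reported only for the University of Pennsylvania recordings. The multitaper coherence estimation itself is not shown.

```python
def epoch_bounds(onset_s, end_s):
    """Return (preictal, ictal) intervals in seconds.

    The preictal epoch has the same duration as the seizure and ends at onset.
    """
    duration = end_s - onset_s
    if duration <= 0:
        raise ValueError("seizure end must come after onset")
    return (onset_s - duration, onset_s), (onset_s, end_s)


def window_indices(start_s, stop_s, fs_hz=500, win_s=1.0):
    """Sample-index ranges of non-overlapping windows that fit inside the epoch.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    n_win = int((stop_s - start_s) // win_s)
    step = int(round(win_s * fs_hz))
    first = int(round(start_s * fs_hz))
    return [(first + k * step, first + (k + 1) * step) for k in range(n_win)]


# Frequency bands as stated in the text (Hz).
BANDS = {
    "alpha/theta": (5, 15),
    "beta": (15, 25),
    "low-gamma": (30, 40),
    "high-gamma": (95, 105),
}

# Hypothetical seizure: onset at 120.0 s, termination at 165.5 s.
preictal, ictal = epoch_bounds(onset_s=120.0, end_s=165.5)
ictal_windows = window_indices(*ictal)
preictal_windows = window_indices(*preictal)
```

Because both epochs have the same duration, they yield the same number of windows, which keeps the preictal and ictal networks directly comparable.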

## Results

### Preictal and ictal functional connectivity are generically predicted by structure, geometry, and genetics

We begin by assessing whether and how brain structure, geometry, and genetics influence functional connectivity during preictal and ictal periods. To this end, we investigated a series of nested multi-linear models (MLM) that generated predictions of functional connection weights (see **Figure 1**). For this purpose, we generated a list of five predictors as follows: **D** = [D_{ij}], the Euclidean distance between region *i* and region *j*; **G** = [G_{ij}], the Pearson correlation between the gene expression profiles of region *i* and region *j*; **S** = [S_{ij}], the search information between region *i* and region *j*, a measure of the “hiddenness” of the shortest anatomical path between the regions; **P** = [P_{ij}], the length of the true shortest path between region *i* and region *j*; and **F** = [F_{ij}], the maximum flow between region *i* and region *j*, a measure that treats edge weights as capacities and sends units of flow from region *i* to region *j* while respecting maximum capacities. Search information, path length, and maximum flow all represent structural measures that capture different potential scales of communication dynamics that are supported by the network architecture. These three measures were computed from an undirected and weighted connectivity matrix, *A* ∈ ℝ^{N×N}, whose edge weights were equal to the number of streamlines detected between regions normalized by the geometric mean of the regional volumes.
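To make two of the structural predictors concrete, the sketch below computes a shortest path and its search information on a toy weighted graph. It rests on assumptions not stated in the text: edge length is taken as the reciprocal of edge weight, and search information is taken as the negative base-2 logarithm of the probability that a random walker, stepping in proportion to edge weight, follows the shortest path. The graph and function names are illustrative only.

```python
import heapq
import math


def shortest_path(weights, src, dst):
    """Dijkstra on a weighted undirected graph given as {node: {nbr: weight}}.

    Edge length is taken as 1 / weight (assumption: stronger edges are shorter).
    Returns (path_length, list_of_nodes).
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in weights[u].items():
            nd = d + 1.0 / w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return math.inf, []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


def search_information(weights, path):
    """Bits needed for a random walker to follow `path` (-log2 of its probability)."""
    bits = 0.0
    for u, v in zip(path, path[1:]):
        strength = sum(weights[u].values())
        bits -= math.log2(weights[u][v] / strength)
    return bits


# Toy symmetric connectivity (hypothetical weights, four regions).
A = {
    0: {1: 2.0, 2: 0.5},
    1: {0: 2.0, 2: 2.0},
    2: {0: 0.5, 1: 2.0, 3: 4.0},
    3: {2: 4.0},
}
P_03, path_03 = shortest_path(A, 0, 3)
S_03 = search_information(A, path_03)
```

In this toy graph the shortest route from region 0 to region 3 passes through regions 1 and 2, and its search information grows with the number of competing edges at each node along the way, which is the sense in which the path is "hidden".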

We constructed models with all combinations of predictors, from single factor models to models with all five predictors. All trained models performed remarkably well on both preictal and ictal functional connectivity. The correlation between predicted and observed functional connectivity was consistently greater than *r* = 0.3734 in both datasets. Moreover, all models with combinations of at least two predictors produced predicted functional connectivity matrices that were correlated with the observed functional connectivity matrices at an *r* ≥ 0.5182 in ictal and preictal data (see **Figure 2**). The final model, which contained all predictors, performed remarkably well, and produced predicted functional connectivity matrices that were correlated with the observed functional connectivity matrices with an *r* ≥ 0.6647 across all frequency bands and across both ictal and preictal data.
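As a rough illustration of fitting every combination of predictors, the sketch below enumerates all 31 non-empty subsets of the five predictors and scores each ordinary least squares fit by the correlation between predicted and observed connectivity. The data are synthetic stand-ins, and the use of `numpy.linalg.lstsq` with an intercept is an assumption about how such a multi-linear model could be fit, not a description of the authors' code.

```python
from itertools import combinations

import numpy as np


def fit_and_score(predictors, fc):
    """Ordinary least squares fit with intercept.

    Returns (beta, Pearson r between predicted and observed connectivity).
    """
    X = np.column_stack([np.ones(len(fc))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, fc, rcond=None)
    r = np.corrcoef(X @ beta, fc)[0, 1]
    return beta, r


# Synthetic stand-ins for the vectorized upper triangles of D, G, S, P, F.
rng = np.random.default_rng(0)
n_edges = 200
features = {name: rng.normal(size=n_edges) for name in "DGSPF"}
fc = (
    0.5 * features["F"]
    - 0.3 * features["D"]
    + 0.2 * features["G"]
    + rng.normal(scale=0.5, size=n_edges)
)

# Fit every non-empty combination of predictors.
scores = {}
for k in range(1, len(features) + 1):
    for combo in combinations(features, k):
        _, r = fit_and_score([features[c] for c in combo], fc)
        scores[combo] = r

best_combo = max(scores, key=scores.get)
```

For in-sample fits like these, the full model can never score below a nested submodel, so comparisons between models of different size are more informative when made on held-out data.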

Next we asked whether predictions were stronger or more accurate during the ictal states or during the preictal states. To address this question, we defined the model quality as the correlation between the true functional connectivity and the predicted functional connectivity. Notably, we observed that model quality was significantly different in preictal and ictal data (*p <* 10^{−16} for all frequency bands using a paired *t*-test with Fisher’s *ρ*-to-*z* transform). Specifically, preictal predictions outperformed ictal predictions in the three higher frequency bands (*r*_{pi} − *r*_{i} *>* 0.0219) and ictal predictions outperformed preictal predictions in the lowest frequency band (*r*_{i} − *r*_{pi} = 0.0378). This pattern of observations suggests that there are marked differences in the utility of the underlying models in explaining brain dynamics across states, perhaps reflecting the existence of distinct biological processes associated with seizure generation, propagation, and termination.
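The statistical comparison described above can be sketched as follows: correlations are first passed through Fisher's transform, and a paired *t* statistic is then computed on the transformed differences. The example values are hypothetical, and the pairing unit (here one pair per model) is an assumption, since the text does not spell it out. The sketch stops at the *t* statistic and degrees of freedom and does not compute a *p*-value.

```python
import math
from statistics import mean, stdev


def fisher_z(r):
    """Fisher transform of a correlation coefficient (equivalent to atanh)."""
    return 0.5 * math.log((1 + r) / (1 - r))


def paired_t(r_a, r_b):
    """Paired t statistic on Fisher-transformed correlations.

    Returns (t, degrees of freedom).
    """
    diffs = [fisher_z(a) - fisher_z(b) for a, b in zip(r_a, r_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical model-quality values, one pair per model.
r_preictal = [0.62, 0.58, 0.66, 0.70, 0.61]
r_ictal = [0.60, 0.57, 0.63, 0.69, 0.58]
t_stat, dof = paired_t(r_preictal, r_ictal)
```

The transform matters because correlation coefficients are bounded and their sampling distribution is skewed near ±1; on the *z* scale the differences are closer to normally distributed, which is what the *t*-test assumes.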

### Preictal and ictal predictions generalize well

Thus far, we have shown that ictal and preictal functional connectivity can be partially explained by combinations of genetic, physical, and structural network features. We next turn to the question of whether predictions that are trained on one subset of subjects are also generalizable to other subsets of subjects. This question is important because such generalizability would be reflective of reliability and robustness of the architectural substrates for functional ECoG dynamics. To address this question, we chose to employ a standard cross-validation protocol in which one subject is “held-out” and a model is trained on the remaining subjects, and then the same model is tested on the single hold-out (see **Figure 3**).
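A minimal sketch of this hold-one-subject-out protocol is given below, assuming each subject contributes a predictor matrix (one row per electrode pair) and a vector of functional connection weights. The data are synthetic, the dictionary layout and function name are hypothetical, and ordinary least squares stands in for the multi-linear model.

```python
import numpy as np


def leave_one_subject_out(X_by_subject, y_by_subject):
    """For each subject, fit OLS on all other subjects and score on the held-out one."""
    scores = {}
    for held_out in X_by_subject:
        train = [s for s in X_by_subject if s != held_out]
        X_train = np.vstack([X_by_subject[s] for s in train])
        y_train = np.concatenate([y_by_subject[s] for s in train])
        A = np.column_stack([np.ones(len(y_train)), X_train])
        beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        X_test = X_by_subject[held_out]
        pred = np.column_stack([np.ones(len(X_test)), X_test]) @ beta
        # Model quality: correlation between predicted and observed connectivity.
        scores[held_out] = np.corrcoef(pred, y_by_subject[held_out])[0, 1]
    return scores


# Synthetic cohort: five subjects sharing the same underlying coefficients.
rng = np.random.default_rng(1)
true_beta = np.array([0.4, -0.2, 0.3])
X_by_subject, y_by_subject = {}, {}
for subject in range(5):
    n_edges = int(rng.integers(40, 80))  # electrode coverage differs per subject
    X = rng.normal(size=(n_edges, 3))
    X_by_subject[subject] = X
    y_by_subject[subject] = X @ true_beta + rng.normal(scale=0.3, size=n_edges)

cv_scores = leave_one_subject_out(X_by_subject, y_by_subject)
```

Splitting by subject rather than by electrode pair keeps every connection from the held-out patient out of training, so the score reflects generalization to an unseen individual.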

Using this leave-one-out cross-validation approach, we find that even single predictor models provide surprisingly accurate predictions of preictal and ictal functional connectivity matrices in this group-training setting. On average, all single predictor models produced predicted functional connectivity matrices that were correlated with the observed functional connectivity matrices at an *r* ≥ 0.3721 on preictal data in all frequency bands, and at an *r* ≥ 0.4180 on ictal data in all frequency bands (see **Figure 4**). Interestingly, the model that performed the best in this setting was in fact the same for all frequency bands and consisted of gene coexpression, maximum flow, and shortest path length, omitting the Euclidean distance and the search information. These results suggest that network geometry and the hiddenness of shortest paths in the network are less important for generalizable predictions than markers of the length of the true shortest path between regions, the total information flow between regions, and the correlation of gene expression profiles between regions.

Next we asked whether the quality of predictions based on the leave-one-out cross-validation was stronger or more accurate during the ictal states or during the preictal states. To address this question, we used the same definition of model quality: the correlation between the true functional connectivity and the predicted functional connectivity. Across all models, we find that ictal predictions significantly outperform preictal predictions in the lowest frequency band (*r*_{i} *– r*_{pi} = 0.0563, *p* = 2.77 × 10^{−20} using a paired *t*-test with Fisher’s *ρ*-to-*z* transform) while preictal predictions significantly outperform ictal predictions in all other frequency bands (*r*_{i} *– r*_{pi} *< –* 0.0136, *p <* 1.47 × 10^{−09} using a paired *t*-test with Fisher’s *ρ*-to-*z* transform). These results suggest that measures of brain structure, geometry, and genetics can accurately predict ECoG functional connectivity in both ictal and preictal states, with ictal dynamics being more easily explained at low frequencies and preictal dynamics being more easily explained at midrange and high frequencies.

### Single subject preictal and ictal predictions generalize well

A critical and pervasive question in clinical applications is whether and to what degree individual differences in brain architecture and dynamics determine the choice of treatment for a given patient, as well as their response to that treatment. The answer to this question depends in large part on the degree to which the architecture and function of the neural system is conserved or variable across individuals. The degree to which a model is dependent on individual heterogeneity can be assessed to some degree using leave-one-out cross-validation, as we discussed and implemented in the previous section. However, an even more stringent method to assess individual heterogeneity is to ask whether a model built from a single subject can be used to predict the neurophysiological dynamics of a different subject. To explicitly assess such fine-scale pairwise heterogeneity, we used the functional connectivity matrix of a single subject to predict the functional connectivity matrix of a different subject, and we quantified the quality of those predictions (see **Figure 5**).

As in the group training setting, we asked whether the quality of predictions based on the single-subject prediction approach was stronger or more accurate during the ictal states or during the preictal states. To address this question, we used the same definition of model quality: the correlation between the true functional connectivity and the predicted functional connectivity. Across all models of single subjects, we find that ictal predictions significantly outperform preictal predictions in the lowest frequency band (*r*_{i} *– r*_{pi} = 0.0752, *p* = 1.380 × 10^{−27} using a paired *t*-test with Fisher’s *ρ*-to-*z* transform) and also in the second lowest frequency band (*r*_{i} *– r*_{pi} = 0.0128, *p* = 2.276 × 10^{−4} using a paired *t*-test with Fisher’s *ρ*-to-*z* transform). In contrast, we find that preictal predictions significantly outperform ictal predictions in both of the higher frequency bands (*r*_{i} *– r*_{pi} *< –* 0.012, *p <* 4.075 × 10^{−8} using a paired *t*-test with Fisher’s *ρ*-to-*z* transform). Again, these results suggest that measures of brain structure, geometry, and genetics can accurately predict ECoG functional connectivity in both ictal and preictal states, with ictal dynamics being more easily explained at low frequencies and preictal dynamics being more easily explained at high frequencies.

Next, we turned to the important question of whether model performance differed significantly across different subjects. Even if the general pairwise predictive accuracy was quite good, it remains possible that one (or a few) subject(s) provided significantly poorer or significantly better predictions for other subjects. To assess such individual variability, we considered each subject separately, and then we trained a model on the functional connectivity of every other subject. Next, we ordered subjects by the value of the correlation coefficient between the predicted functional connectivity matrix and the true functional connectivity matrix (Fig. 6c,f). We found that the worst preictal predictions are significantly worse than the worst ictal predictions, and in fact, even the worst ictal predictions do reasonably well (*r*_{i} – *r*_{pi} > 0.311, *p* < 8.3 × 10^{−8} using a paired *t*-test for the ten worst predictions of each subject’s functional connectivity). These results provide some initial support for the notion that ictal data contains features of network dynamics that are more conserved across subjects than features of network dynamics in preictal data. While seizure patterns can be heterogeneous across different states, non-seizure states must encompass a much larger potential range of actions and, therefore, are less likely to be predicted by the multi-linear model used here.

### Beta values distinguish between preictal and ictal predictions

In the previous sections, we presented results on the performance of various models in predicting ECoG functional connectivity in preictal and ictal states. Now we turn to the important question of whether individual features in the models contribute equally to the ictal and preictal predictions. If mechanisms are conserved then all features should contribute similarly across ictal and preictal states, whereas if mechanisms differ then all features should contribute differentially to ictal and preictal states. We hypothesized that mechanisms would differ. Specifically, based on recent studies of the relation between white matter and seizure duration or severity [10,11], we hypothesized that ictal functional connectivity would rely more heavily on the brain’s structural network architecture and communication policies atop that architecture than preictal functional connectivity.

To test this hypothesis, we train a model with every predictor matrix separately on each frequency band of the preictal and ictal data, and we obtain *β* weights placed on each predictor matrix. This framework allowed us to isolate the redundant, synergistic, and unique contributions made by individual factors, and as they appeared in models either in isolation or together with other factors. To show robustness of these results to variability in the underlying data, we train separate models while omitting one subject in each model. We then compute the average normalized difference between *β* weights assigned to each parameter in the model for ictal and preictal datasets with all parameters present.

When training on ictal data compared to preictal data, we find significantly lower *β* weights associated with Euclidean distance (*β*_{ictal} − *β*_{preictal} *<* −0.063, *p* < 5.8242 × 10^{−13} for all frequency bands using a paired *t*-test). We also find significantly lower *β* weights associated with gene coexpression (*β*_{ictal} − *β*_{preictal} *<* −0.1136, *p <* 2.204 × 10^{−19} for the two highest frequency bands using a paired *t*-test) when training on ictal data compared to preictal data. Simultaneously, we found significantly higher *β* weights associated with maximum flow (*β*_{ictal} − *β*_{preictal} *>* 0.0902, *p <* 3.080 × 10^{−16} for the two highest frequency bands using a paired *t*-test) when training on ictal data compared to preictal data. Consistent with our hypothesis, these data demonstrate that ictal functional connectivity relies more heavily on the brain’s structural network architecture (and communication policies putatively enacted upon it) than preictal functional connectivity. Moreover, the data demonstrate that preictal functional connectivity depends more heavily on conserved patterns of gene coexpression than ictal functional connectivity.

## Discussion

In this work, we trained multi-linear regression models to predict the architecture of functional brain networks constructed from preictal and ictal epochs extracted from electrophysiological (ECoG) recordings in 25 patients with medication refractory epilepsy. We hypothesized that the performance of these models would differ in functional networks constructed from ictal epochs and functional networks constructed from preictal epochs. By quantifying the strength of the model using the correlation coefficient between the true functional connectivity and the predicted functional connectivity, we find that model strength is significantly different in the ictal and preictal data, but that the most powerful models in both ictal and preictal data reach the same level of prediction accuracy. Critically, we find notable differences in the relative weight placed on structural, geometric, and genetic features between ictal and preictal data. Significantly higher weight on structural features in the ictal data suggests that the functional brain network dynamics manifest in seizures rely on – and may directly propagate along – underlying white matter structure that is largely conserved across humans, in a stereotyped manner consistent with specific putative communication policies. Significantly higher weight on Euclidean distance and gene coexpression in the preictal data suggests that the more general functional brain network dynamics occurring outside of the seizure state rely on conserved principles of geometry and shared genetic constraints. Collectively, our findings fundamentally extend our understanding of how seizure dynamics depend upon common conserved anatomical organization while breaking with constraints of local geometry and intrinsic genetic markers of neurophysiological function.

### ECoG functional connectivity is partially explained by brain structure, geometry, and genetics

Across both ictal and preictal states, even single parameter models perform remarkably well in predicting ECoG functional connectivity, and in fact, also generalize well from a single sample, here reflecting a single subject. The consistently high prediction accuracy across models is particularly notable given that the features are not specific to the subjects being tested. Indeed, gene coexpression data is taken from the Allen Brain Institute, and represents an agglomeration over a set of healthy adult donors [36, 35, 37]. Similarly, the structural features that we study – and that reflect various communication policies – are computed from the average structural connectivity of a healthy adult population [38, 39, 40]. The fact that ECoG functional connectivity is so well-predicted by models derived from these features suggests that there exist highly conserved principles of human brain anatomy, geometry, and genetics that form pervasive constraints on neurophysiological dynamics in disease across both persistent baseline and transient pathological states. It would be interesting in future work to examine the degree to which these models could potentially be enhanced with subject-level structural data, and to determine whether the same models could be used to predict different phases of seizure initiation, propagation, and termination [16,17].

### The role of white matter architecture in ictal network dynamics

We found that structural connectivity features were weighted far greater in models of the seizure state than in models of the preictal state. Interestingly, this observation suggests that the underlying anatomy of white matter structure plays an enhanced role in governing the behavior of neurophysiological dynamics during seizures. Notably, we found the greatest difference in the weights of the maximum flow feature, which was the most topologically global of the structural measures that we studied. Commonly used in information theory [41], maximum flow measures the total amount of some item that can be sent along the network from one point to another by treating the strength of edges as capacities for that edge [42]. In the context of neurophysiological processes, this feature could be considered a medium for the evolution and consequent propagation of seizure dynamics through the network [16,17].

More generally, we found that structural connectivity features varied in terms of their ability to accurately predict functional connectivity, with path length and search information performing slightly better than max flow. Along with metrics such as navigability [43], diffusion [44], and communicability [45], these measures, and the assumptions about neural communication that they entail [46], have all been used in past studies to relate structural and functional connectivity (usually estimated at rest), *via* a particular communication strategy or policy [47, 48, 49]. Typically, measures that are strongly associated with functional connectivity are also treated as being more likely to represent the true underlying process by which that functional connectivity pattern was generated. Here, however, in addition to comparing overall fit, we compare a sub-set of these measures and their sensitivities to seizure dynamics, identifying max flow as the measure with the greatest sensitivity, suggesting that it may be of greater relevance. In the future, explicitly causal and mechanistic models could be used to more deliberately test this hypothesis [50].

In comparing ictal and preictal regression weights, the emphasis on structural connectivity is coupled with a relative disregard for the constraints of gene coexpression as well as the physical distance between regions. This latter independence of dynamics from geometry is striking, particularly in light of the fact that physical distance between brain regions seems to be a pervasive constraint in healthy functional connectivity [51]. Indeed, much theoretical work has sought to frame the dependence of connectivity on distance in light of a trade-off between communication cost and communication efficiency [52]. Interestingly, our results suggest that functional connectivity during ictal periods is more strongly predicted by the underlying topological pattern of white-matter connections than by any signal conduction effects through the physical volume of the brain [10,11]. This insight suggests that functional connections observed during a seizure may actually reflect brain dynamics in more distant regions than commonly expected, underscoring the increasing need to understand distributed markers of pathology [12,13].

### Clinical applicability

In addition to offering insights into the generic architectural principles of the human brain that impact seizure dynamics, our methodology could be used to extend our understanding, models, and predictions of subject-level pathology and response to intervention in future applications. By combining our general computational framework with the goals of virtual cortical resection [53], it could be useful to test how the resection of specific cortical or subcortical volumes affects various structural measures in a manner that might be predicted to hamper seizure spread by decrementing the correspondence between ictal dynamics and white matter architecture. Furthermore, a marked limitation of current IEEG studies, both for epilepsy surgery and for basic cognitive neuroscience, is the fact that ECoG measurements provide only partial brain coverage, thereby yielding incomplete representations of epileptic networks, in some cases even missing the seizure onset zone itself or other regions of seizure spread [16]. It would be interesting to use our model to predict functional connectivity in areas of the brain that are not covered by stereo or grid electrodes, or perhaps to inform which areas of the brain absolutely require coverage in order for us to obtain accurate assessments of the pathology or predictions of response to resection or ablation. Finally, our model could be used to determine which time periods throughout the seizure or in preictal states are most relevant to the pathology of the individual, rather than to pathology shared across patients with focal epilepsy. Given the differential predictions of generic structural, geometric, and genetic data for ictal vs. preictal dynamics in low vs. high frequency bands, our results suggest that individual-specific information might be most prevalent in high frequency ictal dynamics and low frequency preictal dynamics. In future work, this suggestion could be directly tested and evaluated for utility in clinical contexts.

### Methodological considerations

Composite structural connectivity was used to compute the structural parameters for each model, meaning that none of the individual subject data has been incorporated into the training of this model. Replacing average connectivity with subject-specific connectivity might allow trained models to better predict functional connectivity of a single subject, and thus perhaps identify drivers of the epileptic network. Additionally, subjects included in this study had seizures originating in different regions of the brain, which may worsen the quality of the predictions made. To realize potential clinical uses, future work will require incorporating further unique subject information, including patient specific imaging data, to improve the specificity of results.

A second limitation concerns the gene coexpression data. These data were obtained from the Allen Brain Institute Human Brain Atlas [35] and represent the average coexpression pattern of only two individual subjects that differed in age and demographics. It remains unclear whether the coexpression patterns obtained from this small but unique dataset are, in fact, representative of typical individuals.

A third limitation concerns the reconstruction of white matter fiber networks from diffusion MRI data using tractography. This reconstruction procedure is prone to systematic errors [54,55,56], which may introduce bias into any measure computed from a white matter network, including the measures of search information, max flow, and path length studied here. Although these biases remain potential confounds, advances in acquisition hardware, processing strategies [57], and tractography algorithms [58] will help mitigate these issues in future work.

## Conclusion

Medically refractory epilepsy remains a critical burden to our society, and to the patients who suffer from it. Novel technologies, including enhancements to resection, laser ablation, and neurostimulation, afford increasing specificity for treatment but critically depend on the accurate fusion of multimodal neuroimaging data to map anatomical targets driving seizure dynamics. Here we propose a parsimonious model that explains how the large-scale anatomical networks and shared genetic constraints shape inter-regional communication in focal epilepsy. The work offers broad insights into the generic architectural principles of the human brain that impact seizure dynamics, and could be extended to further our understanding, models, and predictions of subject-level pathology and response to intervention.

## Conflict of Interest

The authors declare that they have no conflicts of interest related to this work.

## Materials and Methods

### ECoG data collection and preprocessing

#### Ethics Statement

All patients at the Hospital of the University of Pennsylvania included in this study gave written informed consent in accordance with the Institutional Review Board of the University of Pennsylvania.

#### Electrophysiology recordings

Twenty-five patients (twenty at the University of Pennsylvania and five at the Mayo Clinic) undergoing surgical treatment for medically refractory epilepsy underwent implantation of subdural and depth electrodes to localize the seizure onset zone, after presurgical evaluation with scalp EEG recording of ictal epochs, MRI, PET, and neuropsychological testing suggested that focal resection may be a therapeutic option. Patients were then deemed candidates for implantation of intracranial electrodes to better define the epileptic network. De-identified patient data were retrieved from the online International Epilepsy Electrophysiology Portal (IEEG Portal) [59].

For patients at the University of Pennsylvania, ECoG signals were recorded and digitized at 500 Hz sampling rate using Nicolet C64 amplifiers and pre-processed to eliminate line noise. Cortical surface electrode configurations (Ad Tech Medical Instruments, Racine, WI), determined by a multidisciplinary team of neurologists and neurosurgeons, consisted of linear and two-dimensional arrays (2.3 mm diameter with 10 mm inter-contact spacing) and linear depths (1.1 mm diameter with 10 mm inter-contact spacing). Signals were recorded using a referential montage with the reference electrode being chosen by the clinical team to be distant to the site of seizure onset. Recording spanned the duration of a patient’s stay in the epilepsy monitoring unit. We note that some recording information was unavailable for patients treated at the Mayo Clinic.

#### Epileptic events

From patients at the University of Pennsylvania, we analyzed 41 partial seizures (simple and complex) and 57 partial seizures that generalized to surrounding tissue. Of the 20 Hospital of the University of Pennsylvania epilepsy patients in the study cohort, 8 patients exhibited strictly complex-partial seizures that secondarily generalized (distributed events), 6 patients exhibited strictly simple-partial or complex-partial seizures that did not secondarily generalize (focal events), and 6 patients exhibited a combination of distributed events and focal events. Seizure type, onset time, and onset localization were marked as a part of routine clinical workup. Patient demographic data for these patients is included in Table 1.

In the IEEG Portal, information regarding the type and number of seizures is unavailable for patients treated at the Mayo Clinic. For these patients, we therefore present a limited set of demographic data; see Table 2.

#### Clinical marking of the seizure onset zone

The seizure state was deemed to span between the clinically-marked earliest electrographic change (EEC) [60] and seizure termination. The pre-seizure state was the period immediately preceding the seizure state and of equal duration. We refer to the pair of pre-seizure and corresponding seizure states as an *event*. For patients at the University of Pennsylvania, the seizure onset zone was marked on the Intracranial EEG (IEEG) according to standard clinical protocol in the Penn Epilepsy Center. Initial clinical markings are made on the IEEG the day of each seizure by the attending physician: a board certified, staff epileptologist responsible for that inpatient’s care. Each week these IEEG markings are vetted in detail, and then finalized at surgical conference according to a consensus marking of 4 board certified epileptologists. These markings on the IEEG are then related to other multi-modality testing, such as brain MRI, PET scan [61], neuropsychological testing, and ictal SPECT scanning to finalize surgical approach and planning. This process is the standard of clinical care stipulated by the National Association of Epilepsy Centers (NAEC), and upheld at all certified Level-4 epilepsy centers in the United States. Information regarding seizure onset marking is unavailable for patients treated at the Mayo Clinic.

#### ECoG pre-processing

Artifactual channels were discarded and the remaining channels were referenced to the average signal, pre-whitened by retaining the residuals after fitting a first-order autoregressive model to the referenced time series, stop-filtered to remove line noise and its harmonics, and bandpass filtered into canonical frequency bands associated with seizure activity: *α – θ* (5-15 Hz); *β* (15-25 Hz); low-*γ* (30-40 Hz); high-*γ* (95-105 Hz). For each subject and for each trial, we computed the inter-electrode functional connectivity as a zero-lag Pearson correlation coefficient averaged across trials.
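The referencing, pre-whitening, and correlation steps described above can be sketched as follows. This is a minimal illustration on synthetic data, not the pipeline used in the study: the function names are hypothetical, the AR(1) coefficient is assumed to be estimated by ordinary least squares, and the notch and band-pass filtering stages are omitted for brevity.

```python
import numpy as np

def common_average_reference(x):
    """Subtract the across-channel mean at every sample (x: channels x samples)."""
    return x - x.mean(axis=0, keepdims=True)

def ar1_prewhiten(x):
    """Fit a first-order autoregressive model per channel and keep the residuals."""
    past, present = x[:, :-1], x[:, 1:]
    # Least-squares AR(1) coefficient for each channel (assumed estimator).
    phi = (past * present).sum(axis=1) / (past * past).sum(axis=1)
    return present - phi[:, None] * past

def zero_lag_connectivity(x):
    """Zero-lag Pearson correlation between all pairs of channels."""
    return np.corrcoef(x)

# Toy data: four strongly autocorrelated channels (random walks).
rng = np.random.default_rng(0)
signals = rng.standard_normal((4, 2000)).cumsum(axis=1)

clean = ar1_prewhiten(common_average_reference(signals))
fc = zero_lag_connectivity(clean)
```

In a fuller version, the filtering stages would be applied between pre-whitening and the correlation step, once per frequency band.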

#### Mapping electrode locations

Electrode locations were identified via thresholding of each patient’s post-implant CT image. Next, each patient’s CT and T1-weighted MRI images were co-registered via 3D rigid affine registration. These T1-weighted MRI images were then aligned to the standard MNI brain using diffeomorphic registration with the symmetric normalization (SyN) method [62]. The resulting transformations were used to warp the coordinates of the electrode centroids into MNI space. These locations were subsequently mapped to the MNI standard coordinate system using the FSL function `img2stdcoord`. We compared each electrode’s location in MNI space to points (vertices) on the *fsaverage* pial surface, and assigned each vertex to an electrode if the Euclidean distance between the two was less than 5 mm. Each surface vertex was also assigned to one of *N* = 114 cortical regions as defined by the AAL atlas [63], thereby making it possible to map electrodes to brain regions.
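The distance-based assignment of surface vertices to electrodes might look like the sketch below. The function name is hypothetical, and resolving a vertex that lies within 5 mm of several electrodes by choosing the nearest one is an assumption; the text does not state how such cases were handled.

```python
import numpy as np

def assign_vertices(electrodes_mm, vertices_mm, max_dist_mm=5.0):
    """For each vertex, return the index of the nearest electrode that lies
    closer than max_dist_mm, or -1 if no electrode is close enough."""
    diff = vertices_mm[:, None, :] - electrodes_mm[None, :, :]
    dist = np.linalg.norm(diff, axis=2)            # vertices x electrodes
    nearest = dist.argmin(axis=1)
    within = dist[np.arange(len(vertices_mm)), nearest] < max_dist_mm
    return np.where(within, nearest, -1)

# Toy coordinates in MNI-like millimetre units.
electrodes = np.array([[0.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
vertices = np.array([[1.0, 0.0, 0.0], [19.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
assignment = assign_vertices(electrodes, vertices)
```

Combining this assignment with a per-vertex region label then yields an electrode-to-region lookup.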

#### Group-aggregated ECoG functional connectivity

For every pair of brain regions, *i* and *j*, and for each subject independently, we identified all electrode pairs, *u* and *v*, where electrode *u* was assigned to region *i* and electrode *v* was assigned to region *j*. We estimated their average connection weights to generate a subject-specific inter-regional ECoG functional connectivity matrix. We then estimated the connection weight *A*_{ij} in the group-aggregated ECoG matrix by averaging connection weights over all subjects. We repeated this procedure separately for each of the four frequency bands, resulting in band-limited, whole-brain, inter-regional ECoG functional connectivity matrices.
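A minimal sketch of this two-stage averaging is given below. The function names are hypothetical, and two details are assumptions rather than statements from the text: region pairs that a subject did not sample are marked as missing, and the group average for a region pair is taken over only those subjects that sampled it.

```python
import numpy as np

def regional_fc(electrode_fc, electrode_region, n_regions):
    """Average electrode-pair weights within each pair of distinct regions."""
    out = np.full((n_regions, n_regions), np.nan)   # NaN marks unsampled pairs
    for i in range(n_regions):
        u = np.flatnonzero(electrode_region == i)
        for j in range(n_regions):
            v = np.flatnonzero(electrode_region == j)
            if i != j and u.size and v.size:
                out[i, j] = electrode_fc[np.ix_(u, v)].mean()
    return out

def group_average(subject_matrices):
    """Average each region pair over the subjects that sampled it."""
    stack = np.stack(subject_matrices)
    counts = (~np.isnan(stack)).sum(axis=0)
    total = np.nansum(stack, axis=0)
    return np.where(counts > 0, total / np.maximum(counts, 1), np.nan)

# Two toy subjects with three electrodes each and three regions overall.
fc1 = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.4], [0.2, 0.4, 1.0]])
fc2 = np.array([[1.0, 0.5, 0.1], [0.5, 1.0, 0.3], [0.1, 0.3, 1.0]])
subj1 = regional_fc(fc1, np.array([0, 0, 1]), n_regions=3)
subj2 = regional_fc(fc2, np.array([0, 1, 2]), n_regions=3)
group = group_average([subj1, subj2])
```

The same two functions would be called once per frequency band.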

### Diffusion imaging data collection and preprocessing

To study conserved structural features that may constrain ECoG network dynamics, we analyzed a group-representative, whole-brain structural connectivity network following [28]. Specifically, we built a group-representative connectome by combining single-subject data from a cohort of 30 healthy adult participants. Each participant’s structural network was reconstructed from diffusion spectrum images (DSI) in conjunction with state-of-the-art tractography algorithms to estimate the location and strength of large-scale interregional white matter pathways. Study procedures were approved by the Institutional Review Board of the University of Pennsylvania, and all participants provided informed consent in writing. Details of the acquisition and reconstruction have been described elsewhere [38,39,40]. We studied a division of the brain into *N* = 114 cortical regions [64]. Based on this division, we constructed for each individual an undirected and weighted connectivity matrix, *A* ∈ ℝ^{N ×N}, whose edge weights were equal to the number of streamlines detected between region *i* and region *j*, normalized by the geometric mean of region *i*’s and region *j*’s volumes: *A*_{ij} = *n*_{ij} / √(*v*_{i} *v*_{j}), where *n*_{ij} is the number of streamlines detected between regions *i* and *j* and *v*_{i} is the volume of region *i*.

The resulting network was undirected (i.e., *A*_{ij} = *A*_{ji}). These individual-level networks were then aggregated to form a group-representative network. This aggregation procedure can be viewed as a distance-dependent consistency thresholding of connectome data and the details have been described elsewhere [65, 39]. The resulting group-representative network has the same number of binary connections as the average individual and the same edge length distribution. This type of non-uniform consistency thresholding has been shown to be superior to other, more commonly used forms of thresholding [66].

### Gene coexpression data collection and preprocessing

To study conserved genetic features that may constrain ECoG network dynamics, we constructed a correlation matrix of brain regions’ gene expression profiles using a similar approach and following [28]. We used normalized microarray data available from the Allen Brain Institute (http://human.brain-map.org/static/download) [36, 35, 37]. The full dataset includes six donor brains (aged 18 to 68 years) for which spatially-mapped microarray data were obtained (≈ 60,000 RNA probes). We focused on donors `10021` and `9861` which included samples (893 and 946 sites, respectively) from both the left and right hemispheres. Subsequently, we retained only those samples in the cerebral cortex [30]. Next, we extracted expression profiles for each sample, averaged over duplicate genes, and standardized expression levels across samples as *z*-scores. The standardized measure of any sample, then, measured to what extent a particular gene was differentially expressed at that cortical location relative to the other cortical locations in both hemispheres.

In addition to microarray data, the Allen Brain Institute also provides coordinates representing the location in MNI space where each sample was collected. This information facilitated the mapping of sample sites to brain regions in a procedure exactly analogous to our approach for mapping ECoG electrodes. As a result, we obtained representative expression profiles for each brain region (provided there were nearby samples). For each of the two donor brains, we calculated the region-by-region correlation matrix of standardized expression profiles. Due to the overall density of the whole-brain sampling, we were able to generate an estimate of gene expression correlation (a measure of similarity) for 6286 of 6441 possible region pairs (≈97.6%).
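The standardization and region-by-region correlation can be illustrated on random data as follows. The function names are hypothetical, and taking the mean of the samples assigned to a region as its representative profile is an assumption. Regions without samples remain undefined (NaN), which is how some of the 6441 = 114 × 113 / 2 possible region pairs can end up without an estimate.

```python
import numpy as np

def zscore_across_samples(expr):
    """Standardize each gene (column) across samples (rows)."""
    return (expr - expr.mean(axis=0)) / expr.std(axis=0)

def region_coexpression(expr_z, sample_region, n_regions):
    """Average sample profiles per region, then correlate the region profiles."""
    profiles = np.full((n_regions, expr_z.shape[1]), np.nan)
    for r in range(n_regions):
        rows = expr_z[sample_region == r]
        if len(rows):
            profiles[r] = rows.mean(axis=0)
    # Rows and columns of regions without samples stay NaN.
    return np.corrcoef(profiles)

rng = np.random.default_rng(1)
expr = rng.standard_normal((12, 30))                  # 12 samples x 30 genes
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2])  # region 3 has no samples
z = zscore_across_samples(expr)
coexp = region_coexpression(z, labels, n_regions=4)
n_pairs = 114 * 113 // 2                              # region pairs for N = 114
```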

### Gene-ECoG optimization procedure

Following [28], we identified a subset of genes that were related to resting state ECoG functional connectivity in a separate cohort of patients and from clips far from any ictal activity. Then, we used this subset of genes to construct the gene coexpression matrix utilized in our models. In general, we sought the list of *K* genes, Γ^{K} = {*g*_{1},…, *g*_{K}} whose brain-wide coexpression matrix was maximally correlated with resting state ECoG functional connectivity in this separate cohort. While the exact solution of this optimization problem is computationally intractable (the full list included 29130 genes), we could define an objective function and use numerical methods to obtain an approximate solution.

The objective function we sought to maximize was defined as follows. Let *G*_{1}(Γ) and *G*_{2}(Γ) be the gene coexpression matrices for each of the two donor brains calculated using the gene list, Γ. We can then vectorize each matrix by extracting its upper triangle of non-zero elements. Then, after doing the same for the resting state ECoG functional connectivity matrix from the separate cohort, **A**^{ECoG}, we can calculate the correlation of gene expression with ECoG functional connectivity, resulting in two correlation coefficients *ρ*_{1} and *ρ*_{2}. In general, we wish for the magnitudes of *ρ*_{1} and *ρ*_{2} to be as large as possible. Accordingly, we defined our objective function to be *F* (*ρ*_{1}, *ρ*_{2}) = min(*ρ*_{1}, *ρ*_{2}), so that the correspondence of any gene list, Γ, with ECoG functional connectivity is only as good as the worse of the two donor brains’ correlations.
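As a rough illustration, the objective can be written in a few lines, where larger values indicate better correspondence for both donors. The function names and the toy data are hypothetical, and the sketch assumes that every region has an expression profile so that no entries are missing.

```python
import numpy as np

def upper_triangle(m):
    """Vectorize the strictly upper-triangular entries of a square matrix."""
    return m[np.triu_indices_from(m, k=1)]

def objective(expr_donor1, expr_donor2, fc, gene_idx):
    """F = min(rho_1, rho_2): the worse of the two donors' correlations
    between gene coexpression and functional connectivity."""
    fc_vec = upper_triangle(fc)
    rhos = []
    for expr in (expr_donor1, expr_donor2):        # expr: regions x genes
        coexp = np.corrcoef(expr[:, gene_idx])     # region x region
        rhos.append(np.corrcoef(upper_triangle(coexp), fc_vec)[0, 1])
    return min(rhos)

# Toy example: donor 2 is a noisy copy of donor 1, and the "functional
# connectivity" is built from the first ten genes of donor 1.
rng = np.random.default_rng(2)
expr1 = rng.standard_normal((8, 40))
expr2 = expr1 + 0.1 * rng.standard_normal((8, 40))
fc = np.corrcoef(expr1[:, :10])
f_value = objective(expr1, expr2, fc, np.arange(10))
```

Because donor 1 reproduces the toy connectivity exactly, the value returned here is determined by donor 2, which is the behavior the min is designed to enforce.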

As noted earlier, optimizing this function is computationally intractable, so we used a simulated annealing algorithm to generate estimates of the solution. In general, simulated annealing works by proposing initial estimates of the solution (that are usually poor), making small changes to these estimates and evaluating whether or not these changes improve the estimate. The algorithm begins in a “high temperature” phase, during which even changes that result in inferior estimates can be accepted, making it possible to explore the landscape of possible solutions. Gradually, a temperature parameter is reduced so that in later phases only solutions that result in improvements are accepted.

In our case, the algorithm was initialized with a temperature of *t*_{0} = 2.5 and a randomly-generated list of *K* genes, Γ, which represented our initial estimate of the solution. From this list we constructed matrices *G*_{1}(Γ) and *G*_{2}(Γ), calculated *ρ*_{1} and *ρ*_{2}, and then evaluated the objective function, *F* (*ρ*_{1}, *ρ*_{2}). With each iteration, the temperature was reduced slightly (*t*_{i} = *t*_{i−1} × 0.99975) and one gene randomly selected from Γ was replaced with a novel gene. We then used this new list, Γ*′*, to construct *G*_{1}(Γ*′*) and *G*_{2}(Γ*′*), from which we obtained a new value of the objective function, *F′* = *F* (*ρ′*_{1}, *ρ′*_{2}). If *F′* > *F*, then we replaced Γ with Γ*′* and the algorithm proceeded to the next iteration. Otherwise, we accepted Γ*′* with probability exp(−[*F* − *F′*]/*t*_{i}), where *t*_{i} is the temperature at the current iteration. The algorithm continued for either 200000 total iterations or 10000 consecutive iterations with no change in Γ.
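The annealing loop can be sketched generically as below, using the starting temperature, cooling factor, and stopping limits stated in the text as defaults. The function name and the `score` callable are hypothetical, and the acceptance rule for inferior proposals is assumed to be the standard Metropolis form, exp(−(*F* − *F′*)/*t*). The toy demonstration uses a faster cooling rate and a trivial score purely so that it finishes quickly.

```python
import math
import random

def anneal_gene_list(score, n_genes, k, t0=2.5, cooling=0.99975,
                     max_iter=200_000, patience=10_000, seed=0):
    """Search for a list of k distinct genes that maximizes score(list)."""
    rng = random.Random(seed)
    current = rng.sample(range(n_genes), k)       # random initial gene list
    f_current = score(current)
    t, stale = t0, 0
    for _ in range(max_iter):
        t *= cooling                              # geometric cooling schedule
        # Propose: swap one gene in the list for a gene not yet in it.
        in_list = set(current)
        new_gene = rng.randrange(n_genes)
        while new_gene in in_list:
            new_gene = rng.randrange(n_genes)
        proposal = current.copy()
        proposal[rng.randrange(k)] = new_gene
        f_new = score(proposal)
        # Accept improvements outright; accept worse lists with a
        # temperature-dependent probability (assumed Metropolis rule).
        if f_new > f_current or rng.random() < math.exp(-(f_current - f_new) / t):
            current, f_current, stale = proposal, f_new, 0
        else:
            stale += 1
            if stale >= patience:                 # no change for too long
                break
    return current, f_current

# Toy score: each gene carries a fixed weight and the score is their sum.
weights = [g / 1000 for g in range(50)]
toy_score = lambda genes: sum(weights[g] for g in genes)
best, f_best = anneal_gene_list(toy_score, n_genes=50, k=5,
                                cooling=0.99, max_iter=5000)
```

In the setting described here, `score` would wrap the min-correlation objective evaluated on the two donor coexpression matrices.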

The result of simulated annealing will usually vary somewhat from run to run. Accordingly, we repeated the algorithm 50 times. We also varied the number of genes, *K*, from 10 to 360 in increments of 10. We chose the optimal *K* to be the value at which the objective function was on average greatest over the 50 repetitions. Rather than treat any of the 50 estimated solutions as representative, we calculated how frequently each gene appeared across the ensemble of all 50 solutions, and we compared this frequency to what we would expect in 50 samples of *K* genes. We retained only those genes that appeared more frequently than expected (false discovery rate controlled at *q* = 0.05). These genes represented the “optimized list,” from which we constructed the gene coexpression matrix **G** = [G_{ij}], given by the Pearson correlation between the gene expression profiles of region *i* and region *j*.
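One way to carry out the frequency comparison is sketched below. The text does not name the statistical test, so two choices here are assumptions: the null count of a gene across runs is modeled as Binomial(50, *K*/*n*), where *n* is the total number of genes, and the false discovery rate is controlled with the Benjamini-Hochberg procedure. Function names are hypothetical.

```python
import math

def binomial_upper_tail(count, n_runs, p):
    """P(X >= count) for X ~ Binomial(n_runs, p)."""
    return sum(math.comb(n_runs, x) * p**x * (1 - p)**(n_runs - x)
               for x in range(count, n_runs + 1))

def benjamini_hochberg(p_values, q=0.05):
    """Indices of hypotheses rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=p_values.__getitem__)
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff = rank                # largest rank passing the threshold
    return sorted(order[:cutoff])

# Toy ensemble: 1000 genes, lists of K = 10 genes, 50 annealing runs.
n_genes, k, n_runs = 1000, 10, 50
counts = [40, 1] + [0] * (n_genes - 2)   # gene 0 recurs far more than chance
p_values = [binomial_upper_tail(c, n_runs, k / n_genes) for c in counts]
selected = benjamini_hochberg(p_values, q=0.05)
```

Only the gene that recurs in 40 of the 50 toy runs survives the correction; a gene that appears once is well within chance expectation.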

### Generation of Geometric and Structural Network Features

In this section, we provide formal definitions for the structural network features that we utilized in our multi-linear regression models: **D** = [D_{ij}], the Euclidean distance between region *i* and region *j*; **G** = [G_{ij}], the Pearson correlation between gene expression profiles of region *i* and region *j*; **S** = [S_{ij}], the search information between region *i* and region *j*, a measure of the “hiddenness” of the shortest anatomical path between regions; **P** = [P_{ij}], the length of the shortest path between region *i* and region *j*; and **F** = [F_{ij}], the maximum flow between region *i* and region *j*, a measure that treats edge weights as capacities and sends units of flow from region *i* to region *j* while respecting maximum capacities.

#### Euclidean Distance

To estimate the *N* ×*N* matrix **D** of Euclidean distances, we considered the center of mass (COM) of each region of interest in the whole brain parcellation, and we calculated the 3-dimensional Euclidean distance between all possible pairs of COMs.

#### Path Length

To estimate the *N* ×*N* matrix **P**=[P_{ij}] of path lengths, we calculated the shortest path length between region *i* and region *j* in the group-representative structural adjacency matrix **A** using Dijkstra’s algorithm.
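A compact version of Dijkstra's algorithm on a weighted adjacency matrix is sketched below. Because the entries of **A** are connection strengths rather than distances, some weight-to-length transform is needed; the reciprocal used here is a common convention and an assumption, as the text does not specify one. The function name is hypothetical.

```python
import heapq
import math

def shortest_path_lengths(weights, source):
    """Dijkstra's algorithm from one source node, with edge length 1/weight."""
    n = len(weights)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v in range(n):
            w = weights[u][v]
            if u != v and w > 0:
                candidate = d + 1.0 / w   # assumed weight-to-length transform
                if candidate < dist[v]:
                    dist[v] = candidate
                    heapq.heappush(heap, (candidate, v))
    return dist

# Toy network: a chain 0 - 1 - 2 with weights 2 and 1.
toy = [[0, 2, 0],
       [2, 0, 1],
       [0, 1, 0]]
lengths_from_0 = shortest_path_lengths(toy, source=0)
```

Calling the function once per source node fills the full matrix of path lengths.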

#### Search information

Anatomical connectivity matrices obtained from diffusion imaging data and reconstructed using deterministic tractography are usually sparse, meaning that only a fraction of all possible connections exist [67,68]. Rather than only use the sparse connectivity matrix to predict ECoG functional connectivity, we also generated a full matrix, *S*, whose element *S*_{ij} indicates the information (in bits) required to follow the shortest path from node *i* to node *j* [69]. Let *π*_{s→t} = {*A*_{si}, *A*_{ij},…, *A*_{kt}} be the series of structural edges that are traversed along the shortest path from a source node, *s*, to a different target node, *t*, and Ω_{s→t} = {*s*, *i*, *j*,…, *k*, *t*} be the sequence of nodes along the same path. The probability of following this path under random walk dynamics is given by *P* (*π*_{s→t}) = ∏_{i ∈ Ω^{∗}_{s→t}} *π*^{(1)}_{i→t} / *s*_{i}, where *s*_{i} = ∑_{j} *A*_{ij} is the weighted degree of node *i*, *π*^{(1)}_{i→t} is the first edge on the shortest path from *i* to *t*, and Ω^{∗}_{s→t} is the shortest path node sequence excluding the target node. The amount of information (in bits) required to access this shortest path, then, is given by *S*(*π*_{s→t}) = −log_{2}(*P* (*π*_{s→t})). We can treat every pair of nodes *i*, *j* as the source and target, respectively, and (provided that there exists a unique shortest path from node *i* to node *j*) we can compute *S*(*π*_{i→j}) for all such pairs. The resulting matrix, *S*, termed “search information”, has been shown to partially predict BOLD FC [47] and may be modulated in certain neurological disorders [70].
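To make the definition concrete, the sketch below computes search information for a single source-target pair. It takes the information to be the negative base-2 logarithm of the path probability, so that the result is non-negative. The function names are hypothetical, edge lengths are assumed to be reciprocal weights, and when several shortest paths tie, the one Dijkstra's algorithm happens to find is used.

```python
import heapq
import math

def shortest_path(weights, source, target):
    """Node sequence of a shortest path under edge lengths 1/weight (assumed)."""
    n = len(weights)
    dist, prev = [math.inf] * n, [None] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v in range(n):
            if u != v and weights[u][v] > 0 and d + 1.0 / weights[u][v] < dist[v]:
                dist[v], prev[v] = d + 1.0 / weights[u][v], u
                heapq.heappush(heap, (dist[v], v))
    if math.isinf(dist[target]):
        return None                       # target unreachable
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def search_information(weights, source, target):
    """Bits needed for a random walker to follow the shortest path."""
    path = shortest_path(weights, source, target)
    if path is None:
        return math.inf
    log_p = 0.0
    for node, nxt in zip(path[:-1], path[1:]):
        strength = sum(weights[node])     # weighted degree s_i
        log_p += math.log2(weights[node][nxt] / strength)
    return -log_p

# Toy chain 0 - 1 - 2 with unit weights: the walker has one choice at node 0
# and two equally likely choices at node 1, so the path costs one bit.
chain = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
bits = search_information(chain, 0, 2)
```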

#### Maximum Flow

Search information, while useful, is affected only by the edges along the shortest path. To incorporate information from longer distance paths in the network, we generated a full matrix *F*, whose element *F*_{ij} indicates the maximum amount of flow that can be sent through the network from node *i* to node *j*. In this model, the source node *i* is treated as an infinite source of flow and the weight of each edge from node *s* to node *t* is considered to be the capacity of that edge, or the amount of flow that can be sent directly from node *s* to node *t* [71]. Therefore the value *F*_{ij} is the maximum amount of flow that can be sent from node *i* to node *j* while respecting all capacities in the network. We can treat every pair of nodes {*i*, *j*} as the source and sink, respectively, and we can compute *F*_{ij} for all such pairs. The resulting matrix, *F*, termed “maximum flow”, has been shown to be of value in modeling information transfer in non-shortest paths [72] and has previously been used to model connectivity changes in patients with Alzheimer’s disease [73,74].
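Maximum flow between a pair of nodes can be computed with any standard augmenting-path method. The sketch below uses the Edmonds-Karp algorithm (breadth-first search for augmenting paths) on a symmetric capacity matrix; the choice of algorithm and the function name are assumptions, since the text does not state which solver was used.

```python
from collections import deque
import math

def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dense (symmetric) capacity matrix."""
    n = len(capacity)
    residual = [[float(c) for c in row] for row in capacity]  # work on a copy
    total = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return total                  # no augmenting path remains
        # Find the bottleneck capacity along the path, then push that flow.
        bottleneck, v = math.inf, sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

# Toy network with four nodes; the smallest cut separating 0 from 3 is 5.
cap = [[0, 3, 2, 0],
       [3, 0, 1, 2],
       [2, 1, 0, 3],
       [0, 2, 3, 0]]
flow_0_to_3 = max_flow(cap, 0, 3)
```

Because the capacities are symmetric, the flow from node 3 to node 0 is the same, which is why the resulting matrix is symmetric.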

## Acknowledgments

We thank Jennifer Stiso, Xiaosong He, and Richard Rosch for helpful comments on earlier versions of this manuscript. The work was supported by a grant to B.L. and D.S.B. from the National Institute of Neurological Disorders and Stroke (R01 NS099348). D.S.B., P.G.R., and R.F.B. also acknowledge support from the John D. and Catherine T. MacArthur Foundation, the Alfred P. Sloan Foundation, the ISI Foundation, the Paul Allen Foundation, the Army Research Laboratory (W911NF-10-2-0022), the Army Research Office (Bassett-W911NF-14-1-0679, Grafton-W911NF-16-1-0474, DCIST-W911NF-17-2-0181), the Office of Naval Research, the National Institute of Mental Health (2-R01-DC-009209-11, R01 MH112847, R01-MH107235, R21-M MH-106799), the National Institute of Child Health and Human Development (1R01HD086888-01), and the National Science Foundation (BCS-1441502, BCS-1430087, NSF PHY-1554488 and BCS-1631550). The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies.