Directed effective connectivity and synaptic weights of in vitro neuronal cultures revealed from high-density multielectrode array recordings

Studying connectivity of neuronal cultures can provide insights for understanding brain networks but it is challenging to reveal neuronal connectivity from measurements. We apply a novel method that uses a theoretical relation between the time-lagged cross-covariance and the equal-time cross-covariance to reveal directed effective connectivity and synaptic weights of cortical neuron cultures at different days in vitro from multielectrode array recordings. Using a stochastic leaky-integrate-and-fire model, we show that the simulated spiking activity of the reconstructed networks can well capture the measured network bursts. The neuronal networks are found to be highly nonrandom with an over-representation of bidirectionally connections as compared to a random network of the same connection probability, with the fraction of inhibitory nodes comparable to the measured fractions of inhibitory neurons in various cortical regions in monkey, and have small-world topology with basic network measures comparable to those of the nematode C. elegans chemical synaptic network. Our analyses further reveal that (i) the excitatory and inhibitory incoming degrees have bimodal distributions the excitatory and inhibitory incoming degrees have bimodal distributions, which are that distributions that have been indicated to be optimal against both random failures and attacks in undirected networks; (ii) the distribution of the physical length of excitatory incoming links has two peaks indicating that excitatory signal is transmitted at two spatial scales, one localized to nearest nodes and the other spatially extended to nodes millimeters away, and the shortest links are mostly excitatory towards excitatory nodes and have larger synaptic weights on average; (iii) the average incoming and outgoing synaptic strength is non-Gaussian with long tails and, in particular, the distribution of outgoing synaptic strength of excitatory nodes with excitatory incoming synaptic strength is lognormal, similar to the measured excitatory postsynaptic potential in rat cortex. Author summary To understand how the brain processes signal and carries out its function, it is useful to know the connectivity of the underlying neuronal circuits. For large-scale neuronal networks, it is difficult to measure connectivity directly using electron microscopy techniques and methods that can estimate connectivity from electrophysiological recordings are thus highly desirable. Existing methods focus mainly on estimating functional connectivity, which is defined by statistical dependencies between neuronal activities but the relevant direct casual interactions are captured by effective connectivity. Here we apply a novel covariance-relation based method to estimate the directed effective connectivity and synaptic weights of cortical neuron cultures from recordings of multielectrode array of over 4000 electrodes taken at different days in vitro. The neuronal networks are found to be nonrandom, small-world, excitation/inhibition balanced as measured in monkey cortex, and with feeder hubs. Our analyses further suggest some form of specialisation of nodes in receiving excitatory and inhibitory signals and the transmission of excitatory signals at two spatial scales, one localized to nearest nodes and the other spatially extended to nodes millimeters away, and reveal that the distributions of the average incoming and outgoing synaptic strength are skewed with long tails.


18
Unravelling synaptic connectivity and understanding the relationship between 19 connectivity and dynamics and between network structure and function of neuronal 20 circuits are actively pursued topics in neuroscience [1,2]. In vitro neuronal cultures 21 serve as simple yet useful experimental model systems for studying the relationship 22 between connectivity and the rich dynamics observed [3][4][5]. One common technique 23 used to record the activity of neurons in cultures is direct measurement of the electrical 24 signals of neurons using multielectrode array (MEA) [6]. It is thus of great interest to 25 estimate connectivity from MEA recordings. There is a distinction between functional 26 and effective connectivity [7]. Functional connectivity is defined by statistical 27 dependencies between the activities of neurons and is usually inferred based on the 28 cross-covariance or cross-correlation of spike trains with the spikes first being detected 29 from the recorded electrical signals of the electrodes of the MEA [8][9][10]. As statistical 30 dependencies can arise from indirect interactions, functional connectivity does not 31 necessarily reflect the direct causal interactions among the neurons. Direct causal 32 interaction is referred to as effective connectivity, and is more relevant for studying the 33 relationship between connectivity and dynamics and between network structure and 34 function. Effective connectivity is used to be inferred by using some specific model of 35 interactions and, as a result, it is sometimes said to be model-dependent. 36 Reconstructing functional connectivity from measurements is already highly challenging 37 and nontrivial especially for large-scale MEA with several thousands of electrodes. 38 Existing methods thus focus predominantly on estimating functional connectivity. 39 For a class of networked systems having generic stationary dynamics modelled by a 40 set of stochastic differential equations with directional interactions, it has been shown 41 that the equal-time cross-covariance of the dynamics of the nodes does not carry 42 sufficient information to recover the effective connectivity [11,12], and the effective 43 connectivity matrix can instead be extracted using the theoretical relation between the 44 time-lagged cross-covariance and the equal-time cross-covariance [13]. Similar result has 45 been found for systems with discrete-time dynamics [14]. This covariance-relation based 46 method thus infers effective connectivity for a generic class of models and not only for a 47 specific model. It has been validated by numerical simulations for weighted directed 48 random and scale-free networks with different nonlinear dynamics and directional 49 coupling including particularly the FitzHugh-Nagamo dynamics [15] with directional 50 synaptic-like coupling [13,16]. 51 In this paper, we show how we adopt this covariance-relation based method to cross-correlation based methods that estimate functional connectivity, inhibitory links 59 pose additional challenges [10] and bidirectional connections are difficult to detect. As 60 the validity of the covariance-relation based method has already been shown using 61 numerical data, we focused to test how well the estimated directed effective connectivity 62 can reproduce the measured spiking activity using a stochastic leaky-integrate-and-fire 63 model. We carried out analyses of the reconstructed networks studying the basic 64 network features and the distributions of incoming and outgoing degrees as well as the 65 average incoming and outgoing synaptic strength.  A significant fraction of nodes participated in the network bursts but some nodes did 107 not. We denoted nodes that have no measured spiking activity in T obs as null-nodes and 108 checked how well the simulated results managed to retrieve these null-nodes. As a 109 control, we randomly shuffled the reconstructed network with the distributions of the 110 incoming and outgoing degrees kept intact and assigned the synaptic weights randomly 111 from a Gaussian distribution of zero mean and unit variance and repeated the precision, which measures the fraction of nodes simulated to be null-nodes are indeed 116 measured null-nodes. Denote the number of null-nodes that have null or nonzero 117 simulated spiking activity by N T P or N F N and the number of non null-nodes with null 118 or nonzero simulated spiking activity by N F P or N T N . The sensitivity (SEN) and 119 precision, which is also known as the positive predictive value (PPV) are defined by The results obtained are presented in Table 1. Both SEN and PPV are much higher for 121 the reconstructed networks than for the control randomly-shuffled networks. Thus the 122 reconstructed networks outperform the control randomly reshuffled networks in 123 retrieving the null nodes.  Table 1. The sensitivity and precision or positive predictive value of the reconstructed and control randomly-shuffled networks in retrieving null-nodes. f is the fraction of the N T N nodes that have correlation coefficient ρ ≥ 0.8.
We further studied the correlation of the measured and simulated spiking activity at 125 tens of millisecond timescale. We calculated the Pearson correlation coefficient ρ 126 between the measured and simulated spiking activity using a sliding window of 127 w = 28.4 ms for each of the N T N nodes with non-zero measured and simulated spiking 128 activity. The distributions of the correlation coefficient P (ρ) are shown in Fig 2. The 129 fraction f of these N T N nodes with ρ ≥ 0.8 is significantly larger for the simulated 130 results using the reconstructed networks (see Table 1) than those using the control 131 randomly shuffled networks.  Table 2). The connection probability p of a network of N nodes with N L links is defined 139 by p = N L /[N (N − 1)]. We found that p ranges from 1.1-1.9%, confirming that the 140 neuronal networks are sparse. This value of p is comparable to that of the chemical 141 synapse network of C. elegans [20]. Most of the connections are unidirectional as 142 expected. For a random network of the same number of nodes and the same connection 143 probability p, the expected number of bidirectionally connected pairs is given by 144 N (N − 1)p 2 /2. The number of bidirectionally connected pairs is indeed comparable to 145 this expected number for the control randomly shuffled networks but exceeds four times 146 this expected number for the reconstructed networks. This observation of an 147 over-representation of bidirectional connections as compared to a random network in 148 the reconstructed neuronal networks echoes previous reports for local regions of the rat 149 cortex [21][22][23][24] Thus neuronal networks are highly nonrandom. As a comparison, we 150 calculated the number of bidirectional connections in the functional connectivity 151 estimated by using the cross-covariance based methods [10] Table 2. Basic network measures of the reconstructed networks for the 8 DIVs including the connection probability p, the ratio r B of the number of bidirectionally connected pairs to the expected number N (N − 1)p 2 /2 for a random network with p, the fractions f E and f I of excitatory and inhibitory nodes, the fraction f SCC of nodes that form the strongly connected component, the characteristic path length l, the average clustering coefficient CC and the small world index SWI. When available, the corresponding results for the chemical synapse network of C. elegans [20] are included for comparison.
Each node in the neuronal network is inferred to be excitatory or inhibitory  Table 2, the fraction f E of excitatory nodes first increases as DIV increases and 158 peaks at DIV25 then generally decreases as DIV further increases while the fraction f I 159 of inhibitory nodes first decreases and reaches a minimum at DIV22 and then generally 160 increases as DIV further increases. Moreover, f I ranges from 0.13 to 0.31, which is 161 comparable to the measured fractions (0.15-0.30) of inhibitory neurons in various 162 cortical regions in monkey [25]. The fraction f SCC of nodes that form the largest Distributions of incoming and outgoing degrees 171 We calculated both the incoming degree k in and the outgoing degree k out for each node, 172 and k in = k out as the neuronal networks are directed. The distributions of the incoming 173 and outgoing degrees are qualitatively the same for all the 8 DIVs and the results for 174 DIV25 are shown in Fig 4. A striking feature of the distribution of the incoming degree 175 is its approximate bimodal feature, which is in contrast to the degree distributions of 176 the chemical synapse network of C. elegans [20] and the scale-free distribution found in 177 the functional connectivity [10]. For undirected networks, there have been studies 178 indicating that the robustness against both random failures and attacks can be 179 optimized by having a bimodal degree distribution [26,27].

180
For DIV25, the outgoing degree of most of the nodes is below 30, which is smaller 181 than the average incoming degree, and a small fraction of nodes have exceptionally large 182 k out (k out > 1000). This is the case for both excitatory and inhibitory nodes. Thus the 183 hubs of the neuronal networks are excitatory and inhibitory feeders. The spatial 184 location of these feeder hubs for DIV25 and DIV33 is shown in Fig 5. Interestingly, the 185 feeder hubs locate mostly on the top and bottom rows of the MEA probe.

186
To investigate further the approximate bimodal feature of P (k in ), we calculated and a larger mode of smaller k + in or smaller k − in (blue). In the insets, we show the conditional distributions of P (k + in |k − in ) and P (k − in |k + in ) for nodes in the two modes.
For excitatory nodes s out > 0 whereas for inhibitory nodes s out < 0. For nodes with no 212 detectable outgoing links, s out = 0. For each node, regardless of whether it is excitatory, 213 inhibitory or has no detectable links, s in can take either sign. Synaptic weights are 214 difficult to be inferred from MEA recordings and results on the distributions of s in and 215 s out are generally lacking in previous studies.

216
As shown in Fig 9, the distribution of s out is different for excitatory and inhibitory 217 nodes, and for excitatory nodes, the distribution is further different for nodes with 218 positive and negative average incoming synaptic strength s in . For excitatory nodes with 219 s in > 0, s out has an approximately lognormal distribution. This result is in accord with 220 the general findings of lognormal distribution of synaptic weights in both in vitro and in 221 vivo studies [28]. For excitatory nodes with s in < 0 and for inhibitory nodes, the better approximated by slightly asymmetric log-exponential distributions.

224
The standardized distributions of |s in | have a universal shape for all DIVs, 225 regardless of whether s in is greater or less than zero (see Fig 10). These distributions   correctly 99.8% of the non-existent links and 83.0% of the links in the subnetwork. We 260 compare the spatial distribution of the excitatory, inhibitory nodes and nodes with no 261 detectable links in the subnetwork (which is the same as the central region of the plots 262 shown in Fig 3) and the partial network in Fig 12. There are differences but a great 263 resemblance can be seen. We further compare the strongest excitatory and inhibitory links of the partial 265 network and the subnetwork using the same criterion defined for strongest links in theoretically that for a generic class of models and not just for a specific model, the 287 effective connectivity matrix is contained in a theoretical relation between the 288 time-lagged cross-covariance and the equal-time cross-covariance [13]. Making use of 289 this relation, a method has been developed to recover effective connectivity [13] and this 290 method was validated for models with different dynamics and coupling using numerical 291 data [13,16]. This covariance-relation based method involves a calculation of a principal 292 matrix logarithm, which is very sensitive to noise in the data. When we applied it 293 directly to the MEA recordings, we obtained an unphysical complex matrix for the 294 supposed effective connectivity matrix . A complex matrix was also found when this 295 covariance relation was used directly to estimate directed effective connectivity of a 296 cortical network of 68 regions from fMRI recordings and this motivated the development 297 of a Lyapunov optimization procedure [11]. Here, we showed that the unphysical 298 complex matrix problem can be avoided by first applying a moving average filter to the 299 MEA recordings to reduce random noise (c.f. Materials and methods for the method of 300 reconstruction). Moreover, we extended the originally proposed clustering analysis using 301 a Gaussian mixture model [13] to include also a single-Gaussian fit with outliers. We  [25]. In this sense, the reconstructed neuronal networks are 327 excitation/inhibition balanced as found in monkey cortex.

328
Neuronal networks are expected to be organized and should thus be highly 329 nonrandom in order to transmit signals efficiently and to carry out their functions 330 effectively. Similar to the chemical synapse network of C. elegans [20], the reconstructed 331 neuronal networks have small-world topology, and it has been shown that small-word 332 network allows efficient transmission of information [35]. The nonrandom feature of 333 local cortical circuits has been well documented by an over-representation of 334 bidirectional connected pairs as compared to a random network of the same connection 335 probability [21][22][23][24]. In the reconstructed neuronal networks, the number of can calculate the distributions of incoming and outgoing synaptic strength of the nodes. 374 We found that these distributions are described by lognormal or log-exponential 375 functional form (Fig 9 and 10) and they are in line with the ubiquitous lognormal-like 376 distributions of many physiological and anatomical features at different levels in the 377 brain [28]. The significance of these skewed non-Gaussian distributions with long tails is 378 that a small fraction of nodes have dominantly strong synaptic strength suggesting the 379 possibility that the bulk of the information flow occurring mostly through these 380 nodes [28]. Understanding how this lognormal-like synaptic strength might be related to 381 the spiking dynamics or the general relation between connectivity and dynamics will be 382 a topic of great interest for future studies. Neuronal activities from HD MEA were recorded with the recording device (BioCAM, 405 3Brain AG) and the associate software (BrainWave 2.0, 3Brain AG) at 7.06 kHz. One 406 electrode was used for calibration purpose so there were 4095 electrodes that recorded 407 signals. Spontaneous activities were recorded from 6 to over 60 days in vitro. Samples 408 were placed into the recording device 10 min before the recording in order to prevent  412 We assumed that the activity x i (t), recorded by each of the electrodes, is stationary and 413 considered a generic dynamical model in which the time evolution of the fluctuations 414 around the average value, δx i (t) = x i (t) − x i (t) , is governed by a system of linear 415 stochastic differential equations

Method of reconstruction
where . . . denotes a time average, Q ij = w ij (1 − δ ij ) + Q ii δ ij are the elements of the 417 interaction matrix Q and η i is a Gaussian white noise of zero mean with When the activity x j affects the activity x i , there is a link from node j to node i with 419 synaptic weight w ij otherwise w ij = 0. Thus w ij give the directed and weighted 420 effective connectivity of the neuronal network. We note that such a linear stochastic 421 system of stationary fluctuating activity can arise even for systems whose activity obey 422 nonlinear dynamics [13]. For such a model, the time-lagged covariance matrix K(τ ) and 423 the equal-time covariance matrix K(0) [11,13,40,41] of the measurements are related by 424 the interaction matrix Q: February 3, 2020 16/22 The elements of K(τ ) and K(0) are defined by respectively. This relation implies that the off-diagonal elements of 427 M = log(K(τ )K(0) −1 )/τ , where log is the principal matrix logarithm, would separate 428 into two groups, one corresponds to w ij = 0 and another to w ij = 0. It has been 429 shown [13] that by clustering analysis of M ij using a Gaussian mixture model, w ij can 430 be inferred. We extended this proposed clustering analysis to include also a 431 single-Gaussian fit with outliers.

432
For the reconstruction, we only made use of measurements y i (t), i = 1, 2, . . . , 4095 433 taken during which all the 4095 electrodes were recording properly. The principal 434 matrix logarithm is very sensitive to noise in the measurements and a calculation of 435 K y (τ ) and K y (0) directly from these MEA recordings y i (t) [as defined in Eqs (6) and 436 (7) with x i replaced by y i ] would lead to the matrix log(K y (τ )K y (0) −1 ) having complex 437 elements. This has also been found for fMRI measurements [11]. We avoided this 438 problem by applying a moving average filter to the MEA recordings to reduce random 439 noise. The moving average filter is a simple digital low-pass filter that is optimum for 440 reducing random noise while retaining sharp transitions in the data [43]. Specifically, we 441 took x i (t) = [y i (t) + y i (t + ∆)]/2, where ∆ is the sampling time interval for all the valid 442 MEA recordings y i (t), i = 1, 2, . . . , 4095, and calculated K(τ ) and K(0) from x i (t) with 443 τ = ∆. The resulting matrix M was found to be real. Then we adopted the clustering 444 method of Ref. [13] with several modifications to cluster M ij 's for each fixed j into two 445 groups, one corresponds to w ij = 0 and the other to w ij = 0. 446 We assumed that the outgoing links of each node, if exist, can only be all excitatory 447 with w ij > 0 or all inhibitory with w ij < 0. This is in accord to neurons are either 448 excitatory or inhibitory. We first fitted the values of M ij 's with fixed j by a Gaussian 449 mixture model of two components, namely by a sum of two Gaussian distributions, 450 using MATLAB 'fitgmdist', which return the means, µ 1 and µ 2 , and standard 451 deviations, σ 1 and σ 2 , of the two Gaussian distributions, and the relative proportion of 452 the two components. We divided the results into three cases according to the separation 453 between the two fitted Gaussian distributions: (i) the two distributions are well 454 separated with |µ 1 − µ 2 | > σ 1 + σ 2 , (ii) the two distributions are very close to one 455 another with |µ 1 − µ 2 | < max(σ 1 , σ 2 ), and (iii) the in-between case with 456 max(σ 1 , σ 2 ) ≤ |µ 1 − µ 2 | ≤ σ 1 + σ 2 . For case (i), we continued with this two-Gaussian 457 fit. For case (ii), we fitted the data again by a single Gaussian distribution instead. For 458 case (iii), we chose either the two-Gaussian fit or the single-Gaussian fit according to 459 the Bayesian information criterion (BIC) for fitting models selection [42,44].

460
When using the two-Gaussian fit, we thus inferred the unconnected component for 461 w ij = 0 as the component of a relative proportion greater than 0.6 based on the 462 assumption that the neuronal cultures are sparse networks or as the component whose 463 mean is closer to zero when none of the relative proportions exceeds 0.6. We obtained 464 the posterior probability p of each data point belonging to the unconnected component 465 using MATLAB 'cluster'. For each M ij , if p > 0.5 we inferred w ij = 0 otherwise, we 466 inferred w ij to be given by M ij − M ij |w ij = 0 i and M ij |w ij = 0 i is the average of 467 M ij with inferred w ij = 0 over i for fixed j. Thus the synaptic weights w ij of the 468 outgoing links of each node j have the same sign. Depending on whether w ij is greater 469 than or less than zero, the node was inferred to be excitatory or inhibitory (c.f. Fig 15). 470 When using the single-Gaussian fit, we identified the data points that are well fitted 471 by the Gaussian, denoted by P G of mean µ, with w ij = 0 and the outliers, which are The dynamics of a stochastic leaky-integrate-and-fire (LIF) neuron is governed by the 486 following equation where V i (t) is the membrane potential of neuron i, V r is the resting potential, C m and 488 G m are the membrane capacitance and membrane conductance, τ m = C m /G m is the 489 decay constant, and I syn i (t) is the synaptic current input from the presynaptic nodes. When V i (t) exceeds a threshold V th , node i generates an action potential or spike at 491 that instant and V i is reset to V r . The synaptic current I syn i (t) is given by February 3, 2020 18/22 where V E and V I are the reversal potentials satisfying V I ≤ V r < V th ≤ V E , and G E (t) 493 and G I (t) are the excitatory and inhibitory synaptic conductances given by where G is a typical synaptic conductance, w ij are the synaptic weights, τ s is the decay 495 constant of the synaptic conductances, Θ(t) is the Heaviside step function and t j,k is the 496 spiking time of the k-th spike of the presynaptic neuron j.

497
To simplify the notations, we worked with the dimensionless membrane potential, 498 which is defined by with 0 ≤ v i ≤ 1 then the equations become where β = G/C m , and η i is a Gaussian white noise of zero mean defined by Eq (4) with D ij = σ 2 . When 502 v i ≥ 1, v i is reset to 0.

503
In the numerical simulations, we used τ m = 20 ms, τ s = 2 ms, v E = 6 and v I = −2 504 (which follow from V r = −60 mV, V th = −50 mV, V E = 0 mV and V I = −80 mV) [45] 505 and σ = 2. We took β = 1000 and β = 40 respectively for the reconstructed networks 506 and the control randomly-shuffled networks with standardized Gaussian synaptic 507 weights. The time step ∆t of the simulations was taken to be the sampling time interval 508 ∆t=1/7060s ≈ 0.142 ms and we obtained the spiking activity of each node i using the 509 measured spikes of all its presynaptic nodes (with w ij = 0) as inputs.