Scalable and accurate automated method for neuronal ensemble detection in spiking neural networks

We propose a novel, scalable, and accurate automated method for detecting neuronal ensembles from a population of spiking neurons. Our approach offers a simple yet powerful tool to study ensemble activity. It allows the participation of neurons in different ensembles, has few parameters to tune and is computationally efficient. We used spike trains of retinal ganglion cells obtained from multi-electrode array recordings under a simple ON-OFF light stimulus to test our method. We found a consistent stimuli-evoked ensemble activity intermingled with spontaneously active ensembles and irregular activity. Our results suggest that the early visual system activity is already organized in clearly distinguishable functional ensembles. To validate the performance and generality of our method, we generated synthetic data, where we found that our method accurately detects neuronal ensembles for a wide range of simulation parameters. Additionally, we found that our method outperforms current alternative methodologies. Finally, we provide a Graphic User Interface, which aims to facilitate our method’s use by the scientific community. Author summary Neuronal ensembles are strongly interconnected groups of neurons that tend to fire together (Hebb 1949). However, even when this concept was proposed more than 70 years ago, only recent advances in multi-electrode arrays and calcium imaging, statistical methods, and computing power have made it possible to record and analyze multiple neurons’ activities spiking simultaneously, providing a unique opportunity to study how groups of neurons form ensembles spontaneously and under different stimuli scenarios. Using our method, we found that retinal ganglion cells show a consistent stimuli-evoked ensemble activity, and, when validated with synthetic data, the method shows good performance by detecting the number of ensembles, the activation times, and the core-cells for a wide range of firing rates and number of ensembles accurately.

1 Introduction 1 Donald Hebb predicted more than 70 years ago that ensembles would naturally arise 2 from synaptic learning rules, where neurons that fire together would wire together [1]. 3 However, despite the long history of this idea, only recently the simultaneous recordings 4 and computational analysis from hundreds of cells have turned out to be possible [2]. 5 Recent advances in recording technology of neuronal activity combined with 6 sophisticated methods of data analysis have revealed significant synchronous activity 7 between neurons at several spatial and temporal scales [3][4][5]. Theses groups of neurons 8 that have the tendency to fire together known as neuronal ensembles (also called cell 9 assemblies) are hypothesized to be a fundamental unit of neural processes and form the 10 basis of coherent collective brain dynamics [1,[6][7][8]. 11 The idea that the coding units of information are groups of neurons firing together 12 (not single neurons) represented a paradigm shift in the field of computational 13 neuroscience [6]. Neuronal ensembles have been proposed as a fundamental building 14 block of the whole-brain dynamics, and relevant to cognitive functions, in particular, as 15 ensemble activity could implement brain-wide functional integration and segregation [3]. 16 Large-scale neuronal recordings techniques such as multi-electrode arrays (MEA) or 17 calcium imaging, allow for the recording of the activity of hundreds and even thousands 18 of neurons simultaneously [5,[9][10][11][12]. These recent technological advances provide a fertile 19 ground for analyzing neuronal ensembles and investigating how collective neuronal 20 activity is generated in the brain. Recent studies using multi-neuronal recording 21 techniques have revealed that a hallmark of population activity is the organization of 22 neurons into ensembles, generating new insights and ideas about the neural 23 code [9,[13][14][15]. In particular, the activation of specific ensembles has been shown to 24 correlate with spontaneous and stimuli evoked brain function [16]. The brain-wide 25 alterations present in neurological and mental impairments disrupt population activity 26 patterns and therefore affect the neuronal ensembles. Indeed, neuronal ensembles are 27 susceptible to epileptic seizures and schizophrenia as shown in in vivo two-photon 28 calcium imaging data in mouse [17,18], in medically-induced loss of consciousness in 29 mice and human subjects [17] and in a mice model of autism [19]. 30 However, identifying and extracting features of ensembles from high-dimensional 31 spiking data is challenging. Neuronal ensembles have different sizes and have different 32 activity rates. Some neurons may not participate in ensemble activity, while others may 33 participate in many, and not all neurons within an ensemble fire when the ensemble is 34 active. Ensembles can exhibit temporal extension, overlap, or display a hierarchical 35 organization, making it difficult to distinguish between them [2]. 36 Sophisticated statistical and computational tools are needed to extract relevant 37 features of neuronal activity. Detecting neuronal ensembles can be posed as a clustering 38 problem, where the binary population patterns (one-time bin of the spike train) 39 generated by the neural population are the variables to cluster. Given the probabilistic 40 nature of spiking activity, the challenge is to distinguish between core and non-core cells, 41 i.e., cells that belong to the ensemble and cells that do not, respectively. 42 There exists a wide variety of techniques and ideas that have been used to detect 43 and interpret neuronal ensembles (see Refs. [13,20,21] for reviews), for example, 44 previous works have applied different methodologies such as principal component 45 analysis [22][23][24][25], correlation between neurons [25][26][27], correlation between population 46 spiking patterns [16], statistical evaluation of patterns [28][29][30][31] and non-negative matrix 47 factorization [24,32]. Among them, only the one based on the correlation between 48 population patterns, and the one based on non-negative matrix factorization, fit with a 49 definition of ensembles. Under this definition, neurons may participate in many 50 ensembles, and ensembles have a specific duration in time. However, these methods are 51 computationally expensive, and require tuning several parameters, which hinder their application by the scientific community. 53 We exemplify our method's advantages on spiking neuronal data recorded using 54 MEA from mouse in vitro retinal patch, from which the spiking activity of hundreds of 55 retinal ganglion cells (RGCs) is obtained during a simple ON-OFF stimuli, i.e. 56 consequtive changes between light and darkness.

57
In brief, the vertebrate retina is part of the central nervous system composed of 58 thousand of neurons of several types [33][34][35], organized in a stratified way with nuclear 59 and plexiform layers [36]. This neural network has the capability to process several 60 features of the visual scene [37], whose result is conveyed to the brain through the optic 61 nerve, a neural tract composed mainly by the axons of the RGCs. In fact, the 62 physiological mechanisms involved in many of those processes are starting to become 63 clear with the development of new experimental and computational methods [35,36,38]. 64 One of the most remarkable, and simple, example of retinal processing is the ON-OFF 65 responses of RGCs, a stereotypical increase or decrease in firing rate when confronted to 66 changes light intensity. In this case, the connectivity between RGCs and Bipolar cells 67 (Bc) in the Inner Plexiform Layer (IPL) plays a major role, determining the tendency of 68 RGC to preferentially fire when the light increased, decreased, or both [36]; this 69 property is often called polarity [39], and represents the broadest functional 70 classification of RGCs into ON, OFF, and ON-OFF cell types. 71 We hypothesize that retinal ensembles could also exhibit this property as a whole.

72
Our analysis revealed the existence of diverse ON and OFF retinal ensembles with a 73 specific stimulus preference as functional units, which suggests that a stimulus tuning 74 preference is a property of the ensembles as a whole, and not a simple inheritance from 75 their corresponding core-cells. Besides, we validated our method on synthetic data 76 where ensembles were artificially generated, showing a remarkable detection 77 performance for the ensemble number, the ensemble activation sequence, and the 78 core-cells detection over a wide range of parameters.

79
Thus, we tested and validated our method using biological and synthetic data, 80 showing its accuracy and broad applicability to different scenarios. To facilitate our 81 method's use by the community, we provide a Graphic User Interface and the codes that 82 implement our algorithm that aim to provide a fast, scalable, and accurate solution to 83 the problem of detecting neuronal ensembles in multi-unit recordings.  These different evoked responses led us to expect at least four types of ensembles: 98 two ensembles related to transitions (one for the ON-OFF and one for the OFF-ON) 99 and two ensembles related to the decaying-stable activity after the ON-OFF and 100 OFF-ON transition, respectively. To test this hypothesis, we applied our ensemble 101 detection algorithm (see Methods for details and parameters) on the spiking activity of 102 319 RGCs during 120 seconds of MEA recording. 103 We found ten ensembles comprising ∼ 68% of the spike patterns in the analyzed 104 recording. Their activity was highly locked to the stimulus (Fig. 1A, D, middle and 105 bottom panel). We found two transiently active ensembles (one for the ON-OFF and 106 one for the OFF-ON transition), whose activity was only evoked by the stimulus 107 transition, showing no activity either before the stimulus start (black arrow in Fig. 1A) 108 or during the decaying or stable response. The other eight ensembles were active before 109 stimulus presentation, but at a lower rate, and during the decaying or stable response. 110 Notably, four of them (Ens. 7-10) are preferentially active in the OFF stimulus, and the 111 other four during the ON stimulus (Ens. [2][3][4][5]. 112 These results show that the detected retinal ensembles are preferentially tuned to 113 the features of the stimulus, showing that without any stimulus-related information, our 114 method can obtain ensembles whose activity is functionally coupled with the stimulus. 115 Functional properties of RGC ensembles 116 We detect the ensembles and their core-cells i.e. cells whose correlation with a given 117 ensemble is statistically significant(see Methods for details). On average, each core-cell 118 participated in 2.7 ± 1.3 ensembles, and only four RGCs were considered non-core, 119 considering all the ensembles. Three cells participated in six ensembles, indicating that 120 some cells may participate in both ensemble classes.

121
The two transiently active ensembles, namely Ens. 1 (ON) and Ens. 6 (OFF), have 122 257 and 222 core-cells, respectively, while the rest of the ensembles have, on average, 123 47.2 ± 17.9 core-cells (Fig. 1E). This result is consistent with the increased population 124 activity evoked by stimulus transition, where many cells are firing, and decaying 125 responses where many cells become silent.   Since ON-OFF cells are present in all ensembles, the core-cells of each ensemble, as a 142 group, have a preference for both light transitions, despite the precise stimulus tuning of 143 the ensembles (Fig. 1D).

144
The PSTH for each ensemble was also computed to provide the detailed tuning of 145 each ensemble and compare the core-cells tuning preferences and the ensemble tuning 146 preference. 147 We conclude that the ensemble tuning preference cannot be completely derived from 148 their core-cells' tuning preference, providing evidence in favor of neuronal ensembles as 149 whole functional units. The colored matrix shows the RGCs in rows and the ensembles in the columns. Each column is colored in the entries corresponding to their core-cells, according to which ensemble they belong (A).

Method overview 151
Before a systematic evaluation of the performance of our method, we summarize its core 152 elements using the example shown in Fig. 2, and refer the reader to the Methods section 153 for further details.

154
First, we discard any population pattern of the spike trains ( Fig. 2A) with less than 155 three spikes, and then perform a principal component analysis (PCA) using the 156 population patterns as observations (Fig. 2B). Then, we computed the Euclidean The inner-cluster mean correlation is compared against a threshold of non-core cell correlation, discarding any cluster below this threshold (cluster 1, in this case). This final filtering yields the set of ensembles. (I) Each spike pattern shown in A is colored according to the cluster to which they belong. Black patterns were discarded by the minimum number of spikes or the threshold rejection criteria. (J) The detected ensemble sequence is represented as an integer sequence, where the color corresponds to the spike pattern color in I. Note that the rejection criteria discarded cluster 1, so only ensembles 2-5 are shown, and black circles denote non-clustered spike patterns.
distance between all the patterns projected on a small subset of the first principal 158 components (from three to six), generating a distance matrix, from which the local 159 density, ρ, and the distance to the next denser point, δ, is computed for each population 160 pattern. Based on these two measures, and a power-law fit to the ρ vs. δ curve, we 161 automatically detect the cluster centroids (Fig. 2C), and assign the rest of the patterns 162 to their closest centroid, building up the clusters (Fig. 2D). Since the core of our 163 algorithm is the density-based clustering procedure, we call it Density-based.

164
To find the core-cells, we computed corr(n, e), the correlation between the spike 165 train of neuron n and the activation sequence of cluster e (Fig. 2E), and test its 166 significance using a threshold associated with a null hypothesis obtained from shuffled 167 versions of data (Fig. 2F). If corr(n, e) is above the threshold, neuron n is considered a 168 core-cell of cluster e (Fig. 2G).

169
Finally, to obtain the ensembles, the pairwise correlation between all the core-cells of 170 a given cluster are computed and averaged, representing the within-cluster average 171 correlation.

172
To define a cluster be an ensemble, we compared the within-cluster average 173 correlation to the average pairwise correlation of the whole network (Fig. 2H). If the 174 within-cluster if significantly higher (based on a threshold), we considered an ensemble 175 to be present; otherwise, the cluster is discarded due to the lack of internal correlation. 176 With this procedure, we were able to split the population patterns into an ensemble 177 and non-ensemble patterns (Fig. 2I), and to obtain the activation sequence of different 178 ensembles in time (Fig. 2J) with their corresponding core-cells.

179
Detection of ensembles on synthetic data 180 To test our method, we designed an algorithm to generate synthetic data where 181 neuronal ensembles' activity can be parametrically controlled (see Methods for details). 182 We assessed the detection performance of our method concerning known ground truth 183 (GT). We illustrate our results with a simple example shown in Fig. 3, where different 184 spike trains were generated using a network of N = 100 neurons and T = 5000 bins (the 185 figure shows just 200 bins to improve the visualization). We created seven ensembles 186 defining the temporal sequence of the ensembles, whose global temporal activity 187 comprised 80% of the sample (Fig. 3E), and the core-cells, whose participation 188 comprised from 20 to 40 neurons (Fig. 3I). Then, we randomly added or removed spikes 189 to each neuron in order to satisfy a given firing probability for each neuron, P (n), which 190 controls the spike trains density (Fig. 3B, C, D). 191 We found two ensembles more than expected (seven) for the low-density spike-trains 192 (Fig. 3F, red asterisks), while for the medium and high-density ones we found the 193 expected number of ensembles (Fig. 3G, H, respectively). 194 Then, we compared the ensemble sequence of the GT (Fig. 3E) and the detected 195 ensemble sequence in each density scenario (Fig. 3F, G, H), finding almost perfect 196 agreement between both, with the exception of a few false positives in the case of low 197 and high density. Finally, we compared the detected core-cells with the GT (Fig. 3J, K, 198 L), finding good agreement between both. Despite the over-detection in the low-density 199 regime (red asterisk), the other ensembles were in good agreement with the GT size, as expected, but the SVD-based method scaled exponentially, while the 220 Density-based method is two orders of magnitude faster for T = 10 4 (Fig. 4A).

221
Regarding the ensemble activity, our method accurately detects the number of 222 ensembles for samples as small as T = 1000, while the SVD-based method converges to 223 underestimation of the ensemble number (Fig. 4B). Once the detected ensembles were 224 matched to their closest GT ensemble (see Methods for details), we measured the 225 correlation between both sequences, finding that our method detects the GT with 226 excellent performance over the whole range of sample sizes. The SVD-based method, in 227 turn, systematically fails to detect the global sequence (Fig. 4C). 228 Furthermore, we measured the average correlation between the detected and GT 229 individual sequences, and between the detected and GT core-cells, finding again that 230 our method achieved excellent performance for both features even for small sample sizes 231 (Fig. 4D, E, respectively).

232
Finally, to evaluate the performance for different network sizes, we fixed the sample 233 size, while also keeping the other parameters fixed (number of ensembles, of core-cells 234 and P E ), and varied the network size from N = 50 to N = 1000, finding that our 235 method slightly overestimated the ensemble number for small N , but yielded accurate 236 results for the larger N (Fig. 4F, G). 237 We conclude that our method reliably performs on a wide range of sample and 238 network sizes, giving a more scalable and accurate solution to the ensemble detection 239 problem than the alternative SVD-based method. Here, we show the performance of our method when the number of ensembles and the 243 number of core-cells vary independently. 244 We generated synthetic spike-trains with fixed network size (N = 300), sample size 245 (T = 5000), ensemble probability (P E = 0.8), and medium density while the number of 246 ensembles and core-cells varied, as shown on Fig. 5. We found a wide combination of 247 these parameters where the method detects the number of ensembles with a small 248 relative error (Fig. 5A), accurately detects the global ensemble sequence (Fig. 5B), and 249 the corresponding core-cells (Fig. 5C). Further explorations of other parameters and 250 combinations of the same are considered work to be developed. To this end, we provide 251 the computational codes and a GUI at 252 https://github.com/brincolab/NeuralEnsembles. This GUI allows one to perform 253 all the analyses presented here on multi-variate recordings of single events (e.g., spiking 254 data, calcium events, arrival times in a sensor).  (S 1,t , . . . , S N,t ). 262 We generated a set of ensembles E characterized by their core-cells and an activation 263 sequence of ensembles. Each ensemble was composed of a fixed number of core cells  Synthetic data generated using medium density, N = 300, and T = 5000. Heat maps represent the average over 100 repetitions using the same spike-train parameters. Red/blue represents over/under detection respectively. (B) The correlation between the detected and the GT ensemble sequence. (C) Average correlation between the detected and the GT core-cells.
core-cells among ensembles. We generated for each ensemble, a column binary vector c e 266 of dimension N , with c e (n) = 1 if neuron n is a core-cell of ensemble e and 0 otherwise. 267 The activation sequence of each ensemble e ∈ E follows a time homogeneous 268 Bernoulli process with parameter p e , denoted B(p e ). We generate a row binary vector 269 denoted by a e of dimension T with a e (t) = 1 if ensemble e is active on time bin t and 0 270 otherwise. We allowed at most one active ensemble per bin. To implement this idea, we 271 first drew the first ensemble's activation times at random and then removed these times 272 from the list of possible time points available for the second ensemble. We proceeded 273 this way until reaching P E .

274
With the core-cells and the activation sequence of each ensemble defined, a spike train was generated by the product c e a e (matrix) of dimension N × T .

279
We randomly added/removed spikes to/from each neuron's spike train until matched 280 the target firing rate P (n). In removing spikes, the spike pattern located on the time 281 bins where a given ensemble was active, end up having less active neurons than defined; 282 on the other hand, if we added spikes, the opposite effect occurred. We used three 283 different spike train densities: low (s.d. = 0.05), medium (s.d. = 0.1) and high 284 (s.d. = 0.2). For the low-density case, matching the target firing rates usually required 285 the removal of spikes. This method corrupted the patterns related to ensemble activity 286 by underrepresenting their core-cells. In the medium case, we had a balance between 287 adding and removing spikes, and, in the last case, most neurons required the addition of 288 spikes to match the desired firing rate, making neurons participate in ensembles they 289 did not belong by construction.

290
In summary, we generated a spike train from the following procedure:     To detect neuronal ensembles from data, we used Principal Component Analysis (PCA) 304 over a subset of population patterns. PCA extracts a set of orthogonal directions 305 capturing the most significant variability observed on the spike patterns. We discarded 306 the spike patterns with less than three spikes and projected the spike patterns on the 307 first six principal components (PCs). Using four to seven principal components yielded 308 no difference in the clustering procedure. However, the optimal number of PCs may 309 vary depending on the data, so it is considered a free parameter in the GUI. We 310 computed the Euclidean distance between all binary patterns projected on the first six 311 principal components to guide close patterns' clustering. to find the centroids of the clusters follows [40], which is a modified version of the 322 method developed in [41]. To automatically detect the cluster centroids, we fit a 323 power-law to the δ vs. ρ curve, using the 99.9% upper confidence boundary as a 324 threshold. We considered centroids to be all points falling outside this boundary. The 325 rest of the points were assigned to their closest centroids. Thus, each binary pattern 326 with more than three spikes was assigned to a cluster, obtaining, in this way, their 327 corresponding activation times. To distinguish between a core and a non-core cell, we set a threshold obtained from 336 a null hypothesis built from shuffled versions of the (n, e) pair. For each cluster, e ∈ E, 337 the number of times in which it was active is kept fixed, but the temporal sequence was 338 randomly shuffled, obtaining a new temporal sequence of activation that we denoted e r . 339 We repeated this procedure 5000 times for each cluster, obtaining a distribution of 340 corr(n, e r ) for neuron n in cluster e, which represented the null hypothesis distribution 341 associated with the correlation between a neuron and a cluster given by chance. A 342 significance threshold was then defined as the 99.9-th percentile of this distribution; 343 therefore, we set p < 0.001 as threshold θ e . Then, if corr(n, e) > θ e , we considered that 344 n was a core-cell of cluster e, as their correlation was above the significance level.

345
Ensemble selection criteria 346 Following [42], we used two criteria: i) minimum cluster size and ii) within-cluster 347 average correlation. The first criterion considered as candidates for neuronal ensembles 348 clusters whose number of core-cells was above three. The second criterion compared the 349 average pairwise correlation between all the core-cells of the same cluster to the average 350 pairwise correlation of the whole population plus a threshold. This way, we kept only 351 clusters with a minimum size (in terms of core-cells) and with relatively (concerning the 352 population) high within-cluster average correlation. If a cluster failed to pass any of these two criteria, we discarded it from the analysis. Otherwise, we considered the 354 cluster as an ensemble.

355
Matching detected ensembles with ground truth ensembles 356 To evaluate our method with synthetic data, we matched the GT ensembles with the 357 detected ones. This was implemented by computing the correlation between the GT's 358 activation sequence and the detected ensembles. Thus, we looked for the detected 359 ensemble that maximized the correlation with the GT ensembles.

361
We introduced a scalable and computationally efficient method to detect neuronal 362 ensembles from spiking data. Using simple clustering techniques [41], and suitable  finding communities between spiking neurons along time [25,43], or using event-related 375 activity [42]. Despite the usefulness of these methods in their context, our analysis is 376 more general. It relies on grouping the population spiking patterns with no 377 event-related information, allowing us to segment a spiking network's temporal activity 378 under both spontaneous and stimuli-evoked conditions.

379
Regarding the SVD-based method [9], we acknowledge the insights that its 380 application has provided to the study of neural networks. However, it has limitations in 381 terms of computational cost (for relatively small sample sizes, Fig. 4) and parameter 382 tuning. Further, the SVD-based method extracts the temporal sequence of ensembles 383 from the spectral decomposition of the similarity matrix between population patterns, 384 aiming to detect groups of linearly independent patterns. Instead, we use a subset of 385 the first principal components to embed the population patterns and cluster them into 386 that space, with no need to find linear independence between the clusters. Finally, the 387 SVD-based method has no explicit implementation for the evaluation of the 388 within-ensemble correlation, which, in our case, is a critical step to distinguish between 389 a noisy cluster and an ensemble cluster.

390
There is room for further improvement of our method. For example, non-linear spike 391 correlations may produce spurious results in ensemble detection methods that depend 392 on PCA. Alternative approaches can be adapted to other measures of spike train 393 similarity besides linear correlations [44]. In our example of RGCs, while we use the 394 standard bin size for mammalian RGCs, the bin size used to create the spike trains may 395 influence the detected ensembles, and thus different bin sizes can yield different results. 396 Finally, we provided novel evidence in favor of the existence of retinal ensembles that 397 are functionally coupled to the stimuli. However, our purpose was to exemplify our 398 method on a biological spiking network rather than explaining the possible mechanisms 399 involved in the activity of retinal ensembles or their functional implications. Indeed, we 400 consider the study of retinal ensembles as an exciting new research avenue that needs to 401 be developed in the way to understand vision, sensory systems, and more generally, the 402 nervous system. In line with this perspective, we delivered a method that is general 403 enough to be applied to any multi-variate binary data set. Furthermore, it is intuitive 404 and can generate results that are easy to visualize, which should favor their 405 comprehension and general use by the scientific community. Matlab codes and a GUI 406 implementing our new method accompany this article. The GUI and the codes used to validate the method using synthetic data can be found 410 at https://github.com/brincolab/NeuralEnsembles. Valparaíso, at 20-25°C on a 12-h light-dark cycle, with access to food and water 419 ad-libitum. These recordings were performed for other experimental purposes, and for 420 the present work only one recording was used. The corresponding methods of MEA 421 recording have previously been described [45]. In brief, animals were euthanized under 422 deep isofluorane or halothane anesthesia and both eyes were extracted. Then, one of the 423 extracted retinas was diced into quarters while the other was stored in oxygenated (O 2 424 95% CO 2 5%) AMES medium at 32°C in the dark for further experiments. The same Germany), one piece of retina was mounted onto a dialysis membrane then placed into a 428 ring device mounted in a traveling (up/down) cylinder, which was moved to contact the 429 electrode surface of the MEA recording array. Data were processed off-line using the 430 Spiking-Circus spike sorting algorithm [40] with default parameters. Associates, Rochester, USA) and connected to an inverted microscope (Lens 4x, Eclipse 436 TE2000, NIKON, Japan). The image was conformed by 380 x 380 pixels, each covering 437 5µm 2 . To estimate the RGC receptive fields, a checkerboard stimulus (visual white 438 noise) with a block size of 50µm was presented at a rate of 60 Hz for 20 mins, with each 439 block independently taking 0 or 255 (max value) in the pixel value scale. To classify 440 the RGC, a green ON-OFF light stimulus was presented, where each part lasted three 441 seconds, repeated 21 times. For the classification analysis, the first trial was discarded. 442