TY  - JOUR
T1  - Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience
JF  - bioRxiv
DO  - 10.1101/273128
SP  - 273128
AU  - Emily L. Mackevicius
AU  - Andrew H. Bahle
AU  - Alex H. Williams
AU  - Shijie Gu
AU  - Natalia I. Denissenko
AU  - Mark S. Goldman
AU  - Michale S. Fee
Y1  - 2018/01/01
UR  - http://biorxiv.org/content/early/2018/03/02/273128.abstract
N2  - The ability to identify interpretable, low-dimensional features that capture the dynamics of large-scale neural recordings is a major challenge in neuroscience. Dynamics that include repeated temporal patterns (which we call sequences) are not succinctly captured by traditional dimensionality reduction techniques such as principal components analysis (PCA) and non-negative matrix factorization (NMF). The presence of neural sequences is commonly demonstrated using visual display of trial-averaged firing rates [15, 32, 19]. However, the field suffers from a lack of task-independent, unsupervised tools for consistently identifying sequences directly from neural data, and cross-validating these sequences on held-out data. We propose a tool that extends a convolutional NMF technique to prevent its common failure modes. Our method, which we call seqNMF, provides a framework for extracting sequences from a dataset, and is easily cross-validated to assess the significance of each extracted factor. We apply seqNMF to recover sequences in both a previously published dataset from rat hippocampus and a new dataset from the songbird pre-motor area, HVC. In the hippocampal data, our algorithm automatically identifies neural sequences that match those calculated manually by reference to behavioral events [15, 32]. The second dataset was recorded in birds that never heard a tutor, and therefore sang pathologically variable songs. Despite this variable behavior, seqNMF is able to discover stereotyped neural sequences. These sequences are deployed in an overlapping and disorganized manner, strikingly different from what is seen in tutored birds. Thus, by identifying temporal structure directly from neural data, seqNMF can enable dissection of complex neural circuits with noisy or changing behavioral readouts.
ER  - 