TY  - JOUR
T1  - SSCC: a novel computational framework for rapid and accurate clustering large single cell RNA-seq data
JF  - bioRxiv
DO  - 10.1101/344242
SP  - 344242
AU  - Xianwen Ren
AU  - Liangtao Zheng
AU  - Zemin Zhang
Y1  - 2018/01/01
UR  - http://biorxiv.org/content/early/2018/06/13/344242.abstract
N2  - Clustering is a prevalent means of analyzing single-cell RNA sequencing data, but the rapidly expanding data volume can make this process computationally challenging. New methods for both accurate and efficient clustering are urgently needed. Here we propose a new clustering framework based on random projection and feature construction for large-scale single-cell RNA sequencing data, which greatly improves the clustering accuracy, robustness and computational efficiency of various state-of-the-art algorithms benchmarked on multiple real datasets. On a dataset of 68,578 human blood cells, our method achieved a 20% improvement in clustering accuracy and a 50-fold acceleration while using only 66% of the memory consumed by the widely used software package SC3. Compared with k-means, the accuracy improvement can reach 3-fold, depending on the dataset. An R implementation of the framework is available from https://github.com/Japrin/sscClust.
ER  - 