PT - JOURNAL ARTICLE
AU - Bo Liu
AU - Fang-Xiang Wu
AU - Xiufen Zou
TI - scASK: A novel ensemble framework for classifying cell types based on single-cell RNA-seq data
AID - 10.1101/2020.06.07.138271
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.06.07.138271
4099 - http://biorxiv.org/content/early/2020/07/08/2020.06.07.138271.short
4100 - http://biorxiv.org/content/early/2020/07/08/2020.06.07.138271.full
AB - The Human Cell Atlas (HCA) is a large project that aims to identify all cell types in the human body. Dimension reduction and clustering for identifying cell types from single-cell RNA-sequencing (scRNA-seq) data have become foundational approaches to HCA. The major challenges of current computational analyses are poor performance on large-scale data and sensitivity to initial data. We present a new ensemble framework called Adaptive Slice KNNs (scASK) to address these challenges in analysing high-dimensional scRNA-seq data. scASK consists of three innovative modules, DAS (Data Adaptive Slicing), MCS (Meta Classifiers Selecting) and EMS (Ensemble Mode Switching), which enable scASK to approximate a bias-variance tradeoff beyond classification. Thirteen real scRNA-seq datasets are used to evaluate the performance of scASK. Our experimental results indicate that, compared with five popular classification algorithms, scASK achieves the best accuracy and robustness. In conclusion, adaptive slicing is an effective structural reduction procedure, and scASK provides a novel and robust ensemble framework, especially for classifying cell types based on scRNA-seq data. scASK is publicly available at https://github.com/liubo2358/scASKcmd. Competing Interest Statement: The authors have declared no competing interest.