PT - JOURNAL ARTICLE
AU - Lifei Wang
AU - Rui Nie
AU - Ruyue Xin
AU - Jiang Zhang
AU - Jun Cai
TI - scCapsNet: a deep learning classifier with the capability of interpretable feature extraction, applicable for single cell RNA data analysis
AID - 10.1101/506642
DP - 2019 Jan 01
TA - bioRxiv
PG - 506642
4099 - http://biorxiv.org/content/early/2019/05/20/506642.short
4100 - http://biorxiv.org/content/early/2019/05/20/506642.full
AB - Deep learning methods have recently been applied to biological data and have greatly advanced biological research. However, the interpretability of these methods still needs improvement. Here, for the first time, we present scCapsNet, a fully interpretable deep learning model adapted from CapsNet. The scCapsNet model retains the capsule part of CapsNet but replaces the convolutional neural network part with several parallel fully connected neural networks. We apply scCapsNet to scRNA-seq data. The results show that scCapsNet performs well as a classifier and that the parallel fully connected neural networks function as feature extractors, as we expected. The scCapsNet model provides the contribution of each extracted feature to cell type recognition. Evidence shows that some extracted features are nearly orthogonal to each other. After training, by analyzing the internal weights of each neural network that connects the inputs to a primary capsule, together with the contribution of each extracted feature to cell type recognition, the scCapsNet model can relate gene sets from the inputs to cell types. A specific gene set is responsible for the identification of its corresponding cell type but does not affect the model's recognition of other cell types. Many well-studied cell type markers appear in the gene set of the corresponding cell type. The internal weights for those well-studied cell type markers differ across primary capsules.
The internal weights of the neural network connected to a primary capsule can be viewed as an embedding for genes that converts genes to real-valued, low-dimensional vectors. Furthermore, we mix the RNA expression data of two cells of different cell types and then use the scCapsNet model trained on non-mixed data to predict the cell types in the mixed data. The scCapsNet model can predict the cell types in a cell mixture with high accuracy.
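
The architecture described in the abstract (several parallel fully connected networks feeding primary capsules, followed by the capsule part of CapsNet) can be illustrated with a minimal NumPy forward pass. This is a sketch under stated assumptions, not the authors' implementation: the ReLU activation, the layer sizes, the random weights, and the names `primary_capsules`, `route`, and `W_route` are all hypothetical. The routing step follows the standard CapsNet dynamic routing procedure, and reading the coupling coefficients as the "contribution of each extracted feature" is one plausible interpretation rather than something the abstract states.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-9):
    """CapsNet squashing: keep the direction, map the length into [0, 1)."""
    sq_norm = np.sum(v * v, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * v / np.sqrt(sq_norm + eps)

def primary_capsules(x, weights, biases):
    """Apply each parallel fully connected network to the same input.

    x:       (n_cells, n_genes) expression matrix
    weights: list of (n_genes, caps_dim) arrays, one per primary capsule
    biases:  list of (caps_dim,) arrays
    returns: (n_cells, n_primary, caps_dim)
    """
    # ReLU is an assumption; the abstract does not name the activation.
    outs = [np.maximum(x @ W + b, 0.0) for W, b in zip(weights, biases)]
    return np.stack(outs, axis=1)

def route(u_hat, n_iter=3):
    """Standard CapsNet dynamic routing over prediction vectors.

    u_hat:   (n_cells, n_primary, n_types, out_dim)
    returns: type capsules (n_cells, n_types, out_dim) and
             coupling coefficients (n_cells, n_primary, n_types)
    """
    b = np.zeros(u_hat.shape[:3])
    for _ in range(n_iter):
        e = np.exp(b - b.max(axis=2, keepdims=True))
        c = e / e.sum(axis=2, keepdims=True)       # softmax over cell types
        s = np.sum(c[..., None] * u_hat, axis=1)   # weighted sum over primary capsules
        v = squash(s)
        b = b + np.sum(u_hat * v[:, None, :, :], axis=-1)  # agreement update
    return v, c

# Toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
n_cells, n_genes, n_primary, caps_dim, n_types, out_dim = 4, 50, 6, 8, 3, 16
x = rng.random((n_cells, n_genes))
weights = [rng.normal(scale=0.1, size=(n_genes, caps_dim)) for _ in range(n_primary)]
biases = [np.zeros(caps_dim) for _ in range(n_primary)]
W_route = rng.normal(scale=0.1, size=(n_primary, n_types, caps_dim, out_dim))

u = primary_capsules(x, weights, biases)
u_hat = np.einsum("npd,ptdo->npto", u, W_route)   # per-capsule predictions for each type
v, c = route(u_hat)
lengths = np.linalg.norm(v, axis=-1)              # capsule length acts as the class score
pred = lengths.argmax(axis=1)
```

In this sketch each entry of `weights` has shape `(n_genes, caps_dim)`, so row `g` of a weight matrix is a `caps_dim`-dimensional vector for gene `g`. That is the sense in which the weights connected to a primary capsule can be read as a gene embedding.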
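
The cell mixture experiment can be sketched in the same way. The abstract does not say how two expression profiles are combined or how two cell types are read out, so the averaging scheme and the top-two readout below are assumptions, and `mix_cells` and `top2_types` are hypothetical helper names. A toy stand-in replaces the capsule lengths that a trained model would produce.

```python
import numpy as np

def mix_cells(x, labels, rng, n_pairs):
    """Average the profiles of random pairs of cells with different labels.

    Averaging is an assumed mixing scheme; the abstract does not specify one.
    """
    if len(set(int(l) for l in labels)) < 2:
        raise ValueError("need at least two distinct cell types to mix")
    mixed, pair_labels = [], []
    while len(mixed) < n_pairs:
        i, j = rng.choice(len(x), size=2, replace=False)
        if labels[i] != labels[j]:
            mixed.append(0.5 * (x[i] + x[j]))
            pair_labels.append(tuple(sorted((int(labels[i]), int(labels[j])))))
    return np.array(mixed), pair_labels

def top2_types(lengths):
    """Return the two cell types with the largest scores for each mixture."""
    idx = np.argsort(lengths, axis=1)[:, -2:]
    return [tuple(sorted(row)) for row in idx.tolist()]

# Toy data: one marker gene per cell type, two cells per type.
rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 1, 2, 2])
x = np.eye(3)[labels] * 10.0
mixed, pairs = mix_cells(x, labels, rng, n_pairs=5)

# Stand-in for the capsule lengths a model trained on non-mixed data would give.
lengths = mixed / 10.0
predicted_pairs = top2_types(lengths)
```

With a real model, `lengths` would come from the type capsules computed on `mixed`, and accuracy on mixtures would be the fraction of entries where `predicted_pairs` matches `pairs`.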