RT Journal Article
SR Electronic
T1 scCapsNet: a deep learning classifier with the capability of interpretable feature extraction, applicable for single cell RNA data analysis
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 506642
DO 10.1101/506642
A1 Wang, Lifei
A1 Nie, Rui
A1 Xin, Ruyue
A1 Zhang, Jiang
A1 Cai, Jun
YR 2019
UL http://biorxiv.org/content/early/2019/05/20/506642.abstract
AB Recently, deep learning methods have been applied to biological data and have greatly advanced biological research. However, the interpretability of deep learning methods still needs improvement. Here, for the first time, we present scCapsNet, a fully interpretable deep learning model adapted from CapsNet. The scCapsNet model retains the capsule parts of CapsNet but replaces the convolutional neural network part with several parallel fully connected neural networks. We apply scCapsNet to scRNA-seq data. The results show that scCapsNet performs well as a classifier and that the parallel fully connected neural networks function as feature extractors, as we expected. The scCapsNet model provides the contribution of each extracted feature to cell type recognition. Evidence shows that some extracted features are nearly orthogonal to each other. After training, by analyzing the internal weights of each neural network connecting the inputs to a primary capsule, together with the information about the contribution of each extracted feature to cell type recognition, the scCapsNet model can relate gene sets from the inputs to cell types. Each specific gene set is responsible for identifying its corresponding cell type but does not affect the model's recognition of other cell types. Many well-studied cell type markers appear in the gene set of the corresponding cell type. The internal weights of the neural networks for those well-studied cell type markers differ between primary capsules.
The internal weights of the neural network connected to a primary capsule can be viewed as an embedding of genes, converting genes to real-valued, low-dimensional vectors. Furthermore, we mix the RNA expression data of two cells of different cell types and then use the scCapsNet model trained on non-mixed data to predict the cell types in the mixed data. The scCapsNet model predicts the cell types in a cell mixture with high accuracy.
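
The structure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the ReLU activation, the random untrained weights, the toy sizes, the element-wise averaging used to mix two cells, and every function name (`make_dense`, `primary_capsules`, `gene_embedding`, `mix_cells`) are assumptions made only for illustration. The capsule routing layer and training are omitted. The sketch shows several parallel fully connected networks each producing one primary capsule from the same expression vector, and how the weights attached to one gene can be read as that gene's low-dimensional embedding.

```python
import random


def make_dense(n_in, n_out, rng):
    """Random (untrained) weights for one hypothetical sub-network: n_out rows, n_in columns."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def dense_relu(weights, x):
    """Fully connected layer followed by ReLU (the activation is an assumption)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def primary_capsules(networks, expression):
    """Feed the same expression vector to every parallel network; each output is one primary capsule."""
    return [dense_relu(weights, expression) for weights in networks]


def gene_embedding(weights, gene_index):
    """Column of one network's weight matrix: a real-valued vector for a single gene."""
    return [row[gene_index] for row in weights]


def mix_cells(cell_a, cell_b):
    """Assumed mixing rule: element-wise mean of two cells' expression vectors."""
    return [(a + b) / 2.0 for a, b in zip(cell_a, cell_b)]


rng = random.Random(0)
n_genes, n_capsules, capsule_dim = 6, 3, 4  # toy sizes, not taken from the paper
networks = [make_dense(n_genes, capsule_dim, rng) for _ in range(n_capsules)]

cell_a = [5.0, 0.0, 1.0, 0.0, 2.0, 0.0]  # made-up expression values
cell_b = [0.0, 3.0, 0.0, 4.0, 0.0, 1.0]
mixed = mix_cells(cell_a, cell_b)

capsules = primary_capsules(networks, mixed)
embedding_of_gene_2 = gene_embedding(networks[0], 2)
```

In a trained model, comparing `gene_embedding` vectors across the parallel networks would be one way to inspect which genes drive which primary capsule. With the random weights used here, the values carry no biological meaning.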