RT Journal Article
SR Electronic
T1 Embracing the dropouts in single-cell RNA-seq data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 468025
DO 10.1101/468025
A1 Peng Qiu
YR 2018
UL http://biorxiv.org/content/early/2018/11/17/468025.abstract
AB One primary reason that makes the analysis of single-cell RNA-seq data challenging is dropouts, where the data only captures a small fraction of the transcriptome of each cell. Many computational algorithms developed for single-cell RNA-seq adopted gene selection and dimension reduction strategies to address the dropouts. Here, an opposite view is explored. Instead of treating dropouts as a problem to be fixed, we embrace it as a useful signal for defining cell types. We present an iterative co-occurrence clustering algorithm that works with binarized single-cell RNA-seq count data. Surprisingly, although all the quantitative information is removed after the data is binarized, co-occurrence clustering of the binarized data is able to effectively identify cell populations, as well as cell-type specific pathways. We demonstrate that the binary dropout patterns of the data provides not only overlapping but also complementary information compared to the quantitative gene expression counts in single-cell RNA-seq data.