TY  - JOUR
T1  - The rise of sparser single-cell RNAseq datasets; consequences and opportunities
JF  - bioRxiv
DO  - 10.1101/2022.05.20.492823
SP  - 2022.05.20.492823
AU  - Gerard A. Bouland
AU  - Ahmed Mahfouz
AU  - Marcel J.T. Reinders
Y1  - 2022/01/01
UR  - http://biorxiv.org/content/early/2022/05/21/2022.05.20.492823.abstract
N2  - There is an exponential increase in the number of cells measured in single-cell RNA sequencing (scRNAseq) datasets. Concurrently, scRNAseq datasets are becoming increasingly sparse, as more zero counts are measured for many genes. We argue that with increasing sparsity the binarized representation of gene expression becomes as informative as count-based expression. We show that downstream analyses based on binarized gene expression give similar results to analyses based on count-based expression. Moreover, with a binarized representation, 17-fold more cells can be analyzed using the same amount of computational resources. Based on these observations, we recommend the development of specialized tools with bit-aware implementations for downstream analysis tasks, creating opportunities to obtain a more fine-grained resolution of biological heterogeneity. Competing Interest Statement: The authors have declared no competing interest.
ER  - 