TY  - JOUR
T1  - scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data
JF  - bioRxiv
DO  - 10.1101/2022.11.24.517598
SP  - 2022.11.24.517598
AU  - Weijian Wang
AU  - Yihui Cen
AU  - Zezhen Lu
AU  - Yueqing Xu
AU  - Tianyi Sun
AU  - Ying Xiao
AU  - Wanlu Liu
AU  - Jingyi Jessica Li
AU  - Chaochen Wang
Y1  - 2022/01/01
UR  - http://biorxiv.org/content/early/2022/11/25/2022.11.24.517598.abstract
N2  - In droplet-based single-cell RNA-seq (scRNA-seq) and single-nucleus RNA-seq (snRNA-seq) assays, systematic contamination of ambient RNA molecules biases the estimation of genuine transcriptional levels. To correct the contamination, several computational methods have been developed. However, these methods do not distinguish the contamination-causing genes and thus either under- or over-corrected the contamination in our in-house snRNA-seq data of virgin and lactating mammary glands. Hence, we developed scCDC as the first method that specifically detects the contamination-causing genes and only corrects the expression counts of these genes. Benchmarked against existing methods on synthetic and real scRNA-seq and snRNA-seq datasets, scCDC achieved the best contamination correction accuracy with minimal data alteration. Moreover, scCDC applies to processed scRNA-seq and snRNA-seq data with empty droplets removed. In conclusion, scCDC is a flexible, accurate decontamination method that detects the contamination-causing genes, corrects the contamination, and avoids the over-correction of other genes. Competing Interest Statement: The authors have declared no competing interest.
ER  - 