RT Journal Article
SR Electronic
T1 SCAN-ATAC-Sim: a scalable and efficient method for simulating single-cell ATAC-seq data from bulk-tissue experiments
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.05.29.123638
DO 10.1101/2020.05.29.123638
A1 Zhanlin Chen
A1 Jing Zhang
A1 Jason Liu
A1 Zixuan Zhang
A1 Jiangqi Zhu
A1 Donghoon Lee
A1 Min Xu
A1 Mark Gerstein
YR 2020
UL http://biorxiv.org/content/early/2020/11/03/2020.05.29.123638.abstract
AB Summary: scATAC-seq is a powerful approach for characterizing cell-type-specific regulatory landscapes. However, it is difficult to benchmark the performance of various scATAC-seq analysis techniques (such as clustering and deconvolution) without having a priori a known set of gold-standard cell types. To simulate scATAC-seq experiments with known cell-type labels, we introduce an efficient and scalable scATAC-seq simulation method (SCAN-ATAC-Sim) that down-samples bulk ATAC-seq data (e.g., from representative cell lines or tissues). Our protocol uses a consistent but tunable signal-to-noise ratio across cell types in a scATAC-seq simulation for integrating bulk experiments with different levels of background noise, and it independently samples twice without replacement to account for the diploid genome. Because it uses an efficient weighted reservoir sampling algorithm and is highly parallelizable with OpenMP, our implementation in C++ allows millions of cells to be simulated in less than an hour on a laptop computer. Availability: SCAN-ATAC-Sim is available at scan-atac-sim.gersteinlab.org. Contact: pi{at}gersteinlab.org. Supplementary information: Supplementary data are available at Bioinformatics online. Competing Interest Statement: The authors have declared no competing interest.
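
The abstract mentions weighted reservoir sampling and two independent draws without replacement for the diploid genome, but it does not say which variant is used. The following is a minimal Python sketch, not the authors' C++ implementation. It assumes the Efraimidis-Spirakis A-Res scheme, and the names `weighted_reservoir_sample`, `simulate_cell`, `reads_per_copy`, and the `bulk_signal` values are hypothetical. The tunable signal-to-noise ratio and OpenMP parallelism described in the abstract are not modeled here.

```python
import heapq
import random


def weighted_reservoir_sample(weights, k, rng):
    """Return up to k distinct indices, favoring larger weights.

    Assumed variant: Efraimidis-Spirakis A-Res. Each item gets the key
    u ** (1 / w) with u ~ U(0, 1), and the k largest keys are kept.
    """
    heap = []  # min-heap of (key, index); the smallest kept key is on top
    for index, weight in enumerate(weights):
        if weight <= 0:
            continue  # zero-weight items can never be selected
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, index))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, index))
    return sorted(index for _, index in heap)


def simulate_cell(peak_weights, reads_per_copy, rng):
    """Hypothetical helper: one diploid cell as per-peak counts in {0, 1, 2}."""
    counts = [0] * len(peak_weights)
    for _ in range(2):  # two independent draws, one per chromosome copy
        picked = weighted_reservoir_sample(peak_weights, reads_per_copy, rng)
        for index in picked:
            counts[index] += 1
    return counts


rng = random.Random(0)
bulk_signal = [5.0, 0.0, 1.0, 8.0, 0.5, 3.0]  # made-up bulk peak intensities
cell = simulate_cell(bulk_signal, reads_per_copy=2, rng=rng)
print(cell)
```

Because each draw is without replacement, a peak can be hit at most once per copy, so every simulated count is 0, 1, or 2. The heap keeps memory proportional to `k`, not to the number of peaks, which is the property that makes reservoir-style sampling attractive for large peak sets.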