PT - JOURNAL ARTICLE
AU - Yuan, Han
AU - Kelley, David R
TI - scBasset: Sequence-based modeling of single cell ATAC-seq using convolutional neural networks
AID - 10.1101/2021.09.08.459495
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.09.08.459495
4099 - http://biorxiv.org/content/early/2021/09/10/2021.09.08.459495.short
4100 - http://biorxiv.org/content/early/2021/09/10/2021.09.08.459495.full
AB - Single cell ATAC-seq (scATAC) shows great promise for studying cellular heterogeneity in epigenetic landscapes, but there remain significant challenges in the analysis of scATAC data due to the inherent high dimensionality and sparsity. Here we introduce scBasset, a sequence-based convolutional neural network method to model scATAC data. We show that by leveraging the DNA sequence information underlying accessibility peaks and the expressiveness of a neural network model, scBasset achieves state-of-the-art performance across a variety of tasks on scATAC and single cell multiome datasets, including cell type identification, scATAC profile denoising, data integration across assays, and transcription factor activity inference. Competing Interest Statement: H.Y. and D.R.K. are paid employees of Calico Life Sciences.