Abstract
We introduce AtacWorks (https://github.com/clara-genomics/AtacWorks), a method to denoise and identify accessible chromatin regions from low-coverage or low-quality ATAC-seq data. AtacWorks uses a deep neural network to learn a mapping between noisy ATAC-seq data and corresponding higher-coverage or higher-quality data. To demonstrate the utility of AtacWorks, we train a model on data from four blood cell types and show that this model accurately denoises, and identifies peaks from, low-coverage bulk sequencing data from different individuals, cell types, and experimental conditions. Further, we show that the deep learning model can be generalized to denoise other kinds of input: low-quality data, aggregate single-cell ATAC-seq profiles, and Tn5 insertion sites for transcription factor footprinting. Finally, we apply our deep learning approach to denoise single-cell ATAC-seq data from hematopoietic stem cells and identify differentially accessible regulatory elements between rare lineage-primed cell subpopulations.