New Results

Benchmarking Automated Cell Type Annotation Tools for Single-cell ATAC-seq Data

Yuge Wang, Xingzhi Sun, Hongyu Zhao
doi: https://doi.org/10.1101/2022.10.05.511014
Yuge Wang
1 Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States
Xingzhi Sun
2 Department of Statistics and Data Science, Yale University, New Haven, CT, United States
Hongyu Zhao
1 Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States
3 Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
For correspondence: hongyu.zhao@yale.edu

Abstract

As single-cell chromatin accessibility profiling methods advance, scATAC-seq has become increasingly important for studying candidate regulatory genomic regions and their roles in developmental, evolutionary and disease processes. At the same time, cell type annotation is critical for understanding the cellular composition of complex tissues and for identifying potential novel cell types. However, most existing methods for automated cell type annotation are designed to transfer labels from an annotated scRNA-seq dataset to another scRNA-seq dataset, and it is not clear whether they can be adapted to annotate scATAC-seq data. Several methods have recently been proposed for label transfer from scRNA-seq data to scATAC-seq data, but benchmarking studies of their performance are lacking. Here, we evaluated five scATAC-seq annotation methods on both classification accuracy and scalability, using publicly available single-cell datasets from mouse and human tissues including brain, lung, kidney, PBMC and BMMC. Using the BMMC data as a basis, we further investigated how these methods perform across different data sizes, mislabeling rates, sequencing depths and numbers of cell types unique to scATAC-seq. Bridge integration, the only method that requires additional multimodal data and does not need gene activity calculation, was the best method overall and was robust to changes in data size, mislabeling rate and sequencing depth. Conos was the most time- and memory-efficient method but had the worst prediction accuracy. scJoint tended to assign cells to similar cell types; it performed relatively poorly on complex datasets with deep annotations but better on datasets with only major label annotations. The performance of scGCN and Seurat v3 was moderate, but scGCN was the most time-consuming method and, for cell types unique to scATAC-seq, performed most similarly to random classifiers.
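
To make the evaluation setup described above more concrete, the following is a minimal sketch of how predicted scATAC-seq labels might be scored against reference labels, and how a fixed mislabeling rate might be injected into reference labels for a robustness experiment. It is not the authors' pipeline: the abstract does not name the metrics used, so overall accuracy and macro-averaged F1 are assumptions, and the function names `accuracy`, `macro_f1` and `mislabel` are hypothetical.

```python
import random


def accuracy(true_labels, predicted_labels):
    """Fraction of cells whose predicted label matches the reference label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of per-cell-type F1 over the cell types in true_labels."""
    pairs = list(zip(true_labels, predicted_labels))
    scores = []
    for cell_type in sorted(set(true_labels)):
        tp = sum(t == cell_type and p == cell_type for t, p in pairs)
        fp = sum(t != cell_type and p == cell_type for t, p in pairs)
        fn = sum(t == cell_type and p != cell_type for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mislabel(labels, rate, seed=0):
    """Reassign a fraction `rate` of labels to a different, randomly chosen type.

    Assumption: a uniform noise model over the other cell types; the source
    does not state how mislabeling was simulated. Requires at least two
    distinct cell types when rate > 0.
    """
    rng = random.Random(seed)
    cell_types = sorted(set(labels))
    noisy = list(labels)
    n_flip = round(rate * len(labels))
    for i in rng.sample(range(len(labels)), n_flip):
        noisy[i] = rng.choice([c for c in cell_types if c != labels[i]])
    return noisy


# Toy example with made-up labels (not data from the study).
reference = ["B", "B", "T", "T", "NK", "NK"]
predicted = ["B", "B", "T", "NK", "NK", "NK"]
print(accuracy(reference, predicted))
print(macro_f1(reference, predicted))
print(mislabel(reference, rate=0.5))
```

Macro-averaging weights every cell type equally, so rare cell types influence the score as much as abundant ones; overall accuracy, by contrast, is dominated by the most common types. Reporting both is one way to expose a method that collapses rare populations into similar, larger ones.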

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://github.com/AprilYuge/ATAC-annotation-benchmark

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted October 10, 2022.
Subject Area

  • Bioinformatics