RT Journal Article
SR Electronic
T1 Cell-type annotation with accurate unseen cell-type identification using multiple references
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.11.17.516980
DO 10.1101/2022.11.17.516980
A1 Yi-Xuan Xiong
A1 Meng-Guo Wang
A1 Luonan Chen
A1 Xiao-Fei Zhang
YR 2022
UL http://biorxiv.org/content/early/2022/11/18/2022.11.17.516980.abstract
AB Automated cell-type annotation using a well-annotated single-cell RNA-sequencing (scRNA-seq) reference relies on the diversity of cell types in the reference. However, for technical and biological reasons, new query data of interest may contain unseen cell types that are missing from the reference. When annotating new query data, identifying unseen cell types is fundamental not only to improving annotation accuracy but also to enabling new biological discoveries. Here, we propose mtANN (multiple-reference-based scRNA-seq data annotation), a new method to automatically annotate query data while accurately identifying unseen cell types with the help of multiple references. Key innovations of mtANN include the integration of deep learning and ensemble learning to improve prediction accuracy, and the introduction of a new metric defined from three complementary aspects to identify unseen cell types. We demonstrate the advantages of mtANN over state-of-the-art methods for cell-type annotation and unseen cell-type identification on two benchmark dataset collections, as well as its predictive power on a collection of COVID-19 datasets. Competing Interest Statement: The authors have declared no competing interest.