PT - JOURNAL ARTICLE
AU - Yi-Xuan Xiong
AU - Meng-Guo Wang
AU - Luonan Chen
AU - Xiao-Fei Zhang
TI - Cell-type annotation with accurate unseen cell-type identification using multiple references
AID - 10.1101/2022.11.17.516980
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.11.17.516980
4099 - http://biorxiv.org/content/early/2022/11/18/2022.11.17.516980.short
4100 - http://biorxiv.org/content/early/2022/11/18/2022.11.17.516980.full
AB - Automated cell-type annotation using a well-annotated single-cell RNA-sequencing (scRNA-seq) reference relies on the diversity of cell types in the reference. However, for technical and biological reasons, new query data of interest may contain unseen cell types that are missing from the reference. When annotating new query data, identifying unseen cell types is fundamental not only for improving annotation accuracy but also for enabling new biological discoveries. Here, we propose mtANN (multiple-reference-based scRNA-seq data annotation), a new method to automatically annotate query data while accurately identifying unseen cell types with the help of multiple references. Key innovations of mtANN include the integration of deep learning and ensemble learning to improve prediction accuracy, and the introduction of a new metric defined from three complementary aspects to identify unseen cell types. We demonstrate the advantages of mtANN over state-of-the-art methods for cell-type annotation and unseen cell-type identification on two benchmark dataset collections, as well as its predictive power on a collection of COVID-19 datasets. Competing Interest Statement: The authors have declared no competing interest.