PT - JOURNAL ARTICLE
AU - Sheng Wang
AU - Angela Oliveira Pisco
AU - Jim Karkanias
AU - Russ B. Altman
TI - Unifying single-cell annotations based on the Cell Ontology
AID - 10.1101/810234
DP - 2019 Jan 01
TA - bioRxiv
PG - 810234
4099 - http://biorxiv.org/content/early/2019/10/20/810234.short
4100 - http://biorxiv.org/content/early/2019/10/20/810234.full
AB - Single-cell technologies have rapidly generated an unprecedented amount of data that enables us to understand biological systems at single-cell resolution. However, analyzing datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompanying software for automatically classifying cells into cell types represented by a controlled vocabulary derived from the Cell Ontology. Cell type similarity is inferred from distances in the Cell Ontology, so a key advantage of OnClass is its ability to annotate cell types that are not present in the training set by using the hierarchical structure of the vocabulary space. We applied OnClass to diverse collections of single-cell transcriptomics data from both mouse and human and observed substantial improvement in automated cell type annotation. We further demonstrated how OnClass can be used to identify marker genes for cell types both present in and absent from the training set, suggesting that OnClass can be used as a tool to associate marker genes with each term of the Cell Ontology, offering the possibility of refining the Cell Ontology using a data-centric approach.