RT Journal Article
SR Electronic
T1 Unifying single-cell annotations based on the Cell Ontology
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 810234
DO 10.1101/810234
A1 Sheng Wang
A1 Angela Oliveira Pisco
A1 Jim Karkanias
A1 Russ B. Altman
YR 2019
UL http://biorxiv.org/content/early/2019/10/20/810234.abstract
AB Single-cell technologies have rapidly generated an unprecedented amount of data that enables us to understand biological systems at single-cell resolution. However, analyzing datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompanying software for automatically classifying cells into cell types represented by a controlled vocabulary derived from the Cell Ontology. Cell type similarity is inferred from distances in the Cell Ontology, so a key advantage of OnClass is its ability to annotate cell types that are not present in the training set by using the hierarchical structure of the vocabulary space. We applied OnClass to diverse collections of single-cell transcriptomics data from both mouse and human and observed substantial improvement in automated cell type annotation. We further demonstrated how OnClass can be used to identify marker genes for cell types both present in and absent from the training set. This suggests that OnClass can be used as a tool to associate marker genes with each term of the Cell Ontology, offering the possibility of refining the Cell Ontology using a data-centric approach.