PT - JOURNAL ARTICLE
AU - Chanwoo Kim
AU - Hanbin Lee
AU - Juhee Jeong
AU - Keehoon Jung
AU - Buhm Han
TI - MarcoPolo: a clustering-free approach to the exploration of differentially expressed genes along with group information in single-cell RNA-seq data
AID - 10.1101/2020.11.23.393900
DP - 2021 Jan 01
TA - bioRxiv
PG - 2020.11.23.393900
4099 - http://biorxiv.org/content/early/2021/01/12/2020.11.23.393900.short
4100 - http://biorxiv.org/content/early/2021/01/12/2020.11.23.393900.full
AB - A common approach to analyzing single-cell RNA-sequencing data is to cluster cells first and then identify differentially expressed genes based on the clustering result. However, clustering has an innate uncertainty and can be imperfect, undermining the reliability of differential expression analysis results. To overcome this challenge, we present MarcoPolo, a clustering-free approach to exploring differentially expressed genes. To find informative genes without clustering, MarcoPolo exploits the bimodality of gene expression to learn the group information of the cells with respect to the expression level directly from given data. Using simulations and real data analyses, we showed that our method puts biologically informative genes at high ranks more robustly than other existing methods. As our method provides information on how cells can be grouped for each gene, it can help identify cell types that are not separated well in the standard clustering process. Our method can also be used as a feature selection method to improve the robustness of the dimension reduction against changes in the parameters involved in the process. Competing Interest Statement: Buhm Han is the CTO of Genealogy Inc.