New Results

MarcoPolo: a clustering-free approach to the exploration of differentially expressed genes along with group information in single-cell RNA-seq data

Chanwoo Kim, Hanbin Lee, Juhee Jeong, Keehoon Jung, Buhm Han
doi: https://doi.org/10.1101/2020.11.23.393900
Chanwoo Kim
1 Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
Hanbin Lee
2 Department of Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
Juhee Jeong
3 Department of Biomedical Sciences, BK21 Plus Biomedical Science Project, Seoul National University College of Medicine, Seoul, Republic of Korea
Keehoon Jung
3 Department of Biomedical Sciences, BK21 Plus Biomedical Science Project, Seoul National University College of Medicine, Seoul, Republic of Korea
4 Department of Anatomy and Cell Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
5 Institute of Allergy and Clinical Immunology, Seoul National University Medical Research Center, Seoul, Republic of Korea
Buhm Han
3 Department of Biomedical Sciences, BK21 Plus Biomedical Science Project, Seoul National University College of Medicine, Seoul, Republic of Korea
6 Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
For correspondence: buhm.han@snu.ac.kr

Abstract

A common approach to analyzing single-cell RNA-sequencing data is to cluster cells first and then identify differentially expressed genes based on the clustering result. However, clustering is inherently uncertain and can be imperfect, which undermines the reliability of the downstream differential expression analysis. To overcome this challenge, we present MarcoPolo, a clustering-free approach to exploring differentially expressed genes. To find informative genes without clustering, MarcoPolo exploits the bimodality of gene expression to learn, directly from the data, how cells group according to each gene's expression level. Using simulations and real data analyses, we show that our method ranks biologically informative genes highly and does so more robustly than existing methods. Because our method reports how cells can be grouped for each gene, it can help identify cell types that are not well separated by standard clustering. Our method can also be used for feature selection, making dimension reduction more robust to changes in its parameters.
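
The idea of using per-gene bimodality to group cells can be illustrated with a small sketch. The abstract does not state which statistical model is used, so the code below assumes, purely for illustration, a two-component Poisson mixture fitted by EM to one gene's counts. The function names (fit_two_poisson, bimodality_gain) and the scoring rule (log-likelihood gain over a single Poisson) are hypothetical and are not taken from the MarcoPolo package.

```python
import numpy as np
from math import lgamma


def fit_two_poisson(counts, n_iter=200, tol=1e-8):
    """Fit a two-component Poisson mixture to one gene's counts with EM.

    Illustrative assumption only: the mixture family and this interface
    are not confirmed by the source text.
    """
    x = np.asarray(counts, dtype=float)
    # Initialise the two rates from the lower and upper halves of the sorted data.
    srt = np.sort(x)
    half = len(x) // 2
    lam = np.array([srt[:half].mean() + 0.1, srt[half:].mean() + 1.0])
    pi = np.array([0.5, 0.5])
    log_fact = np.array([lgamma(v + 1.0) for v in x])  # log(x!)
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: log of weight * Poisson pmf for each cell and component.
        log_p = np.log(pi) + x[:, None] * np.log(lam) - lam - log_fact[:, None]
        m = log_p.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
        resp = np.exp(log_p - log_norm)  # posterior membership per cell
        ll = float(log_norm.sum())       # log-likelihood at current parameters
        # M-step: update weights and rates (small constants keep them positive).
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        lam = (resp * x[:, None]).sum(axis=0) / nk + 1e-8
        if ll - prev < tol:
            break
        prev = ll
    on = int(np.argmax(lam))  # component with the higher expression rate
    return {"rates": lam, "weights": pi, "loglik": ll,
            "on_cells": resp[:, on] > 0.5}


def bimodality_gain(counts):
    """Log-likelihood gain of the two-component fit over a single Poisson."""
    x = np.asarray(counts, dtype=float)
    lam1 = max(x.mean(), 1e-8)
    log_fact = np.array([lgamma(v + 1.0) for v in x])
    ll1 = float((x * np.log(lam1) - lam1 - log_fact).sum())
    return fit_two_poisson(x)["loglik"] - ll1


# Usage on simulated counts: one gene expressed in a subset of cells,
# one gene expressed uniformly across all cells.
rng = np.random.default_rng(0)
marker_gene = np.concatenate([rng.poisson(10.0, 100), rng.poisson(0.5, 200)])
uniform_gene = rng.poisson(3.0, 300)
scores = {"marker": bimodality_gain(marker_gene),
          "uniform": bimodality_gain(uniform_gene)}
groups = fit_two_poisson(marker_gene)["on_cells"]  # per-cell grouping for the gene
```

Ranking genes by such a score favors genes whose expression splits the cells into two groups, and the per-cell assignments give the grouping for each gene without any prior clustering step. A real analysis would also need to account for library size and overdispersion, which this sketch ignores.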

Competing Interest Statement

Buhm Han is the CTO of Genealogy Inc.

Footnotes

  • Emails: Chanwoo Kim, ch6845@snu.ac.kr; Hanbin Lee, hanbin973@snu.ac.kr; Juhee Jeong, juhee.jeong@snu.ac.kr; Keehoon Jung, keehoon.jung@snu.ac.kr; Buhm Han, buhm.han@snu.ac.kr

  • https://github.com/ch6845/MarcoPolo

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 12, 2021.

Subject Area

  • Bioinformatics