New Results

Bayesian Inference for a Generative Model of Transcriptome Profiles from Single-cell RNA Sequencing

Romain Lopez, Jeffrey Regier, Michael Cole, Michael Jordan, Nir Yosef
doi: https://doi.org/10.1101/292037
Romain Lopez
1Department of Electrical Engineering and Computer Science, University of California, Berkeley
Jeffrey Regier
1Department of Electrical Engineering and Computer Science, University of California, Berkeley
Michael Cole
2Department of Physics, University of California, Berkeley
Michael Jordan
1Department of Electrical Engineering and Computer Science, University of California, Berkeley
3Department of Statistics, University of California, Berkeley
Nir Yosef
1Department of Electrical Engineering and Computer Science, University of California, Berkeley
5Ragon Institute of MGH, MIT and Harvard
6Chan-Zuckerberg Biohub Investigator

Abstract

Transcriptome profiles of individual cells reflect true and often unexplored biological diversity, but they are also affected by both biological and technical noise. This raises the need to model the resulting uncertainty explicitly and to account for it in any downstream analysis, such as dimensionality reduction, clustering, and differential expression. Here, we introduce Single-cell Variational Inference (scVI), a scalable framework for probabilistic representation and analysis of gene expression in single cells. Our model uses variational inference and stochastic optimization of deep neural networks to approximate the parameters that govern the distribution of expression values of each gene in every cell, using a non-linear mapping between the observations and a low-dimensional latent space.

By doing so, scVI pools information between similar cells or genes while accounting for nuisance factors of variation such as batch effects and limited sensitivity. To evaluate scVI, we conducted a comprehensive comparison with existing methods for distributional modeling and dimensionality reduction, all of which rely on generalized linear models. We first show that scVI scales to over one million cells, whereas competing algorithms can process at most tens of thousands of cells. Next, we show that scVI fits unseen data more closely and imputes missing data more accurately, both indicative of better generalization capacity. We then use scVI to conduct a set of fundamental analysis tasks (including batch correction, visualization, clustering, and differential expression) and demonstrate its accuracy in comparison with the state-of-the-art tools for each task. scVI is publicly available and can be readily used as a principled and inclusive solution for multiple tasks in single-cell RNA sequencing data analysis.
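
The modeling idea summarized above (a non-linear mapping between observed counts and a low-dimensional latent space, fit by variational inference with stochastic optimization) can be illustrated with a small sketch. This is not the scVI implementation. It assumes a Poisson likelihood as a stand-in (the abstract does not name the count distribution), treats each cell's total count as an observed scaling factor, and uses single-hidden-layer networks; the names `init_params` and `elbo_estimate` are hypothetical. It computes a one-sample Monte Carlo estimate of the evidence lower bound (ELBO) for a mini-batch of cells, which is the quantity a stochastic optimizer would maximize.

```python
import numpy as np


def init_params(n_genes, n_latent, n_hidden, rng, scale=0.1):
    """Randomly initialise encoder and decoder weights (illustrative only)."""
    return {
        "enc_w": rng.normal(0.0, scale, (n_genes, n_hidden)),
        "enc_mu": rng.normal(0.0, scale, (n_hidden, n_latent)),
        "enc_logvar": rng.normal(0.0, scale, (n_hidden, n_latent)),
        "dec_w": rng.normal(0.0, scale, (n_latent, n_hidden)),
        "dec_out": rng.normal(0.0, scale, (n_hidden, n_genes)),
    }


def elbo_estimate(params, counts, rng):
    """One-sample ELBO estimate for a mini-batch of cells (rows of `counts`)."""
    # Encoder: map log-transformed counts to the mean and log-variance
    # of a Gaussian approximate posterior over the latent variable z.
    hidden = np.tanh(np.log1p(counts) @ params["enc_w"])
    mu = hidden @ params["enc_mu"]
    logvar = hidden @ params["enc_logvar"]

    # Reparameterisation: z = mu + sigma * eps keeps the sample a
    # deterministic function of (mu, logvar) for a given noise draw.
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # Decoder: map z to per-gene expression proportions (softmax over genes),
    # then scale by the cell's total count (assumed observed here).
    logits = np.tanh(z @ params["dec_w"]) @ params["dec_out"]
    logits = logits - logits.max(axis=1, keepdims=True)
    proportions = np.exp(logits)
    proportions = proportions / proportions.sum(axis=1, keepdims=True)
    rate = counts.sum(axis=1, keepdims=True) * proportions + 1e-8

    # Poisson log-likelihood, dropping the log(x!) term, which does not
    # depend on the parameters.
    log_lik = (counts * np.log(rate) - rate).sum(axis=1)

    # Closed-form KL divergence between N(mu, sigma^2) and the N(0, I) prior.
    kl = 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar).sum(axis=1)

    return float((log_lik - kl).mean()), mu


# Usage on synthetic counts: 8 cells, 20 genes, 3 latent dimensions.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(8, 20)).astype(float)
params = init_params(n_genes=20, n_latent=3, n_hidden=16, rng=rng)
value, z_mean = elbo_estimate(params, counts, rng)
```

In practice the gradient of this estimate with respect to the weights would be obtained by automatic differentiation and applied over random mini-batches of cells; the rows of `z_mean` are the low-dimensional representations that downstream tasks such as clustering or visualization would consume.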

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted October 14, 2018.

Subject Area

  • Bioinformatics