New Results

bayNorm: Bayesian gene expression recovery, imputation and normalisation for single cell RNA-sequencing data

Wenhao Tang, François Bertaux, Philipp Thomas, Claire Stefanelli, Malika Saint, Samuel Marguerat, Vahid Shahrezaei
doi: https://doi.org/10.1101/384586
Wenhao Tang
1 Department of Mathematics, Faculty of Natural Sciences, Imperial College, London SW7 2AZ, UK
François Bertaux
1 Department of Mathematics, Faculty of Natural Sciences, Imperial College, London SW7 2AZ, UK
2 MRC London Institute of Medical Sciences (LMS), Du Cane Road, London W12 0NN, UK
3 Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College London, Du Cane Road, London W12 0NN, UK
Philipp Thomas
1 Department of Mathematics, Faculty of Natural Sciences, Imperial College, London SW7 2AZ, UK
Claire Stefanelli
1 Department of Mathematics, Faculty of Natural Sciences, Imperial College, London SW7 2AZ, UK
Malika Saint
2 MRC London Institute of Medical Sciences (LMS), Du Cane Road, London W12 0NN, UK
3 Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College London, Du Cane Road, London W12 0NN, UK
Samuel Marguerat
2 MRC London Institute of Medical Sciences (LMS), Du Cane Road, London W12 0NN, UK
3 Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College London, Du Cane Road, London W12 0NN, UK
For correspondence: samuel.marguerat@imperial.ac.uk
Vahid Shahrezaei
1 Department of Mathematics, Faculty of Natural Sciences, Imperial College, London SW7 2AZ, UK
For correspondence: v.shahrezaei@imperial.ac.uk

Abstract

Normalisation of single cell RNA sequencing (scRNA-seq) data is a prerequisite to their interpretation. The marked technical variability and large number of missing observations typical of scRNA-seq datasets make this task particularly challenging. Here, we introduce bayNorm, a novel Bayesian approach for scaling and inference of scRNA-seq counts. The method’s likelihood function follows a binomial model of mRNA capture, while priors are estimated from expression values across cells using an empirical Bayes approach. Using publicly available scRNA-seq datasets and simulated expression data, we demonstrate that bayNorm allows robust imputation of missing values, generating realistic transcript distributions that match single molecule FISH measurements. Moreover, by using priors informed by dataset structures, bayNorm improves the accuracy and sensitivity of differential expression analysis and reduces batch effects compared to existing methods. Altogether, bayNorm provides an efficient, integrated solution for global scaling normalisation, imputation and true count recovery of gene expression measurements from scRNA-seq data.
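
The binomial capture model mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, a negative binomial prior on the true count with mean `mu` and size `phi` (the abstract does not name the prior family) and a known per-cell capture efficiency `beta`. All function and parameter names below are hypothetical and are not the bayNorm package's API. Under these assumptions the posterior of the uncaptured count, true minus observed, is again negative binomial, and the code compares that closed form with a brute-force normalisation of likelihood times prior.

```python
import math


def nb_log_pmf(k, size, prob):
    """Log PMF of a negative binomial with `size` and success probability `prob`."""
    return (math.lgamma(k + size) - math.lgamma(k + 1) - math.lgamma(size)
            + size * math.log(prob) + k * math.log1p(-prob))


def binom_log_pmf(y, x, beta):
    """Log probability of capturing y out of x molecules with efficiency beta."""
    return (math.lgamma(x + 1) - math.lgamma(y + 1) - math.lgamma(x - y + 1)
            + y * math.log(beta) + (x - y) * math.log1p(-beta))


def posterior_mean_brute_force(y, beta, mu, phi, x_max=2000):
    """Posterior mean of the true count x given observed count y.

    Enumerates x from y to x_max (a truncation chosen for this sketch)
    and normalises likelihood * prior numerically.
    """
    p = phi / (phi + mu)  # assumed NB prior: mean mu, size phi
    xs = range(y, x_max + 1)
    log_w = [binom_log_pmf(y, x, beta) + nb_log_pmf(x, phi, p) for x in xs]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return sum(x * wi for x, wi in zip(xs, w)) / total


def posterior_mean_closed_form(y, beta, mu, phi):
    """Closed-form posterior mean under the same assumptions.

    x - y follows a negative binomial with size (y + phi) and
    failure probability q = (1 - beta) * mu / (phi + mu).
    """
    q = (1.0 - beta) * mu / (phi + mu)
    return y + (y + phi) * q / (1.0 - q)


# Example values are arbitrary and chosen only to exercise the functions.
brute = posterior_mean_brute_force(3, 0.1, 40.0, 2.0)
closed = posterior_mean_closed_form(3, 0.1, 40.0, 2.0)
print(brute, closed)
```

In an empirical Bayes setting, `mu` and `phi` would be estimated per gene from the counts observed across cells rather than fixed by hand; that estimation step is not shown here.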

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted August 03, 2018.

Subject Area

  • Bioinformatics