New Results

SoupX removes ambient RNA contamination from droplet based single-cell RNA sequencing data

Matthew D Young, Sam Behjati
doi: https://doi.org/10.1101/303727
Matthew D Young
1Wellcome Trust Sanger Institute, University of Cambridge
  • For correspondence: my4@sanger.ac.uk
Sam Behjati
1Wellcome Trust Sanger Institute, University of Cambridge
2Cambridge University Hospitals NHS Foundation Trust, University of Cambridge
3Department of Paediatrics, University of Cambridge

Abstract

Background Droplet-based single-cell RNA sequencing analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs in the input solution are also captured by these assays. Sequencing this cell-free RNA introduces background contamination that confounds the biological interpretation of single-cell transcriptomic data.

Results We demonstrate that contamination from this ‘soup’ of cell-free RNAs is ubiquitous, with experiment-specific variations in composition and magnitude. We present a method, SoupX, for quantifying the extent of the contamination and estimating ‘background-corrected’ cell expression profiles that seamlessly integrate with existing downstream analysis tools. Applying this method to several datasets generated with multiple droplet sequencing technologies, we demonstrate that it improves the biological interpretation of otherwise misleading data and also improves quality control metrics.

Conclusions We present ‘SoupX’, a tool for removing ambient RNA contamination from droplet-based single-cell RNA sequencing experiments. This tool has broad applicability, and its application can improve the biological utility of existing and future datasets.

Key Points

  • The signal from droplet-based single-cell RNA sequencing is ubiquitously contaminated by capture of ambient mRNA.

  • SoupX is a method to quantify the abundance of these ambient mRNAs and remove them.

  • Correcting for ambient mRNA contamination improves biological interpretation.
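
The abstract states that SoupX quantifies ambient mRNA and removes it, but does not spell out the procedure. The toy sketch below illustrates only the general idea of background subtraction and should not be read as the SoupX algorithm. It assumes, purely for illustration, that the ambient profile can be pooled from droplets known to contain no cell, that a per-cell contamination fraction `rho` is already known, and that expected ambient counts are simply subtracted and floored at zero. The function names `ambient_profile` and `correct_cell`, the gene names, and all numbers are hypothetical.

```python
from typing import Dict, List

Counts = Dict[str, int]


def ambient_profile(empty_droplets: List[Counts]) -> Dict[str, float]:
    """Pool counts from droplets assumed to be cell-free into gene fractions."""
    totals: Dict[str, int] = {}
    for droplet in empty_droplets:
        for gene, n in droplet.items():
            totals[gene] = totals.get(gene, 0) + n
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("empty droplets contain no counts")
    return {gene: n / grand_total for gene, n in totals.items()}


def correct_cell(cell: Counts, profile: Dict[str, float], rho: float) -> Dict[str, float]:
    """Subtract the expected ambient counts from one cell, flooring at zero.

    rho is the assumed fraction of the cell's counts that come from the
    ambient background; here it is supplied by the caller, not estimated.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    n_umi = sum(cell.values())
    return {
        gene: max(0.0, n - rho * n_umi * profile.get(gene, 0.0))
        for gene, n in cell.items()
    }


# Hypothetical data: two cell-free droplets and one cell-containing droplet.
empties = [{"HBB": 8, "ACTB": 2}, {"HBB": 8, "ACTB": 2}]
cell = {"HBB": 10, "CD3E": 40, "ACTB": 50}

profile = ambient_profile(empties)
corrected = correct_cell(cell, profile, rho=0.1)
```

In this made-up example most of the cell's HBB signal is attributed to the background, while CD3E, which is absent from the ambient profile, is left unchanged. A real analysis would need to estimate the contamination fraction from the data rather than assume it.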

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 03, 2020.
Subject Area

  • Bioinformatics