bioRxiv
New Results

Panoramic stitching of heterogeneous single-cell transcriptomic data

Brian Hie, Bryan Bryson, Bonnie Berger
doi: https://doi.org/10.1101/371179
Brian Hie
(1) Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02139, USA
Bryan Bryson
(2) Department of Biological Engineering, MIT, Cambridge, MA 02139, USA
Bonnie Berger
(1) Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02139, USA
(3) Department of Mathematics, MIT, Cambridge, MA 02139, USA
For correspondence: bab@mit.edu

Abstract

Researchers are generating single-cell RNA sequencing (scRNA-seq) profiles of diverse biological systems [1–4] and every cell type in the human body [5]. Leveraging these data to gain unprecedented insight into biology and disease will require assembling heterogeneous cell populations across multiple experiments, laboratories, and technologies. Although methods for scRNA-seq data integration exist [6,7], they often naively merge data sets even when the data sets have no cell types in common, leading to results that do not correspond to real biological patterns. Here we present Scanorama, a method inspired by algorithms for panorama stitching that overcomes the limitations of existing methods to enable accurate integration of heterogeneous scRNA-seq data sets. Our strategy identifies and merges the shared cell types among all pairs of data sets and is orders of magnitude faster than existing techniques. We use Scanorama to combine 105,476 cells from 26 diverse scRNA-seq experiments across 9 different technologies into a single comprehensive reference, demonstrating how Scanorama can be used to obtain a more complete picture of cellular function across a wide range of scRNA-seq experiments.
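
The idea of merging only those pairs of data sets that share cell types can be illustrated with a small sketch. The abstract does not say how shared cell types are detected, so the version below makes an assumption: two data sets are treated as sharing cells when a sufficient fraction of their cells are mutual nearest neighbors. The function names (`mutual_nearest_pairs`, `match_fraction`, `stitchable_pairs`), the 0.1 threshold, and the toy 2-D "expression" grids are all hypothetical and chosen for illustration; this is not the Scanorama implementation.

```python
import itertools
import numpy as np

def mutual_nearest_pairs(a, b):
    """Return (i, j) pairs where a[i] and b[j] are each other's nearest neighbor."""
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    nearest_b = d.argmin(axis=1)  # nearest cell in b for each cell in a
    nearest_a = d.argmin(axis=0)  # nearest cell in a for each cell in b
    return [(i, int(j)) for i, j in enumerate(nearest_b) if nearest_a[j] == i]

def match_fraction(a, b):
    """Largest fraction of cells, in either data set, that take part in a match."""
    pairs = mutual_nearest_pairs(a, b)
    frac_a = len({i for i, _ in pairs}) / len(a)
    frac_b = len({j for _, j in pairs}) / len(b)
    return max(frac_a, frac_b)

def stitchable_pairs(datasets, threshold=0.1):
    """Index pairs of data sets whose match fraction reaches the (assumed) threshold."""
    return [(p, q)
            for p, q in itertools.combinations(range(len(datasets)), 2)
            if match_fraction(datasets[p], datasets[q]) >= threshold]

def grid(x0, y0, n=5):
    """Toy 'cell type': an n-by-n grid of points with its corner at (x0, y0)."""
    xs, ys = np.meshgrid(np.arange(n), np.arange(n))
    return np.column_stack([x0 + xs.ravel(), y0 + ys.ravel()]).astype(float)

# Data set a holds types X and Y; b holds Y (slightly shifted, mimicking a
# batch effect) and Z; c holds an unrelated type W far from everything else.
a = np.vstack([grid(0, 0), grid(20, 0)])
b = np.vstack([grid(20.1, 0.1), grid(40, 0)])
c = grid(100, 100)

print(stitchable_pairs([a, b, c]))  # [(0, 1)]
```

In this toy setup only `a` and `b` are flagged for merging, because half of their cells find mutual matches in the shared type Y. The unrelated data set `c` yields a single incidental match against each of the others, which falls below the threshold, so it would be left unmerged rather than forced onto the other two.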

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted July 17, 2018.
Subject Area

  • Bioinformatics