End Sequence Analysis Toolkit (ESAT) expands the extractable information from single-cell RNA-seq data

  Manuel Garber (1, 8)

  1. Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
  2. Program in Molecular Medicine, Diabetes Center of Excellence, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
  3. Department of System Biology, Harvard Medical School, Boston, Massachusetts 02115, USA;
  4. Institute of Biotechnology, Vilnius University, LT 02241 Vilnius, Lithuania;
  5. Computer Technologies Department, ITMO University, Saint Petersburg, 197101, Russia;
  6. Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, Missouri 63110, USA;
  7. Department of Medicine, Diabetes Center of Excellence, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
  8. Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
  9. Biological Chemistry Department, Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel
  10. These authors contributed equally to this work.

  Corresponding author: manuel.garber{at}umassmed.edu

Abstract

RNA-seq protocols that focus on transcript termini are well suited for applications in which template quantity is limiting. Here we show that, when applied to end-sequencing data, analytical methods designed for global RNA-seq produce computational artifacts. To remedy this, we created the End Sequence Analysis Toolkit (ESAT). As a test, we first compared end-sequencing and bulk RNA-seq using RNA from dendritic cells stimulated with lipopolysaccharide (LPS). As predicted by the telescripting model for transcriptional bursts, ESAT detected an LPS-stimulated shift to shorter 3′-isoforms that was not evident by conventional computational methods. Then, droplet-based microfluidics was used to generate 1000 cDNA libraries, each from an individual pancreatic islet cell. ESAT identified nine distinct cell types, three distinct β-cell types, and a complex interplay between hormone secretion and vascularization. ESAT, then, offers a much-needed and generally applicable computational pipeline for either bulk or single-cell RNA end-sequencing.

Footnotes

  • [Supplemental material is available for this article.]

  • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.207902.116.

  • Freely available online through the Genome Research Open Access option.

  • Received April 4, 2016.
  • Accepted July 27, 2016.

This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
