End Sequence Analysis Toolkit (ESAT) expands the extractable information from single-cell RNA-seq data
- Alan Derr1,
- Chaoxing Yang2,10,
- Rapolas Zilionis3,4,10,
- Alexey Sergushichev5,6,
- David M. Blodgett7,
- Sambra Redick2,
- Rita Bortell2,
- Jeremy Luban8,
- David M. Harlan7,
- Sebastian Kadener9,
- Dale L. Greiner2,8,
- Allon Klein3,
- Maxim N. Artyomov6 and
- Manuel Garber1,8
- 1Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
- 2Program in Molecular Medicine, Diabetes Center of Excellence, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
- 3Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115, USA;
- 4Institute of Biotechnology, Vilnius University, LT 02241 Vilnius, Lithuania;
- 5Computer Technologies Department, ITMO University, Saint Petersburg, 197101, Russia;
- 6Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, Missouri 63110, USA;
- 7Department of Medicine, Diabetes Center of Excellence, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
- 8Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, Massachusetts 01655, USA;
- 9Biological Chemistry Department, Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel
- Corresponding author: manuel.garber@umassmed.edu
- 10 These authors contributed equally to this work.
Abstract
RNA-seq protocols that focus on transcript termini are well suited for applications in which template quantity is limiting. Here we show that, when applied to end-sequencing data, analytical methods designed for global RNA-seq produce computational artifacts. To remedy this, we created the End Sequence Analysis Toolkit (ESAT). As a test, we first compared end-sequencing and bulk RNA-seq using RNA from dendritic cells stimulated with lipopolysaccharide (LPS). As predicted by the telescripting model for transcriptional bursts, ESAT detected an LPS-stimulated shift to shorter 3′-isoforms that was not evident by conventional computational methods. Then, droplet-based microfluidics was used to generate 1000 cDNA libraries, each from an individual pancreatic islet cell. ESAT identified nine distinct cell types, three distinct β-cell types, and a complex interplay between hormone secretion and vascularization. ESAT, then, offers a much-needed and generally applicable computational pipeline for either bulk or single-cell RNA end-sequencing.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.207902.116.
- Freely available online through the Genome Research Open Access option.
- Received April 4, 2016.
- Accepted July 27, 2016.
This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.