bioRxiv
New Results

Flexible expressed region analysis for RNA-seq with derfinder

Leonardo Collado-Torres, Abhinav Nellore, Alyssa C. Frazee, Christopher Wilks, Michael I. Love, Ben Langmead, Rafael A. Irizarry, Jeffrey T. Leek, Andrew E. Jaffe
doi: https://doi.org/10.1101/015370
Leonardo Collado-Torres
1Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
2Center for Computational Biology, Johns Hopkins University
3Lieber Institute for Brain Development, Johns Hopkins Medical Campus
Abhinav Nellore
1Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
2Center for Computational Biology, Johns Hopkins University
4Department of Computer Science, Johns Hopkins University
Alyssa C. Frazee
1Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
2Center for Computational Biology, Johns Hopkins University
Christopher Wilks
2Center for Computational Biology, Johns Hopkins University
4Department of Computer Science, Johns Hopkins University
Michael I. Love
5Department of Biostatistics, Harvard T.H. Chan School of Public Health
6Dana-Farber Cancer Institute, Harvard University
Ben Langmead
1Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
2Center for Computational Biology, Johns Hopkins University
4Department of Computer Science, Johns Hopkins University
Rafael A. Irizarry
5Department of Biostatistics, Harvard T.H. Chan School of Public Health
6Dana-Farber Cancer Institute, Harvard University
Jeffrey T. Leek
1Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
2Center for Computational Biology, Johns Hopkins University
  • For correspondence: jtleek@gmail.com
Andrew E. Jaffe
1Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
2Center for Computational Biology, Johns Hopkins University
3Lieber Institute for Brain Development, Johns Hopkins Medical Campus
7Department of Mental Health, Johns Hopkins University
  • For correspondence: andrew.jaffe@libd.org

Abstract

Background: Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single-base resolution without relying on existing annotation or potentially inaccurate transcript assembly.

Results: We present the derfinder software, which improves our annotation-agnostic approach to RNA-seq analysis by (1) implementing a computationally efficient bump-hunting approach to identify DERs, which permits genome-scale analyses in a large number of samples; (2) introducing a flexible statistical modeling framework, including multi-group and time-course analyses; and (3) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and the BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries, and can be used to visualize expressed regions at base resolution. In simulations, our base-resolution approaches enable discovery in the presence of incomplete annotation and are nearly as powerful as feature-level methods when the annotation is complete.

Conclusions: derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis.

The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.
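
The idea of finding contiguous expressed regions at base resolution, as described in the abstract, can be illustrated with a minimal sketch. This is not derfinder's implementation (derfinder is an R package). It assumes, purely for illustration, that a region is a maximal run of bases whose mean coverage across samples exceeds a fixed cutoff. The function names `mean_coverage` and `find_regions`, the cutoff value, and the toy coverage data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def mean_coverage(samples: Sequence[Sequence[float]]) -> List[float]:
    """Average per-base coverage across samples (one inner sequence per sample)."""
    n_samples = len(samples)
    return [sum(column) / n_samples for column in zip(*samples)]


def find_regions(values: Sequence[float], cutoff: float) -> List[Tuple[int, int]]:
    """Return half-open [start, end) index ranges where values stay above cutoff."""
    regions = []
    start = None
    for pos, value in enumerate(values):
        if value > cutoff:
            if start is None:
                start = pos  # a new region opens at this base
        elif start is not None:
            regions.append((start, pos))  # the region closed at the previous base
            start = None
    if start is not None:
        regions.append((start, len(values)))  # region runs to the end of the vector
    return regions


# Toy data (hypothetical): two samples, ten bases of per-base coverage each.
coverage = [
    [0, 0, 5, 6, 7, 0, 0, 3, 4, 0],
    [0, 1, 4, 6, 8, 1, 0, 2, 5, 0],
]
regions = find_regions(mean_coverage(coverage), cutoff=2.0)
print(regions)  # [(2, 5), (7, 9)]
```

No gene annotation enters the computation: regions are defined only by the coverage signal, which is what makes an approach of this kind annotation-agnostic. The same run-finding step could be applied to a vector of per-base test statistics instead of mean coverage.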

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted May 19, 2016.
Subject Area

  • Bioinformatics