Computation for ChIP-seq and RNA-seq studies

Nat Methods. 2009 Nov;6(11 Suppl):S22-32. doi: 10.1038/nmeth.1371.

Abstract

Genome-wide measurements of protein-DNA interactions and transcriptomes are increasingly done by deep DNA sequencing methods (ChIP-seq and RNA-seq). The power and richness of these counting-based measurements comes at the cost of routinely handling tens to hundreds of millions of reads. Whereas early adopters necessarily developed their own custom computer code to analyze the first ChIP-seq and RNA-seq datasets, a new generation of more sophisticated algorithms and software tools are emerging to assist in the analysis phase of these projects. Here we describe the multilayered analyses of ChIP-seq and RNA-seq datasets, discuss the software packages currently available to perform tasks at each layer and describe some upcoming challenges and features for future analysis tools. We also discuss how software choices and uses are affected by specific aspects of the underlying biology and data structure, including genome size, positional clustering of transcription factor binding sites, transcript discovery and expression quantification.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Base Sequence
  • Chromatin Immunoprecipitation / methods*
  • Gene Expression Profiling / methods*
  • RNA / analysis
  • RNA / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*

Substances

  • RNA