De novo detection of differentially bound regions for ChIP-seq data using peaks and windows: controlling error rates correctly

Nucleic Acids Res. 2014 Jun;42(11):e95. doi: 10.1093/nar/gku351. Epub 2014 May 22.

Abstract

A common aim in ChIP-seq experiments is to identify changes in protein binding patterns between conditions, i.e. differential binding. A number of peak- and window-based strategies have been developed to detect differential binding when the regions of interest are not known in advance. However, careful consideration of error control is needed when applying these methods. Peak-based approaches use the same data set to define peaks and to detect differential binding. Done improperly, this can result in loss of type I error control. For window-based methods, controlling the false discovery rate over all detected windows does not guarantee control across all detected regions. Misinterpreting the former as the latter can result in unexpected liberalness. Here, several solutions are presented to maintain error control for these de novo counting strategies. For peak-based methods, peak calling should be performed on pooled libraries prior to the statistical analysis. For window-based methods, a hybrid approach using Simes' method is proposed to maintain control of the false discovery rate across regions. More generally, the relative advantages of peak- and window-based strategies are explored using a range of simulated and real data sets. Implementations of both strategies also compare favourably to existing programs for differential binding analyses.
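As an illustration of the window-based proposal, the following is a minimal sketch of combining per-window p-values into one p-value per region with Simes' method, then applying the Benjamini-Hochberg procedure across regions so that the false discovery rate is controlled at the region level. It is not the authors' implementation. The function names (`simes`, `benjamini_hochberg`, `region_fdr`), the input layout, and the assumption that windows have already been assigned to regions are all hypothetical choices made for this example.

```python
from collections import defaultdict


def simes(pvalues):
    """Combine the p-values of one region into a single p-value (Simes' method).

    For sorted p-values p_(1) <= ... <= p_(n), this is min_i(n * p_(i) / i).
    """
    ordered = sorted(pvalues)
    n = len(ordered)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def region_fdr(window_pvalues, window_regions):
    """Map each region label to its BH-adjusted Simes p-value.

    window_pvalues: one p-value per window (hypothetical input layout).
    window_regions: the region label of each window, in the same order.
    """
    grouped = defaultdict(list)
    for p, region in zip(window_pvalues, window_regions):
        grouped[region].append(p)
    regions = sorted(grouped)
    combined = [simes(grouped[r]) for r in regions]
    return dict(zip(regions, benjamini_hochberg(combined)))


if __name__ == "__main__":
    # Five windows belonging to two regions, "A" and "B" (made-up values).
    pvals = [0.01, 0.04, 0.5, 0.2, 0.9]
    labels = ["A", "A", "A", "B", "B"]
    print(region_fdr(pvals, labels))
```

The adjustment is applied to one combined p-value per region rather than to the individual windows, which is the distinction the abstract draws between controlling the false discovery rate over windows and over regions.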

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Chromatin Immunoprecipitation / methods*
  • DNA-Binding Proteins / metabolism*
  • Histones / metabolism
  • Sequence Analysis, DNA / methods*
  • Software
  • Transcription Factors / metabolism

Substances

  • DNA-Binding Proteins
  • Histones
  • Transcription Factors