PT - JOURNAL ARTICLE
AU - Georg Stricker
AU - Alexander Engelhardt
AU - Daniel Schulz
AU - Matthias Schmid
AU - Achim Tresch
AU - Julien Gagneur
TI - GenoGAM: Genome-wide generalized additive models for ChIP-seq analysis
AID - 10.1101/047464
DP - 2017 Jan 01
TA - bioRxiv
PG - 047464
4099 - http://biorxiv.org/content/early/2017/02/16/047464.short
4100 - http://biorxiv.org/content/early/2017/02/16/047464.full
AB - Motivation: Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) is a widely used approach to study protein-DNA interactions. Often, the quantities of interest are the differential occupancies relative to controls, between genetic backgrounds, treatments, or combinations thereof. Current methods for differential occupancy of ChIP-seq data, however, rely on binning or sliding window techniques, for which the choice of window and bin sizes is subjective. Results: Here, we present GenoGAM (Genome-wide Generalized Additive Model), which brings the well-established and flexible generalized additive models framework to genomic applications using a data parallelism strategy. We model ChIP-Seq read count frequencies as products of smooth functions along chromosomes. Smoothing parameters are objectively estimated from the data by cross-validation, eliminating ad-hoc binning and windowing needed by current approaches. GenoGAM provides base-level and region-level significance testing for full factorial designs. Application to a ChIP-Seq dataset in yeast showed increased sensitivity over existing differential occupancy methods while controlling for type I error rate.
By analyzing a set of DNA methylation data and illustrating an extension to a peak caller, we further demonstrate the potential of GenoGAM as a generic statistical modeling tool for genome-wide assays. Availability: Software is available from Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/GenoGAM.html Contact: gagneur{at}in.tum.de Supplementary information: Supplementary information is available at Bioinformatics online.