TY - JOUR
T1 - Crunch: Completely Automated Analysis of ChIP-seq Data
JF - bioRxiv
DO - 10.1101/042903
SP - 042903
AU - Severin Berger
AU - Saeed Omidi
AU - Mikhail Pachkov
AU - Phil Arnold
AU - Nicholas Kelley
AU - Silvia Salatino
AU - Erik van Nimwegen
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/03/09/042903.abstract
N2 - Today experimental groups routinely apply ChIP-seq technology to quantitatively characterize the genome-wide binding patterns of any molecule associated with the DNA. Here we present Crunch, a completely automated procedure for ChIP-seq data analysis, starting from raw read quality control, through read mapping, peak detection and annotation, and including comprehensive DNA sequence motif analysis. Among Crunch's novel features are a Bayesian mixture model that automatically fits a noise model and infers significantly enriched genomic regions in parallel, as well as a Gaussian mixture model for decomposing enriched regions into individual binding peaks. Moreover, Crunch uses a combination of de novo motif finding with binding site prediction for a large collection of known regulatory motifs to model the observed ChIP-seq signal in terms of novel and known regulatory motifs, extensively characterizing the contribution of each motif to explaining the ChIP-seq signal, and annotating which combinations of motifs occur in each binding peak.
To make Crunch easily available to all researchers, including those without bioinformatics expertise, Crunch has been implemented as a web server (crunch.unibas.ch) that only requires users to upload their raw sequencing data, providing all results within an interactive graphical web interface. To demonstrate Crunch's power we apply it to a collection of 128 ChIP-seq data-sets from the ENCODE project, showing that Crunch's de novo motifs often outperform existing motifs in explaining the ChIP-seq signal, and that Crunch successfully identifies binding partners of the proteins that were immuno-precipitated.
ER -