PT - JOURNAL ARTICLE AU - Jens Keilwagen AU - Stefan Posch AU - Jan Grau TI - Learning from mistakes: Accurate prediction of cell type-specific transcription factor binding AID - 10.1101/230011 DP - 2018 Jan 01 TA - bioRxiv PG - 230011 4099 - http://biorxiv.org/content/early/2018/06/12/230011.short 4100 - http://biorxiv.org/content/early/2018/06/12/230011.full AB - Computational prediction of cell type-specific, in-vivo transcription factor binding sites is still one of the central challenges in regulatory genomics, and a variety of approaches has been proposed for this purpose.Here, we present our approach that earned a shared first rank in the “ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge” in 2017. This approach employs features derived from chromatin accessibility, binding motifs, gene expression, genomic sequence and annotation to train classifiers using a supervised, discriminative learning principle. Two further key aspects of this approach are learning classifier parameters in an iterative training procedure that successively adds additional negative examples to the training set, and creating an ensemble prediction by averaging over classifiers obtained for different training cell types.In post-challenge analyses, we benchmark the influence of different feature sets and find that chromatin accessiblity and binding motifs are sufficient to yield state-of-the-art performance for in-vivo binding site predictions. We also show that the iterative training procedure and the ensemble prediction are pivotal for the final prediction performance.To make predictions of this approach readily accessible, we predict 682 peak lists for a total of 31 transcription factors in 22 primary cell types and tissues, which are available for download at https://www.synapse.org/#!Synapse:syn11526239, and we demonstrate that these may help to yield biological conclusions. Finally, we provide a user-friendly version of our approach as open source software at http://jstacs.de/index.php/Catchitt.Contact grau{at}informatik.uni-halle.de