PT - JOURNAL ARTICLE
AU - Dexiong Chen
AU - Laurent Jacob
AU - Julien Mairal
TI - Predicting Transcription Factor Binding Sites with Convolutional Kernel Networks
AID - 10.1101/217257
DP - 2017 Jan 01
TA - bioRxiv
PG - 217257
4099 - http://biorxiv.org/content/early/2017/11/10/217257.short
4100 - http://biorxiv.org/content/early/2017/11/10/217257.full
AB - The growing amount of biological sequences available makes it possible to learn genotype-phenotype relationships from data with increasingly high accuracy. By exploiting large sets of sequences with known phenotypes, machine learning methods can be used to build functions that predict the phenotype of new, unannotated sequences. In particular, deep neural networks have recently obtained good performance on such prediction tasks, but are notoriously difficult to analyze or interpret. Here, we introduce a hybrid approach between kernel methods and convolutional neural networks for sequences, which retains the ability of neural networks to learn good representations for the learning problem at hand, while defining a well-characterized Hilbert space to describe prediction functions. Our method outperforms state-of-the-art convolutional neural networks on a transcription factor binding prediction task while being much faster to train and yielding more stable and interpretable results. Source code is freely available at https://gitlab.inria.fr/dchen/CKN-seq.