TY - JOUR T1 - Enhancer Identification using Transfer and Adversarial Deep Learning of DNA Sequences JF - bioRxiv DO - 10.1101/264200 SP - 264200 AU - Dikla Cohn AU - Or Zuk AU - Tommy Kaplan Y1 - 2018/01/01 UR - http://biorxiv.org/content/early/2018/02/14/264200.abstract N2 - Enhancer sequences regulate the expression of genes from afar by providing a binding platform for transcription factors, often in a tissue-specific or context-specific manner. Despite their importance in health and disease, our understanding of these DNA sequences, and their regulatory grammar, is limited. This impairs our ability to identify new enhancers along the genome, or to understand the effect of enhancer mutations and their role in genetic diseases.We trained deep Convolutional Neural Networks (CNN) to identify enhancer sequences in multiple species. We used multiple biological datasets, including simulated sequences, in vivo binding data of single transcription factors and genome-wide chromatin maps of active enhancers in 17 mammalian species. Our deep networks obtained high classification accuracy by combining two training strategies: First, training on enhancers vs. non-enhancer background sequences, we identified short (1-4bp) low-complexity motifs. Second, by replacing the negative training set by adversarial k-order random shuffles of enhancer sequences (thus maintaining base composition while shuttering longer motifs, including transcription factor binding sites), we identified a set of biologically meaningful motifs, unique to enhancers. In addition, classification performance improved when combining positive data from all species together, showing a shared mammalian regulatory architecture.Our results demonstrate that design of adversarial training data, and transfer of learned parameters between networks trained on different species/datasets improve the overall performance and capture biologically meaningful information in the parameters of the learned network.Contact: or.zuk{at}mail.huji.ac.il, tommy{at}cs.huji.ac.il ER -