PT - JOURNAL ARTICLE
AU - Gherman Novakovsky
AU - Oriol Fornes
AU - Manu Saraswat
AU - Sara Mostafavi
AU - Wyeth W. Wasserman
TI - ExplaiNN: interpretable and transparent neural networks for genomics
AID - 10.1101/2022.05.20.492818
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.05.20.492818
4099 - http://biorxiv.org/content/early/2022/05/22/2022.05.20.492818.short
4100 - http://biorxiv.org/content/early/2022/05/22/2022.05.20.492818.full
AB - Sequence-based deep learning models, particularly convolutional neural networks (CNNs), have shown superior performance on a wide range of genomic tasks. A key limitation of these models is their lack of interpretability, which slows their broad adoption by the genomics community. Current approaches to model interpretation do not readily reveal how a model makes predictions, can be computationally intensive, and depend on the implemented architecture. Here, we introduce ExplaiNN, an adaptation of neural additive models [1] for genomic tasks wherein predictions are computed as a linear combination of multiple independent CNNs, each consisting of a single convolutional filter and fully connected layers. This approach brings together the expressivity of CNNs with the interpretability of linear models, providing global (cell state level) as well as local (individual sequence level) insights into the biological processes studied. We use ExplaiNN to predict transcription factor (TF) binding and chromatin accessibility states, demonstrating performance levels comparable to state-of-the-art methods, while providing a transparent view of the model’s predictions in a straightforward manner. Applied to de novo motif discovery, ExplaiNN detects motifs equivalent to those obtained from specialized algorithms across a range of datasets. Finally, we present ExplaiNN as a plug-and-play platform in which pre-trained TF binding models and annotated position weight matrices from reference databases can be combined in a simple framework.
We expect that ExplaiNN will accelerate the adoption of deep learning by biological domain experts in their daily genomic sequence analyses.
Competing Interest Statement: The authors have declared no competing interest.
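
The abstract describes predictions as a linear combination of independent units, each made of a single convolutional filter followed by fully connected layers. The sketch below illustrates only that additive structure. It is not the authors' implementation: the class names (`Unit`, `AdditiveModel`), the layer sizes, the random initialisation, the use of global max pooling, and the sigmoid output are all assumptions made for illustration. It uses only the Python standard library and performs a forward pass with untrained weights.

```python
import math
import random

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats (A, C, G, T); other symbols become all zeros."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


class Unit:
    """One independent unit: a single convolutional filter plus a small fully connected network.

    Hypothetical layout chosen for illustration, not taken from the paper.
    """

    def __init__(self, filter_width, hidden_size, rng):
        self.filter = [[rng.gauss(0.0, 0.1) for _ in ALPHABET] for _ in range(filter_width)]
        self.filter_bias = 0.0
        self.w_hidden = [rng.gauss(0.0, 0.1) for _ in range(hidden_size)]
        self.b_hidden = [0.0] * hidden_size
        self.w_out = [rng.gauss(0.0, 0.1) for _ in range(hidden_size)]
        self.b_out = 0.0

    def forward(self, x):
        width = len(self.filter)
        if len(x) < width:
            raise ValueError("sequence is shorter than the filter")
        # Slide the single filter along the sequence and apply ReLU at each position.
        activations = []
        for start in range(len(x) - width + 1):
            score = self.filter_bias
            for offset in range(width):
                for channel in range(len(ALPHABET)):
                    score += self.filter[offset][channel] * x[start + offset][channel]
            activations.append(max(0.0, score))
        # Assumption: global max pooling reduces the scan to one number per unit.
        pooled = max(activations)
        # Small fully connected network: one hidden ReLU layer, then a scalar output.
        hidden = [max(0.0, w * pooled + b) for w, b in zip(self.w_hidden, self.b_hidden)]
        return sum(w * h for w, h in zip(self.w_out, hidden)) + self.b_out


class AdditiveModel:
    """Final prediction is a linear combination of the independent unit outputs."""

    def __init__(self, num_units, filter_width, hidden_size, seed=0):
        rng = random.Random(seed)
        self.units = [Unit(filter_width, hidden_size, rng) for _ in range(num_units)]
        self.weights = [rng.gauss(0.0, 0.1) for _ in range(num_units)]
        self.bias = 0.0

    def contributions(self, seq):
        """Per-unit contribution (weight times unit output) for one sequence."""
        x = one_hot(seq)
        return [w * unit.forward(x) for w, unit in zip(self.weights, self.units)]

    def predict(self, seq):
        """Sigmoid of the bias plus the summed unit contributions."""
        logit = self.bias + sum(self.contributions(seq))
        return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    model = AdditiveModel(num_units=3, filter_width=5, hidden_size=4, seed=1)
    sequence = "ACGTACGTACGTTTGA"
    print("per-unit contributions:", model.contributions(sequence))
    print("prediction:", model.predict(sequence))
```

Because each unit contributes one additive term, the output weights can be read per unit across a dataset, and the per-sequence contributions can be read for a single input. This is one way to picture the global and local views the abstract mentions, under the assumptions stated above.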