RT Journal Article
SR Electronic
T1 Quick and effective approximation of in silico saturation mutagenesis experiments with first-order Taylor expansion
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2023.11.10.566588
DO 10.1101/2023.11.10.566588
A1 Sasse, Alexander
A1 Chikina, Maria
A1 Mostafavi, Sara
YR 2023
UL http://biorxiv.org/content/early/2023/11/14/2023.11.10.566588.abstract
AB To understand the decision process of genomic sequence-to-function models, various explainable AI algorithms have been proposed. These methods determine the importance of each nucleotide in a given input sequence to the model’s predictions, and enable discovery of cis regulatory motif grammar for gene regulation. The most commonly applied method is in silico saturation mutagenesis (ISM) because its per-nucleotide importance scores can be intuitively understood as the computational counterpart to in vivo saturation mutagenesis experiments. While ISM is highly interpretable, it is computationally challenging to perform, because it requires computing three forward passes for every nucleotide in the given input sequence; these computations add up when analyzing a large number of sequences, and become prohibitive as the length of the input sequences and size of the model grows. Here, we show how to use the first-order Taylor approximation for ISM, which reduces its computation cost to a single forward pass for an input sequence, placing its scalability on equal footing with gradient-based approximation methods such as “gradient-times-input”. We show that the Taylor ISM (TISM) approximation is robust across different model ablations, random initializations, training parameters, and data set sizes. We use our theoretical derivation to connect ISM with the gradient values and show how this approximation is related to a recently suggested correction of the model’s gradients. Competing Interest Statement: The authors have declared no competing interest.
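
The contrast the abstract draws between exact ISM and its first-order Taylor approximation can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-hot encoded sequence `x` of shape (length, 4) and uses a hypothetical toy model (`toy_model`, with an analytic gradient in `toy_gradient`) in place of a trained sequence-to-function network; all function and variable names are illustrative. Expanding the model to first order around `x` gives, for a substitution to base b at position l, an approximate effect of grad[l, b] - grad[l, b_ref], where b_ref is the reference base at that position.

```python
import numpy as np

def toy_model(x, W, V):
    """Hypothetical scalar model: a linear term plus a small quadratic term."""
    return float(np.sum(W * x) + 0.5 * np.sum(V * x) ** 2)

def toy_gradient(x, W, V):
    """Analytic gradient of toy_model with respect to x (same shape as x)."""
    return W + np.sum(V * x) * V

def ism(x, W, V):
    """Exact ISM: one model evaluation per alternative base (3 per position)."""
    ref = toy_model(x, W, V)
    scores = np.zeros_like(x, dtype=float)
    for pos in range(x.shape[0]):
        for base in range(x.shape[1]):
            if x[pos, base] == 1:
                continue  # reference base: effect is zero by definition
            mutated = x.copy()
            mutated[pos] = 0
            mutated[pos, base] = 1
            scores[pos, base] = toy_model(mutated, W, V) - ref
    return scores

def tism(x, W, V):
    """First-order Taylor approximation of ISM from one gradient evaluation."""
    grad = toy_gradient(x, W, V)
    # Gradient at the reference base of each position, shape (length, 1).
    ref_grad = np.sum(grad * x, axis=1, keepdims=True)
    return grad - ref_grad

# Illustrative random inputs (assumed values, not data from the paper).
rng = np.random.default_rng(0)
L = 20
x = np.eye(4)[rng.integers(0, 4, size=L)]   # one-hot sequence, shape (L, 4)
W = rng.normal(size=(L, 4))
V = 0.05 * rng.normal(size=(L, 4))

exact = ism(x, W, V)
approx = tism(x, W, V)
print("max |ISM - TISM|:", np.max(np.abs(exact - approx)))
```

In this toy setting the approximation is exact when the model is linear (`V` set to zero) and deviates only through the quadratic term otherwise. With a real network, the gradient would come from automatic differentiation rather than a hand-written formula.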