ABSTRACT
Post-hoc attribution methods are widely applied to provide insights into the patterns learned by deep neural networks (DNNs). Despite the success of these methods in regulatory genomics, DNNs can learn arbitrary function behavior outside the probabilistic simplex that defines one-hot encoded DNA, and this off-simplex behavior is unconstrained by the training data. Gradients evaluated at valid one-hot sequences therefore contain a random component that manifests as noise in attribution scores. Here we demonstrate the pervasiveness of off-simplex gradient noise for genomic DNNs and introduce a statistical correction that effectively improves the interpretability of attribution methods.
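The abstract does not spell out the form of the correction; as a minimal sketch, assuming it removes the gradient component orthogonal to the simplex hyperplane (which, for one-hot DNA, amounts to subtracting the per-position mean across the four nucleotide channels), the idea can be expressed as follows. The function name and array shapes are illustrative, not taken from the paper.

import numpy as np

def correct_gradients(grads: np.ndarray) -> np.ndarray:
    """Remove the off-simplex gradient component at each sequence position.

    One-hot DNA lies on the probability simplex in R^4, whose ambient
    hyperplane satisfies sum(x) = 1. The gradient component along the
    all-ones direction is orthogonal to that hyperplane and is not
    constrained by training data, so it is removed by subtracting the
    per-position mean across the four nucleotide channels.

    grads : array of shape (L, 4), input gradients (e.g., saliency
            scores) for a sequence of length L.
    """
    return grads - grads.mean(axis=-1, keepdims=True)

# Toy usage: random "gradients" for a length-5 sequence.
rng = np.random.default_rng(0)
raw = rng.normal(size=(5, 4))
corrected = correct_gradients(raw)
# After correction, each position's scores sum to zero, i.e. the
# component orthogonal to the simplex hyperplane has been removed.
assert np.allclose(corrected.sum(axis=-1), 0.0)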
Competing Interest Statement
The authors have declared no competing interest.