RT Journal Article
SR Electronic
T1 Augmented base pairing networks encode RNA-small molecule binding preferences
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 701326
DO 10.1101/701326
A1 Carlos Oliver
A1 Vincent Mallet
A1 Roman Sarrazin Gendron
A1 Vladimir Reinharz
A1 William L. Hamilton
A1 Nicolas Moitessier
A1 Jérôme Waldispühl
YR 2020
UL http://biorxiv.org/content/early/2020/02/01/701326.abstract
AB Motivation: The binding of small molecules to RNAs is an important mechanism that can stabilize 3D structures or activate key molecular functions. To date, computational and experimental efforts toward small molecule binding prediction have primarily focused on protein targets. Considering that a very large portion of the genome is transcribed into non-coding RNAs but only a few regions are translated into proteins, successful annotation of RNA elements targeted by small molecules would likely uncover a vast repertoire of biological pathways and possibly lead to new therapeutic avenues. Results: Our work is a first attempt at bringing machine learning approaches to the problem of RNA drug discovery. RNAmigos takes advantage of the unique structural properties of RNA to predict small molecule ligands for unseen binding sites. A key feature of our model is an efficient representation of binding sites as augmented base pairing networks (ABPNs) aimed at encoding important structural patterns. We subject our ligand predictions to two virtual screen settings and show that we are able to rank the known ligand on average in the 73rd percentile, a significant improvement over several baselines. Furthermore, we observe that graphs augmented with non-Watson-Crick (a.k.a. non-canonical) base pairs are the only representation able to retrieve a significant signal, suggesting that non-canonical interactions are a necessary source of binding specificity in RNAs.
We also find that an auxiliary graph representation task significantly boosts performance by providing efficient structural embeddings to the low-data setting of ligand prediction. RNAmigos shows that RNA binding data contains structural patterns with potential for drug discovery, and provides methodological insights that can be applied to other structure-function learning tasks. Availability: Code and data are freely available at http://csb.cs.mcgill.ca/RNAmigos. Contact: jerome{at}cs.mcgill.ca
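
The abstract reports that the known ligand is ranked, on average, in the 73rd percentile of a virtual screen. As a rough illustration of what such a figure measures, the sketch below computes a percentile rank for one binding site. It is not the RNAmigos implementation: the function name `percentile_rank`, the toy scores, the ligand names, and the convention that higher scores are better and ties do not count as worse are all assumptions made for this example.

```python
def percentile_rank(scores, known_ligand):
    """Return the fraction of other candidates scoring strictly worse
    than the known ligand (1.0 = ranked first, 0.0 = ranked last).

    Assumption: higher score means a better predicted match.
    """
    known = scores[known_ligand]
    others = [s for name, s in scores.items() if name != known_ligand]
    if not others:
        raise ValueError("the screen needs at least one decoy ligand")
    worse = sum(1 for s in others if s < known)
    return worse / len(others)


# Hypothetical screen for a single binding site: one native ligand, four decoys.
screen = {
    "native": 0.82,
    "decoy_a": 0.91,
    "decoy_b": 0.40,
    "decoy_c": 0.15,
    "decoy_d": 0.77,
}

rank = percentile_rank(screen, "native")
print(f"known ligand percentile rank: {rank:.2f}")  # prints 0.75
```

Averaging this quantity over many held-out binding sites gives a single summary number of the kind quoted in the abstract; a method that ranks ligands at random would be expected to average about 0.5.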