ABSTRACT
Ribonucleic acids (RNAs) play crucial roles in living organisms, as they are involved in key processes necessary for proper cell functioning. Some RNA molecules, such as bacterial ribosomes and precursor messenger RNA, are targets of small-molecule drugs, while others, e.g., bacterial riboswitches or viral RNA motifs, are considered potential therapeutic targets. Thus, the continuous discovery of new functional RNAs increases the demand for compounds that target them and for methods for analyzing RNA–small molecule interactions. We recently developed fingeRNAt, a software tool for detecting non-covalent bonds formed within complexes of nucleic acids with different types of ligands. The program detects several types of non-covalent interactions, such as hydrogen and halogen bonds, ionic, Pi, inorganic ion- and water-mediated, and lipophilic interactions, and encodes them as a computation-friendly Structural Interaction Fingerprint (SIFt). Here we present the application of SIFts, combined with machine learning methods, to predicting the binding of small molecules to RNA targets. We show that SIFt-based models outperform classic, general-purpose scoring functions in virtual screening. We also discuss how Explainable Artificial Intelligence aids the analysis of binding prediction models by elucidating their decision-making process and helping to decipher molecular recognition processes.
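To make the idea of a SIFt concrete, the following minimal Python sketch encodes a set of detected contacts as a flat binary vector with one bit per residue and interaction type. It is an illustration only and does not reproduce fingeRNAt's interface or output format: the interaction labels, the residue identifiers, and the function `encode_sift` are hypothetical names chosen for this example.

```python
from typing import Iterable, List, Tuple

# Hypothetical labels for the interaction classes named in the abstract;
# these are NOT fingeRNAt's own column names.
INTERACTION_TYPES = (
    "hydrogen_bond", "halogen_bond", "ionic", "pi",
    "ion_mediated", "water_mediated", "lipophilic",
)


def encode_sift(residues: List[str],
                contacts: Iterable[Tuple[str, str]]) -> List[int]:
    """Return a 0/1 vector with one bit per (residue, interaction type)."""
    n_types = len(INTERACTION_TYPES)
    # Map every (residue, interaction type) pair to a fixed bit position.
    index = {
        (res, itype): i * n_types + j
        for i, res in enumerate(residues)
        for j, itype in enumerate(INTERACTION_TYPES)
    }
    bits = [0] * len(index)
    for res, itype in contacts:
        bits[index[(res, itype)]] = 1
    return bits


# Toy input (made up for illustration): three nucleotides, three contacts.
residues = ["A:G5", "A:C6", "A:U7"]
contacts = [("A:G5", "hydrogen_bond"), ("A:U7", "pi"), ("A:U7", "hydrogen_bond")]
fingerprint = encode_sift(residues, contacts)
```

Because every ligand docked to the same target yields a vector of the same length and bit order, such vectors can be fed directly to standard machine learning models as feature rows.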
Key Points
Structural Interaction Fingerprints (SIFts), combined with machine learning, were successfully used to develop activity models for ligands binding to RNA.
SIFt-based models outperformed the classic, general-purpose scoring functions in virtual screening.
Explainable Artificial Intelligence allowed us to understand the decision-making process and decipher molecular recognition processes in the analysis of RNA–ligand binding activity models.
We provide a benchmark dataset based on ligands with known or putative binding activity toward six RNA targets. It can be readily used by the scientific community to test new virtual screening algorithms on RNA–ligand complexes.
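As a sketch of how two scoring approaches could be compared on such a benchmark, the snippet below computes the area under the ROC curve from ranked scores. ROC AUC is a common virtual screening metric, but it is an assumption here and not necessarily the metric used in this study. The function `roc_auc` and all scores and labels are hypothetical toy values, not results from the paper.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random active outscores a random inactive.

    Higher scores are assumed to mean "more likely a binder";
    ties between an active and an inactive count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive ligand")
    wins = 0.0
    for a in actives:
        for d in inactives:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Toy data, invented for illustration: 1 = active ligand, 0 = inactive ligand.
labels = [1, 1, 0, 0, 0]
scores_model_a = [0.9, 0.4, 0.5, 0.2, 0.1]  # hypothetical model scores
scores_model_b = [0.3, 0.6, 0.7, 0.2, 0.5]  # hypothetical baseline scores

auc_a = roc_auc(scores_model_a, labels)
auc_b = roc_auc(scores_model_b, labels)
```

A value near 1.0 means that actives are ranked ahead of inactives, while 0.5 corresponds to random ranking. Scoring functions that report lower-is-better energies would need their sign flipped before being passed to this function.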
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
About Authors:
Natalia A. Szulc is a Ph.D. student at the International Institute of Molecular and Cell Biology in Warsaw. She holds M.Sc. degrees in Molecular Biotechnology and Computational Engineering. She is interested in RNA–ligand interactions in the pharmacological context.
Zuzanna Mackiewicz is a Ph.D. student at the International Institute of Molecular and Cell Biology in Warsaw. Her work is focused on studying cytoplasmic polyadenylation using both bioinformatics and molecular biology approaches.
Janusz M. Bujnicki is the head of the Laboratory of Bioinformatics and Protein Engineering at the International Institute of Molecular and Cell Biology in Warsaw. He is interested in understanding how the sequence of biopolymers defines their structure and interactions with other molecules. His team is currently focusing on the structural biology of RNA.
Filip Stefaniak is a senior researcher at the International Institute of Molecular and Cell Biology in Warsaw. His research interests include data-driven modeling of interactions of RNA with small molecule ligands.