An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics

Abstract
Accurately modeling and predicting RNA biology has been a long-standing challenge, bearing significant clinical ramifications for variant interpretation and the formulation of tailored therapeutics. We describe a foundation model for RNA biology, “BigRNA”, which was trained on thousands of genome-matched datasets to predict tissue-specific RNA expression, splicing, microRNA sites, and RNA-binding protein specificity from DNA sequence. Unlike approaches that are restricted to missense variants, BigRNA can identify pathogenic non-coding variant effects across diverse mechanisms, including polyadenylation, exon skipping, and intron retention. BigRNA accurately predicted the effects of steric blocking oligonucleotides (SBOs) on increasing the expression of 4 out of 4 genes, and on splicing for 18 out of 18 exons across 14 genes, including those involved in Wilson disease and spinal muscular atrophy. We anticipate that BigRNA and foundation models like it will have widespread applications in the field of personalized RNA therapeutics.
Competing Interest Statement
All listed authors are present or past employees of Deep Genomics Inc. This study received funding from Deep Genomics in the form of salary support and covering of computational costs. The founder was involved in the decision to submit for publication.