Abstract
RNA-based medicines and RNA-targeting drugs are emerging as promising new approaches for treating disease. Optimizing these therapeutics by naive experimental screening is time-consuming and expensive, while rational design requires an accurate understanding of the structure and function of RNA. To address this design challenge, we present ATOM-1, the first RNA foundation model trained on chemical mapping data, enabled by data collection strategies developed specifically for machine learning. Using small probe neural networks on top of ATOM-1 embeddings, we demonstrate that this model has developed rich internal representations of RNA. Trained on limited amounts of additional data, these small networks achieve state-of-the-art accuracy on key RNA prediction tasks, suggesting that this approach can enable the design of therapies across the RNA landscape.
Competing Interest Statement
All authors are current or former employees of Atomic AI. There is a pending patent application in relation to this work.