Abstract
In oocytes of mammals and other animals, gene regulation is mediated primarily through changes in poly(A)-tail length (refs. 1–9). Here, we introduce PAL-AI, an integrated neural network machine-learning model that accurately predicts tail-length changes in maturing oocytes of frogs and mammals. We show that PAL-AI learned known and previously unknown sequence elements and their contextual features that control poly(A)-tail length, enabling it to predict tail-length changes resulting from 3ʹ-UTR single-nucleotide substitutions. It also predicted tail-length-mediated translational changes, allowing us to nominate genes important for oocyte maturation. When comparing predicted tail-length changes in human oocytes with genomic datasets of the All of Us Research Program (ref. 10) and gnomAD (ref. 11), we found that genetic variants predicted to disrupt tail lengthening are under negative selection in the human population, thereby linking mRNA tail lengthening to human female fertility.
Competing Interest Statement
The authors have declared no competing interest.
Data availability
All standard sequencing data are available in the Gene Expression Omnibus under the accession number GSE280422. Raw intensity data for reporter mRNA tail-length sequencing cannot be deposited in public databases because of their large size and are available upon request.
Oligo sequences used in this study are listed in Supplementary Table 1. Sequences of the oligo library used for the single-nucleotide mutagenesis library are listed in Supplementary Table 2.
Other publicly available data analyzed in this study are indicated in the relevant sections of the Methods.