Abstract
Long noncoding RNAs (lncRNAs) are crucial factors in plant development and environmental responses. However, high-throughput and accurate methods for identifying lncRNAs in plants are still lacking. To build an accurate atlas of lncRNAs in cotton, we combined isoform sequencing (Iso-seq), strand-specific RNA-seq (ssRNA-seq), cap analysis of gene expression (CAGE-seq), and PolyA-seq, and developed a pipeline named plant full-length lncRNA (PULL) to integrate these multi-omics data. A total of 9240 lncRNAs were identified from 21 tissue samples of the diploid cotton Gossypium arboreum. We revealed that alternative usage of the transcription start site (TSS) and transcription end site (TES) of lncRNAs occurs pervasively during plant growth and in responses to stress. We identified lncRNAs that are co-expressed with or linked to protein-coding genes (PCGs) or to SNPs from genome-wide association studies (GWAS) associated with ovule and fiber development. We also mapped the genome-wide binding sites of two lncRNAs using chromatin isolation by RNA purification sequencing (ChIRP-seq) and validated the trans-acting transcriptional regulation by lnc-Ga13g0352 with a virus-induced gene silencing (VIGS) assay. These findings provide valuable resources for the plant research community and broaden our understanding of the biogenesis and regulatory functions of plant lncRNAs.
One-sentence summary: The full-length annotation and transcriptional regulation of long noncoding RNAs in cotton.
Footnotes
1 This work was supported by the following grants: the National Program on Research and Development of Transgenic Plants (2016ZX08009003-004) and the National Natural Science Foundation of China (31770310 and 31711530706) to K.W.; ‘One Thousand’ Youth Talent Program to Y.Z.; and Innovation Team Program from Wuhan University to Y.Z. and K.W. (2042017kf0233).
3 Senior author.