RT Journal Article
SR Electronic
T1 Integrated annotations and analyses of small RNA-producing loci from 47 diverse plants
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 756858
DO 10.1101/756858
A1 Alice Lunardon
A1 Nathan R. Johnson
A1 Emily Hagerott
A1 Tamia Phifer
A1 Seth Polydore
A1 Ceyda Coruh
A1 Michael J. Axtell
YR 2019
UL http://biorxiv.org/content/early/2019/09/04/756858.abstract
AB Plant endogenous small RNAs (sRNAs) are important regulators of gene expression. There are two broad categories of plant sRNAs: microRNAs (miRNAs) and endogenous short interfering RNAs (siRNAs). MicroRNA loci are relatively well-annotated but comprise only a small minority of the total sRNA pool; siRNA locus annotations have lagged far behind. Here, we used a large dataset of published and newly generated sRNA sequencing data (1,333 sRNA-seq libraries containing over 20 billion reads) and a uniform bioinformatic pipeline to produce comprehensive sRNA locus annotations of 47 diverse plants, yielding over 2.7 million sRNA loci. The two most numerous classes of siRNA loci produced mainly 24 nucleotide and 21 nucleotide siRNAs, respectively. 24 nucleotide-dominated siRNA loci usually occurred in intergenic regions, especially at the 5’-flanking regions of protein-coding genes. In contrast, 21 nucleotide-dominated siRNA loci were most often derived from double-stranded RNA precursors copied from spliced mRNAs. Genic 21 nucleotide-dominated loci were especially common from disease resistance genes, including from a large number of monocots. Individual siRNA sequences of all types showed very little conservation across species, while mature miRNAs were more likely to be conserved. We developed a web server where our data and several search and analysis tools are freely accessible at http://plantsmallrnagenes.science.psu.edu.