RT Journal Article
SR Electronic
T1 Precursor peptide-targeted mining of more than one hundred thousand genomes expands the lanthipeptide natural product family
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.03.13.990614
DO 10.1101/2020.03.13.990614
A1 Mark C. Walker
A1 Douglas A. Mitchell
A1 Wilfred A. van der Donk
YR 2020
UL http://biorxiv.org/content/early/2020/03/15/2020.03.13.990614.abstract
AB Background: Lanthipeptides belong to the ribosomally synthesized and post-translationally modified peptide group of natural products and have a variety of biological activities, ranging from antibiotic to antinociceptive. These peptides are cyclized through thioether crosslinks and can bear other secondary post-translational modifications. While lanthipeptide biosynthetic gene clusters can be identified by the presence of characteristic enzymes involved in the post-translational modification of these peptides, locating the precursor peptides encoded within these clusters is challenging due to their short length and high sequence variability, which limits the high-throughput exploration of lanthipeptide precursor peptides. To address this challenge, we enhanced the predictive capabilities of Rapid ORF Description & Evaluation Online (RODEO) to identify all known classes of lanthipeptides. Results: Using RODEO, we mined over 100,000 bacterial and archaeal genomes in the RefSeq database. We identified nearly 8,500 lanthipeptide precursor peptides. These precursor peptides were identified in a broad range of bacterial phyla as well as the Euryarchaeota phylum of archaea. Bacteroidetes were found to encode a large number of these biosynthetic gene clusters, despite making up a relatively small portion of the genomes in this dataset. While a number of these precursor peptides are similar to those of previously characterized lanthipeptides, even more are not, including potential antibiotics.
Additionally, examination of the biosynthetic gene clusters revealed that enzymes that install secondary post-translational modifications are more widespread than initially thought. Conclusion: Lanthipeptide biosynthetic gene clusters are more widely distributed and the precursor peptides encoded within these clusters are more diverse than previously appreciated, demonstrating that the lanthipeptide sequence-function space remains largely underexplored. Abbreviations: BGC, biosynthetic gene cluster; HMM, hidden Markov model; LanA, lanthipeptide precursor peptide; LanB, class I lanthipeptide dehydratase; LanC, class I lanthipeptide cyclase; LanKC, class III lanthipeptide synthetase; LanL, class IV lanthipeptide synthetase; LanM, class II lanthipeptide synthetase; MEME, Multiple Em for Motif Elicitation; ORF, open reading frame; Pfam, protein family; RODEO, Rapid ORF Description & Evaluation Online; RiPP, ribosomally synthesized and post-translationally modified peptide; SVM, support vector machine.