PT - JOURNAL ARTICLE
AU - Utpal Bakshi
AU - Vinod K. Gupta
AU - Aileen R. Lee
AU - John M. Davis III
AU - Sriram Chandrasekaran
AU - Yong-Su Jin
AU - Michael F. Freeman
AU - Jaeyun Sung
TI - TaxiBGC: a Taxonomy-guided Approach for the Identification of Experimentally Verified Microbial Biosynthetic Gene Clusters in Shotgun Metagenomic Data
AID - 10.1101/2021.07.30.454505
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.07.30.454505
4099 - http://biorxiv.org/content/early/2021/08/02/2021.07.30.454505.short
4100 - http://biorxiv.org/content/early/2021/08/02/2021.07.30.454505.full
AB - Biosynthetic gene clusters (BGCs) in microbial genomes encode the production of bioactive secondary metabolites (SMs). Given the well-recognized importance of SMs in microbe-microbe and microbe-host interactions, the large-scale identification of BGCs from microbial metagenomes could offer novel functional insights into complex chemical ecology. Despite recent progress, currently available tools for predicting BGCs from shotgun metagenomes have several limitations, including the need for computationally demanding read assembly and the prediction of only a narrow range of BGC classes. To overcome these limitations, we developed TaxiBGC (Taxonomy-guided Identification of Biosynthetic Gene Clusters), a computational pipeline for identifying experimentally verified BGCs in shotgun metagenomes by first pinpointing the microbial species likely to produce them. We show that our species-centric approach identified BGCs in simulated metagenomes more accurately than detecting BGC genes alone. By applying TaxiBGC to 5,423 metagenomes from the Human Microbiome Project and various case-control studies, we identified distinct BGC signatures of major human body sites and candidate stool-borne biomarkers for multiple diseases, including inflammatory bowel disease, colorectal cancer, and psychiatric disorders.
In all, TaxiBGC demonstrates a significant advantage over existing techniques for systematically characterizing BGCs and inferring their SMs from microbiome data. Competing Interest Statement: The authors have declared no competing interest.