Abstract
The growing number of sequenced genomes enables the study of secondary metabolite biosynthetic gene clusters (BGCs) in phyla beyond well-studied soil bacteria. We mined 2627 enterobacterial genomes and detected 8604 BGCs, including nonribosomal peptide synthetases, siderophores, polyketide-nonribosomal peptide hybrids, and 60 other BGC types, with an average of about 3.3 BGCs per genome. These BGCs represented 212 distinct BGC families, of which only 20 have associated products in the MIBiG standard database; the functions of these products include siderophores, antibiotics, and genotoxins. Pangenome analysis identified genes associated with a specific BGC encoding colibactin, a metabolite linked to colon cancer. In one example, we associated genes involved in the type VI secretion system with the presence of a colibactin BGC in Escherichia. This richness of BGCs in enterobacteria opens up the possibility of discovering novel secondary metabolites and their physiological roles, and provides a guide for identifying and understanding polyketide synthase (PKS)-associated gene sets.