SMURF: Genomic mapping of fungal secondary metabolite clusters
Introduction
Secondary metabolites (SMs) are small bioactive molecules produced by many organisms including bacteria, plants and fungi. These compounds are particularly abundant in soil-dwelling filamentous fungi, which exist as multicellular communities competing with each other for nutrients, minerals and water (Keller et al., 2005). Unlike primary metabolites, most SMs – as their name suggests – are not essential for fungal growth, development, or reproduction under in vitro conditions. They can, however, provide protection against various environmental stresses and during antagonistic interactions with other soil inhabitants or a eukaryotic host. Scientific appreciation of the importance of fungal SMs grew in the 1940s as the massive impact of penicillin on human health began to be seen. Since then, many other beneficial SM compounds have been discovered including immunosuppressants, cholesterol-lowering drugs, antiviral drugs, and anti-tumor drugs (for a recent review see Hoffmeister and Keller, 2007). At the same time, fungi are also known to produce numerous mycotoxins such as aflatoxin, fumonisin, trichothecene, and zearalenone.
The first committed step in biosynthesis of an SM is catalyzed by one of five types of protein, which we refer to here as “backbone” enzymes. They include nonribosomal peptide synthases (NRPSs), polyketide synthases (PKSs), hybrid NRPS–PKS enzymes, prenyltransferases (DMATSs), and terpene cyclases (TCs). These multidomain enzymes are associated, respectively, with production of the five classes of SM: nonribosomal peptides, polyketides, NRPS–PKS hybrids, indole alkaloids, and terpenes (Hoffmeister and Keller, 2007). Terpenes, which are composed of isoprene units, are not considered further in our analysis, because terpene cyclases are highly variable in sequence and difficult to detect by bioinformatic methods (Keller et al., 2005, Townsend, 1997). Intermediate products formed by the backbone enzymes can undergo further modifications catalyzed by “decorating” enzymes. The final product is then often exported across the fungal cell wall by a transporter, although it sometimes remains within the cell. All these genes tend to be found in contiguous gene clusters, which are coordinately regulated by a specific Zn2Cys6 transcription factor and/or by the global regulator of secondary metabolism, the putative methyltransferase LaeA (Keller and Hohn, 1997, Keller et al., 2005).
The availability of data from fungal genome sequencing projects has facilitated the discovery and characterization of new compounds and their biosynthetic pathways. Thus, within months of completion of the first Aspergillus fumigatus genome (Nierman et al., 2005), several secondary metabolite clusters were characterized at the molecular level, including the biosynthesis clusters for gliotoxin (Gardiner and Howlett, 2005), fumigaclavines (Coyle and Panaccione, 2005, Unsold and Li, 2005, Unsold and Li, 2006), fumitremorgin (Maiya et al., 2006), and siderophores (Reiber et al., 2005). Genome sequencing also revealed that the number of secondary metabolites characterized from a given species falls far behind the number of clusters that can be predicted from its genomic sequence (Bok et al., 2006, Chiang et al., 2008). This has been attributed to the fact that not all clusters may be expressed under normal laboratory conditions.
Despite the medical and agricultural importance of fungal SMs, most putative SM clusters in fungal genomes have been predicted by ad hoc strategies based on manual reviews of BLAST searches generated for backbone genes and their neighbors (e.g. Nierman et al., 2005). Manual annotation of SM clusters, however, is time-consuming and may result in inconsistent annotation.
To facilitate systematic mapping of SM clusters in fungal genomes, we developed a web-based software tool, Secondary Metabolite Unknown Regions Finder (SMURF; www.jcvi.org/smurf/). It is based on three hallmarks of fungal SM biosynthetic pathways: (i) the presence of backbone genes, (ii) clustering, and (iii) characteristic protein domain content. Subsequent analyses of the predicted clusters present in 27 sequenced fungal genomes (Supplementary Table 1) show SM gene enrichment in the genus Aspergillus, the absence of the clusters in unicellular fungi, and unexpected abundance and variability of the fungal clusters. Our results are also consistent with the view that SM profiles can be used as a means of differentiating species and strains in filamentous fungi (Frisvad et al., 2008), and show that gene duplication plays an essential role in the creation and expansion of the SM repertoires of fungi.
Identification of putative backbone enzymes
SMURF relies on hidden Markov model (HMM) searches to detect backbone genes in sequenced fungal genomes. The HMMER program (http://hmmer.janelia.org) was used to search for conserved Pfam and TIGRFAM domains of backbone enzymes in the protein set of each sequenced species. Trusted threshold bit score cutoffs (predefined in HMMER) were used for each HMM search. NRPS enzymes were identified as enzymes with at least one module composed of an amino acid adenylation domain (A), a thiolation domain
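As a rough illustration of the domain-based screen described above, the following Python sketch keeps only hits that meet a per-model score cutoff and then reports proteins carrying every domain in a required set. It is not SMURF's implementation: the function name `proteins_with_domains`, the hit tuples, the domain labels, and the cutoff values are all hypothetical, and only the two domains named in the text before it breaks off are used. In practice the hits would be parsed from HMMER output generated with trusted cutoffs (for example via `hmmsearch --cut_tc`).

```python
from collections import defaultdict


def proteins_with_domains(hits, cutoffs, required):
    """Return sorted protein IDs whose passing hits cover every required domain.

    hits     -- iterable of (protein_id, domain_name, bit_score) tuples
    cutoffs  -- dict mapping domain_name to its minimum accepted bit score
    required -- set of domain names that must all be present
    """
    found = defaultdict(set)
    for protein_id, domain, score in hits:
        # Keep a hit only if its model has a cutoff and the score meets it.
        if domain in cutoffs and score >= cutoffs[domain]:
            found[protein_id].add(domain)
    return sorted(p for p, domains in found.items() if required <= domains)


# Hypothetical example data (not real HMMER output).
example_hits = [
    ("protA", "adenylation", 210.5),
    ("protA", "thiolation", 45.2),
    ("protB", "adenylation", 12.0),  # below the cutoff, so ignored
    ("protB", "thiolation", 50.1),
    ("protC", "adenylation", 180.0),
]
example_cutoffs = {"adenylation": 25.0, "thiolation": 20.0}

candidates = proteins_with_domains(
    example_hits, example_cutoffs, {"adenylation", "thiolation"}
)
print(candidates)
```

Keeping the cutoffs in a separate mapping mirrors the idea of per-model thresholds, so a different required-domain set can be tested for each backbone enzyme class without changing the filtering step.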
Parameter optimization
SMURF predicts putative secondary metabolism clusters by using an algorithm that takes into account the domain content of putative “backbone” genes and adjacent “decorating” genes. One of the key challenges in developing this tool was identification of the adjacent genes. In choosing parameters for SMURF we were confronted with the dilemma of striking a balance between levels of under-prediction and over-prediction. We chose to favor the latter, because over-prediction is easier to address in
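The general idea of growing a cluster outward from a backbone gene can be sketched as follows. This is a minimal illustration under stated assumptions, not the published SMURF algorithm: the function `extend_cluster`, the gene records, and the parameters `max_gap_genes` and `max_intergenic` (with the values used here) are hypothetical. The walk moves away from the backbone gene in each direction, stops at a large intergenic distance or after too many consecutive genes without a decorating-type domain, and places the boundary at the last decorating gene reached.

```python
def extend_cluster(genes, backbone_index, max_gap_genes=2, max_intergenic=3000):
    """Return (first, last) gene indices of a putative cluster.

    genes -- list of dicts ordered along one chromosome, each with 'start'
             and 'end' coordinates and a boolean 'decorating' flag saying
             whether the gene carries a decorating-type domain
    """

    def walk(step):
        boundary = backbone_index  # last index accepted into the cluster
        gap = 0                    # consecutive genes lacking such a domain
        i = backbone_index
        while 0 <= i + step < len(genes):
            prev, cur = genes[i], genes[i + step]
            # Intergenic distance between the two neighbouring genes.
            if step > 0:
                distance = cur["start"] - prev["end"]
            else:
                distance = prev["start"] - cur["end"]
            if distance > max_intergenic:
                break
            i += step
            if cur["decorating"]:
                boundary = i
                gap = 0
            else:
                gap += 1
                if gap > max_gap_genes:
                    break
        return boundary

    return walk(-1), walk(+1)


# Hypothetical gene models; index 3 plays the role of the backbone gene.
example_genes = [
    {"start": 0, "end": 1000, "decorating": False},
    {"start": 1500, "end": 2500, "decorating": True},
    {"start": 3000, "end": 4000, "decorating": False},
    {"start": 4500, "end": 6000, "decorating": False},
    {"start": 6500, "end": 7500, "decorating": True},
    {"start": 8000, "end": 9000, "decorating": False},
    {"start": 20000, "end": 21000, "decorating": True},
]
print(extend_cluster(example_genes, 3, max_gap_genes=1, max_intergenic=3000))
```

Loosening either parameter enlarges the predicted clusters, which is the direction of error the text says was preferred; tightening them has the opposite effect.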
Validations and limitations
SMURF is the first web-based tool that can systematically predict putative backbone genes in fungal genomes with high accuracy. Currently, there are only two publicly available software programs (Starcevic et al., 2008, Weber et al., 2009) designed to annotate PKS, NRPS and hybrid genes and both have been tailored to bacterial genomes. In addition to the backbone genes, SMURF can also generate rule-based sets of clusters, which can be used as a first approximation in comparative genomics and
Conflict of interest
None declared.
Acknowledgments
This work was supported by Science Foundation Ireland (07/IN1/B911 to KHW), NIAID (N01-AI30071, R21-AI052236, U01 AI48830), and USDA (2004-35600-14172).
References (51)
Accurate prediction of the Aspergillus nidulans terrequinone gene cluster boundaries using the transcriptional regulator LaeA. Fungal Genet. Biol. (2007)
Evolution of beta-lactam biosynthesis genes and recruitment of trans-acting factors. Phytochemistry (2005)
Molecular genetic mining of the Aspergillus secondary metabolome: discovery of the emericellamide biosynthetic pathway. Chem. Biol. (2008)
The use of secondary metabolite profiling in chemotaxonomy of filamentous fungi. Mycol. Res. (2008)
Hydrolytic polyketide shortening by ayg1p, a novel enzyme involved in fungal melanin biosynthesis. J. Biol. Chem. (2004)
Bioinformatic and expression analysis of the putative gliotoxin biosynthetic gene cluster of Aspergillus fumigatus. FEMS Microbiol. Lett. (2005)
Metabolic pathway gene clusters in filamentous fungi. Fungal Genet. Biol. (1997)
Identification of a gene cluster responsible for the biosynthesis of aurofusarin in the Fusarium graminearum species complex. Fungal Genet. Biol. (2005)
Co-expression of 15 contiguous genes delineates a fumonisin biosynthetic gene cluster in Gibberella moniliformis. Fungal Genet. Biol. (2003)
The expression of selected non-ribosomal peptide synthetases in Aspergillus fumigatus is controlled by the availability of free iron. FEMS Microbiol. Lett. (2005)
Structural studies of natural product biosynthetic proteins. Chem. Biol.
CLUSEAN: a computer-based framework for the automated analysis of bacterial secondary metabolite biosynthetic gene clusters. J. Biotechnol.
Secondary metabolic gene cluster silencing in Aspergillus nidulans. Mol. Microbiol.
Activation of fungal silent gene clusters: a new avenue to drug discovery. Prog. Drug Res.
Twenty-five coregulated transcripts define a sterigmatocystin gene cluster in Aspergillus nidulans. Proc. Natl. Acad. Sci. USA
A gene cluster containing two fungal polyketide synthases encodes the biosynthetic pathway for a polyketide, asperfuranone, in Aspergillus nidulans. J. Am. Chem. Soc.
An ergot alkaloid biosynthesis gene and clustered hypothetical genes from Aspergillus fumigatus. Appl. Environ. Microbiol.
The siderophore system is essential for viability of Aspergillus nidulans: functional analysis of two genes encoding L-ornithine N5-monooxygenase (sidA) and a non-ribosomal peptide synthetase (sidC). Mol. Microbiol.
Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus. PLoS Genet.
A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis. BMC Evol. Biol.
The genome sequence of the filamentous fungus Neurospora crassa. Nature
FtmPT2, an N-prenyltransferase from Aspergillus fumigatus, catalyses the last step in the biosynthesis of fumitremorgin B. ChemBioChem
Natural products of filamentous fungi: enzymes, genes, and their regulation. Nat. Prod. Rep.
Identification of cytochrome P450s required for fumitremorgin biosynthesis in Aspergillus fumigatus. ChemBioChem
Fungal secondary metabolism – from biochemistry to genomics. Nat. Rev. Microbiol.