Elsevier

Fungal Genetics and Biology

Volume 47, Issue 9, September 2010, Pages 736-741

SMURF: Genomic mapping of fungal secondary metabolite clusters

https://doi.org/10.1016/j.fgb.2010.06.003

Abstract

Fungi produce an impressive array of secondary metabolites (SMs) including mycotoxins, antibiotics and pharmaceuticals. The genes responsible for their biosynthesis, export, and transcriptional regulation are often found in contiguous gene clusters. To facilitate annotation of these clusters in sequenced fungal genomes, we developed the web-based software SMURF (www.jcvi.org/smurf/) to systematically predict clustered SM genes based on their genomic context and domain content. We applied SMURF to catalog putative clusters in 27 publicly available fungal genomes. Comparison with genetically characterized clusters from six fungal species showed that SMURF accurately recovered all clusters and detected additional potential clusters. Subsequent comparative analysis revealed the striking biosynthetic capacity and variability of the fungal SM pathways and the correlation between unicellularity and the absence of SMs. Further genetics studies are needed to experimentally confirm these clusters.

Introduction

Secondary metabolites (SMs) are small bioactive molecules produced by many organisms including bacteria, plants and fungi. These compounds are particularly abundant in soil-dwelling filamentous fungi, which exist as multicellular communities competing with each other for nutrients, minerals and water (Keller et al., 2005). Unlike primary metabolites, most SMs – as their name suggests – are not essential for fungal growth, development, or reproduction under in vitro conditions. They can, however, provide protection against various environmental stresses and during antagonistic interactions with other soil inhabitants or a eukaryotic host. Scientific appreciation of the importance of fungal SMs grew in the 1940s as the massive impact of penicillin on human health began to be seen. Since then, many other beneficial SM compounds have been discovered including immunosuppressants, cholesterol-lowering drugs, antiviral drugs, and anti-tumor drugs (for a recent review see Hoffmeister and Keller, 2007). At the same time, fungi are also known to produce numerous mycotoxins such as aflatoxin, fumonisin, trichothecene, and zearalenone.

The first committed step in biosynthesis of an SM is catalyzed by one of five types of protein, which we refer to here as “backbone” enzymes. They include nonribosomal peptide synthases (NRPSs), polyketide synthases (PKSs), hybrid NRPS–PKS enzymes, prenyltransferases (dimethylallyl tryptophan synthases, DMATSs), and terpene cyclases (TCs). These multidomain enzymes are associated, respectively, with production of the five classes of SM: nonribosomal peptides, polyketides, NRPS–PKS hybrids, indole alkaloids, and terpenes (Hoffmeister and Keller, 2007). Terpenes, which are composed of isoprene units, are not considered further in our analysis, because terpene cyclases are highly variable in sequence and difficult to detect by bioinformatic methods (Keller et al., 2005, Townsend, 1997). Intermediate products formed by the backbone enzymes can undergo further modifications catalyzed by “decorating” enzymes. The final product is then often exported from the fungal cell by a transporter, though it sometimes remains within the cell. All these genes tend to be found in contiguous gene clusters, which are coordinately regulated by a specific Zn2Cys6 transcription factor and/or by the global regulator of secondary metabolism, the putative methyltransferase LaeA (Keller and Hohn, 1997, Keller et al., 2005).

The availability of data from fungal genome sequencing projects has facilitated the discovery and characterization of new compounds and their biosynthetic pathways. Thus, within months of completion of the first A. fumigatus genome (Nierman et al., 2005), several secondary metabolite clusters were characterized at the molecular level, including the biosynthesis clusters for gliotoxin (Gardiner and Howlett, 2005), fumigaclavines (Coyle and Panaccione, 2005, Unsold and Li, 2005, Unsold and Li, 2006), fumitremorgin (Maiya et al., 2006), and siderophores (Reiber et al., 2005). Genome sequencing also revealed that the number of secondary metabolites characterized from a given species falls far short of the number of clusters that can be predicted from its genomic sequence (Bok et al., 2006, Chiang et al., 2008). This has been attributed to the fact that not all clusters may be expressed under normal laboratory conditions.

Despite the medical and agricultural importance of fungal SMs, most putative SM clusters in fungal genomes have been predicted by ad hoc strategies based on manual reviews of BLAST searches generated for backbone genes and their neighbors (e.g. Nierman et al., 2005). Manual annotation of SM clusters, however, is time-consuming and may result in inconsistent annotation.

To facilitate systematic mapping of SM clusters in fungal genomes, we developed a web-based software tool, Secondary Metabolite Unknown Regions Finder (SMURF; www.jcvi.org/smurf/). It is based on three hallmarks of fungal SM biosynthetic pathways: (i) the presence of backbone genes, (ii) clustering, and (iii) characteristic protein domain content. Subsequent analyses of the predicted clusters present in 27 sequenced fungal genomes (Supplementary Table 1) show SM gene enrichment in the genus Aspergillus, the absence of clusters in unicellular fungi, and unexpected abundance and variability of the fungal clusters. Our results are also consistent with the view that SM profiles can be used as a means of differentiating species and strains in filamentous fungi (Frisvad et al., 2008), and show that gene duplication plays an essential role in the creation and expansion of the SM repertoires of fungi.

Identification of putative backbone enzymes

SMURF relies on hidden Markov model (HMM) searches to detect backbone genes in sequenced fungal genomes. The HMMER program (http://hmmer.janelia.org) was used to search for conserved Pfam and TIGRFAM domains of backbone enzymes in the protein set of each sequenced species. Trusted threshold bit score cutoffs (predefined in HMMER) were used for each HMM search. NRPS enzymes were identified as enzymes with at least one module composed of an amino acid adenylation domain (A), a thiolation domain
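As a rough illustration of the thresholding step described above, the Python sketch below keeps only those domain hits whose bit score reaches the trusted cutoff of the matching model, and groups the surviving domains by protein. The model names, cutoff values, and hit tuples are hypothetical placeholders, not values taken from SMURF, Pfam, or TIGRFAM; real cutoffs ship with each model.

```python
from collections import defaultdict

# Hypothetical per-model trusted cutoffs (bit scores). Placeholder names and
# values for illustration only; real cutoffs are defined in each HMM.
TRUSTED_CUTOFFS = {"A_domain": 25.0, "T_domain": 20.0, "KS_domain": 30.0}


def trusted_hits(hits, cutoffs):
    """Group domain hits by protein, keeping hits at or above the trusted cutoff.

    hits: iterable of (protein_id, model_name, bit_score) tuples.
    cutoffs: mapping of model_name -> trusted bit score cutoff.
    """
    domains = defaultdict(set)
    for protein_id, model, bit_score in hits:
        cutoff = cutoffs.get(model)
        # Hits to models without a known cutoff are ignored.
        if cutoff is not None and bit_score >= cutoff:
            domains[protein_id].add(model)
    return dict(domains)


# Hypothetical search results: (protein, model, bit score).
hits = [
    ("prot1", "A_domain", 120.3),
    ("prot1", "T_domain", 35.1),
    ("prot2", "KS_domain", 12.0),  # below its cutoff, dropped
    ("prot3", "A_domain", 24.9),   # just below its cutoff, dropped
]
result = trusted_hits(hits, TRUSTED_CUTOFFS)
```

Grouping by protein makes a later rule-based check (for example, requiring a particular combination of domains in one enzyme) a simple set comparison.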

Parameter optimization

SMURF predicts putative secondary metabolism clusters by using an algorithm that takes into account the domain content of putative “backbone” genes and adjacent “decorating” genes. One of the key challenges in developing this tool was identification of the adjacent genes. In choosing parameters for SMURF we were confronted with the dilemma of striking a balance between levels of under-prediction and over-prediction. We chose to favor the latter, because over-prediction is easier to address in
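To make the under-prediction versus over-prediction trade-off concrete, here is a minimal Python sketch of one way a cluster could be grown outward from a backbone gene. It is not SMURF's actual rule set: the two parameters, `max_gap_bp` (largest intergenic distance tolerated) and `max_non_sm` (number of genes without SM-associated domains tolerated on each side), as well as the `Gene` record and the example coordinates, are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    gene_id: str
    start: int
    end: int
    has_sm_domain: bool  # True for backbone or "decorating" domain content


def extend_cluster(genes, backbone_idx, max_gap_bp, max_non_sm):
    """Return genes forming a putative cluster around genes[backbone_idx].

    genes must be sorted by start coordinate on a single contig.
    Both parameters are hypothetical tuning knobs, not SMURF's values.
    """
    def walk(step):
        picked, pending, non_sm = [], [], 0
        i = backbone_idx
        while 0 <= i + step < len(genes):
            cur, nxt = genes[i], genes[i + step]
            gap = nxt.start - cur.end if step == 1 else cur.start - nxt.end
            if gap > max_gap_bp:
                break  # neighbor is too far away to belong to the cluster
            if nxt.has_sm_domain:
                # Genes held in pending are kept only when an SM gene follows,
                # so a cluster never ends on a gene without SM domains.
                picked.extend(pending)
                picked.append(nxt)
                pending = []
            else:
                non_sm += 1
                if non_sm > max_non_sm:
                    break
                pending.append(nxt)
            i += step
        return picked

    left = walk(-1)[::-1]  # restore ascending genomic order
    right = walk(1)
    return left + [genes[backbone_idx]] + right


# Hypothetical contig; g3 is the backbone gene.
genes = [
    Gene("g1", 0, 1000, False),
    Gene("g2", 1500, 2500, True),
    Gene("g3", 3000, 4000, True),
    Gene("g4", 4500, 5500, False),
    Gene("g5", 6000, 7000, True),
    Gene("g6", 7500, 8500, False),
    Gene("g7", 20000, 21000, True),
]
cluster = extend_cluster(genes, backbone_idx=2, max_gap_bp=2000, max_non_sm=1)
cluster_ids = [g.gene_id for g in cluster]
```

In this sketch, raising either parameter enlarges the predicted clusters (more over-prediction), while lowering them trims clusters toward the backbone gene (more under-prediction).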

Validations and limitations

SMURF is the first web-based tool that can systematically predict putative backbone genes in fungal genomes with high accuracy. Currently, there are only two publicly available software programs (Starcevic et al., 2008, Weber et al., 2009) designed to annotate PKS, NRPS and hybrid genes and both have been tailored to bacterial genomes. In addition to the backbone genes, SMURF can also generate rule-based sets of clusters, which can be used as a first approximation in comparative genomics and

Conflict of interest

None declared.

Acknowledgments

This work was supported by Science Foundation Ireland (07/IN1/B911 to KHW), NIAID (N01-AI30071, R21-AI052236, U01 AI48830), and USDA (2004-35600-14172).

References (51)

  • C.A. Townsend. Structural studies of natural product biosynthetic proteins. Chem. Biol. (1997)
  • T. Weber. CLUSEAN: a computer-based framework for the automated analysis of bacterial secondary metabolite biosynthetic gene clusters. J. Biotechnol. (2009)
  • J.W. Bok. Secondary metabolic gene cluster silencing in Aspergillus nidulans. Mol. Microbiol. (2006)
  • A.A. Brakhage. Activation of fungal silent gene clusters: a new avenue to drug discovery. Prog. Drug Res. (2008)
  • D.W. Brown. Twenty-five coregulated transcripts define a sterigmatocystin gene cluster in Aspergillus nidulans. Proc. Natl. Acad. Sci. USA (1996)
  • Y.M. Chiang. A gene cluster containing two fungal polyketide synthases encodes the biosynthetic pathway for a polyketide, asperfuranone, in Aspergillus nidulans. J. Am. Chem. Soc. (2009)
  • C.M. Coyle et al. An ergot alkaloid biosynthesis gene and clustered hypothetical genes from Aspergillus fumigatus. Appl. Environ. Microbiol. (2005)
  • M. Eisendle. The siderophore system is essential for viability of Aspergillus nidulans: functional analysis of two genes encoding l-ornithine N5-monooxygenase (sidA) and a non-ribosomal peptide synthetase (sidC). Mol. Microbiol. (2003)
  • N.D. Fedorova. Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus. PLoS Genet. (2008)
  • D.A. Fitzpatrick. A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis. BMC Evol. Biol. (2006)
  • J.E. Galagan. The genome sequence of the filamentous fungus Neurospora crassa. Nature (2003)
  • A. Grundmann. FtmPT2, an N-prenyltransferase from Aspergillus fumigatus, catalyses the last step in the biosynthesis of fumitremorgin B. ChemBioChem (2008)
  • D. Hoffmeister et al. Natural products of filamentous fungi: enzymes, genes, and their regulation. Nat. Prod. Rep. (2007)
  • N. Kato. Identification of cytochrome P450s required for fumitremorgin biosynthesis in Aspergillus fumigatus. ChemBioChem (2009)
  • N.P. Keller. Fungal secondary metabolism – from biochemistry to genomics. Nat. Rev. Microbiol. (2005)