PT  - JOURNAL ARTICLE
AU  - Rafael R. C. Cuadrat
AU  - Danny Ionescu
AU  - Alberto M. R. Davila
AU  - Hans-Peter Grossart
TI  - Recovering genomic clusters of secondary metabolites from lakes: a Metagenomics 2.0 approach
AID  - 10.1101/183061
DP  - 2017 Jan 01
TA  - bioRxiv
PG  - 183061
4099  - http://biorxiv.org/content/early/2017/08/31/183061.short
4100  - http://biorxiv.org/content/early/2017/08/31/183061.full
AB  - Background: Metagenomic approaches have become increasingly popular in the past decades due to decreasing costs of DNA sequencing and advances in bioinformatics. So far, however, the recovery of long genes coding for secondary metabolism remains a big challenge. Often, the quality of metagenome assemblies is poor, especially in environments with high microbial diversity, where sequence coverage is low and the complexity of natural communities is high. Recently, new and improved algorithms for binning environmental reads and contigs have been developed to overcome such limitations. Some of these algorithms use a similarity detection approach to classify the obtained reads into taxonomic units and to assemble draft genomes. This approach, however, is quite limited since it can only classify sequences similar to those available (and well classified) in the databases. In this work, we used draft genomes from Lake Stechlin, north-eastern Germany, recovered by MetaBat, an efficient binning tool that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. These genomes were screened for secondary metabolism genes, such as polyketide synthases (PKS) and non-ribosomal peptide synthases (NRPS), using the Anti-SMASH and NAPDOS workflows. Results: With this approach we were able to identify 243 secondary metabolite clusters from 121 genomes recovered from the lake samples. A total of 18 NRPS, 19 PKS and 3 hybrid PKS/NRPS clusters were found. In addition, it was possible to predict the partial structure of several secondary metabolite clusters, allowing for taxonomic classifications and phylogenetic inferences. Conclusions: Our approach revealed a great potential to recover and study secondary metabolite genes from any aquatic ecosystem.