Abstract
Trillions of marine bacterial, archaeal and viral species account for the majority of the diversity of life on Earth. In the current study, we conducted a comprehensive review of the published studies of the marine microbiome by re-analyzing most of the available high-throughput sequencing data. We collected 17.59 Tb of sequencing data from 8,165 metagenomic and prokaryotic samples, and systematically evaluated genome characteristics, including genome size, GC content, phylogeny, and the functional and ecological roles of several typical phyla. We constructed a genome catalogue of 9,070 high-quality genomes and a gene catalogue of 156,209,709 genes, representing the most comprehensive marine prokaryotic datasets to date. The genome size of Alphaproteobacteria and Actinobacteria was significantly correlated with their GC content. A total of 44,322 biosynthetic gene clusters spanning 53 types were detected in the reconstructed marine prokaryotic genome catalogue. Phylogenetic annotation of the 8,380 bacterial and 690 archaeal species detected most of the known bacterial phyla (99/111), including 62 classes and 181 orders, as well as four additional unclassified genomes from two candidate novel phyla. In addition, taxonomically unclassified species accounted for 64.56% and 80.29% of the phylogenetic diversity of Bacteria and Archaea, respectively. The genomic and ecological features of three groups (Cyanobacteria, luminous bacteria and methane-metabolizing archaea), including habitat preference and geographic distribution, among others, are thoroughly discussed. Our database provides a comprehensive resource for marine microbiome research and should be a valuable reference for studies of the origin and evolution of marine life, ecological monitoring and protection, and bioactive compound development.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
† The Global Ocean Microbiome Project (GOMP) was initiated on Oct. 28, 2021.