Abstract
Known gut virus diversity is currently skewed by the difficulty of enriching fecal virus-like particles (VLPs) and is biased towards active viruses detectable by shotgun metagenomic sequencing. Here, we apply a virus detection procedure, including vigorous enrichment to harvest large quantities of VLPs and combined Illumina and PacBio sequencing, to fecal samples from 180 Chinese volunteers. Integrated assembly of the short and long reads generates more and longer viral genomes than existing methods. The resulting viral genome dataset, referred to as the Chinese Human Gut Virome collection (CHGV), covers the full spectrum of the gut virome: a CHGV-trained machine-learning algorithm recognizes most (81–97%) public gut viruses. Meanwhile, the CHGV contains 71.50% novel genomes, including 20% that cannot be recognized by machine-learning models trained on public viruses. Further analysis of the CHGV reveals a substantially higher diversity of the human gut virome. For example, we identify thousands of viral genomes that are more prevalent than crAssphages and Gubaphages, the two most diverse phages in the human gut, and several viral clades that are more diverse than the two. Together, our results indicate a vastly enlarged gut viral diversity that significantly broadens our knowledge of the viral dark matter of the human gut microbial ecology.
Competing Interest Statement
The authors have declared no competing interest.