Abstract
Average genome size (AGS) is an important yet often overlooked property of microbial communities. We developed MicrobeCensus to rapidly and accurately estimate AGS from short-read metagenomic data and applied it to over 1,300 human microbiome samples. We found that AGS differs significantly within and between body sites and tracks with major functional and taxonomic differences. For example, in the gut, AGS ranges from 2.5 to 5.8 megabases and is positively correlated with the abundance of Bacteroides and polysaccharide metabolism. Furthermore, we found that AGS variation can bias comparative analyses and that normalizing for AGS improves detection of differentially abundant genes.
List of Abbreviations
- AGS: average genome size of a microbial community
- CV: coefficient of variation
- Mb: megabase
- CPU: central processing unit
- NCBI: National Center for Biotechnology Information
- HMP: Human Microbiome Project
- T2D: type-2 diabetes metagenomics sequencing project
- MetaHIT: Metagenomics of the Human Intestinal Tract
- OTU: operational taxonomic unit
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- KO: KEGG Orthology Group
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.