Abstract
Human body habitats are home to a diverse array of microbes, and within these microbial ecosystems genetic material, including virulence factors (VFs), is exchanged. Little is known about the diversity and abundance of VFs across body sites and disease types. We developed a virulome analysis pipeline that uses species-specific sequence identity thresholds, inferred from intraspecies ANI values, to assign reads precisely to virulence factors. We characterized the human virulome in four body habitats: the gut, oral cavity, skin, and vagina. The diversity and abundance of VFs in the oral cavity were significantly higher than those in other body sites, including stool. We highlight the importance of sex-specific analysis when studying the human virulome. To characterize the disease-specific virulome, we analyzed more than 4,000 samples from healthy and diseased subjects, covering 13 disease types and drawn from different metagenomic sequencing studies. Atherosclerotic cardiovascular disease (ACVD) has a more diverse virulome than the other diseases tested. Notably, many VFs, including genes for secretion systems and toxins, are more abundant in diseased subjects than in healthy controls. We present, to our knowledge, the most comprehensive healthy and diseased virulome dataset yet created.
Competing Interest Statement
The authors have declared no competing interest.
Abbreviations
- VFs: virulence factors
- ACVD: atherosclerotic cardiovascular disease
- IBD: inflammatory bowel disease
- CRC: colorectal carcinoma
- NSCLC: non-small cell lung cancer
- HCC: hepatocellular carcinoma
- GC: gastric cancer
- PD: Parkinson’s disease
- RCC: renal cell carcinoma
- PCoA: principal coordinate analysis
- NMDS: nonmetric multidimensional scaling