Abstract
Prevotella copri is a common inhabitant of the human gut. Interest in P. copri has gathered pace due to conflicting reports on whether it is beneficial or detrimental to health. In a cross-continent meta-analysis exploiting >6,500 available metagenomes, supported by new isolate sequencing and recovery of high-quality genomes from metagenomes, we obtained >1,000 P. copri genomes. This 100-fold increase over existing isolate genomes allowed the genetic and global population structure of P. copri to be explored at unprecedented depth. We demonstrate that P. copri is not a monotypic species, but encompasses four distinct clades (>10% inter-clade vs. <4% intra-clade average single nucleotide variants), for which we propose the name P. copri complex, comprising clades A, B, C and D. We show that the complex is near ubiquitous in non-Westernised populations (95.4% versus 29.6% in Westernised populations), where all four clades are typically co-present within an individual (61.6% of the cases, in contrast to 4.6% in Westernised populations). Genomic analysis of the complex reveals substantial and complementary functional diversity, including the potential for utilisation of complex carbohydrates, suggesting that multi-generational dietary modifications may be a driver of the reduced P. copri prevalence in Westernised populations. Analysis of ancient stool microbiomes highlights a pattern of P. copri presence consistent with modern non-Westernised populations, allowing us to estimate that clade delineation pre-dates human migratory waves out of Africa. Our analysis reveals P. copri to be far more diverse than previously appreciated, and this diversity appears to be underrepresented in Western-lifestyle populations.