Microbial strain-level population structure and genetic diversity from metagenomes

  Nicola Segata (1)

  1. Centre for Integrative Biology, University of Trento, 38123 Trento, Italy;
  2. Biostatistics Department, Harvard School of Public Health, Boston, Massachusetts 02115, USA;
  3. The Broad Institute, Cambridge, Massachusetts 02142, USA

  Corresponding author: nicola.segata{at}unitn.it

Abstract

Among the human health conditions linked to microbial communities, phenotypes are often associated with only a subset of strains within causal microbial groups. Although it has been critical for decades in microbial physiology to characterize individual strains, this has been challenging when using culture-independent high-throughput metagenomics. We introduce StrainPhlAn, a novel metagenomic strain identification approach, and apply it to characterize the genetic structure of thousands of strains from more than 125 species in more than 1500 gut metagenomes drawn from populations spanning North and South American, European, Asian, and African countries. The method relies on per-sample dominant sequence variant reconstruction within species-specific marker genes. It identified primarily subject-specific strain variants (<5% inter-subject strain sharing), and we determined that a single strain typically dominated each species and was retained over time (for >70% of species). Microbial population structure was correlated in several distinct ways with the geographic structure of the host population. In some cases, discrete subspecies (e.g., for Eubacterium rectale and Prevotella copri) or continuous microbial genetic variations (e.g., for Faecalibacterium prausnitzii) were associated with geographically distinct human populations, whereas few strains occurred in multiple unrelated cohorts. We further estimated the genetic variability of gut microbes, with Bacteroides species appearing remarkably consistent (0.45% median number of nucleotide variants between strains), whereas P. copri was among the most plastic gut colonizers. We thus characterize here the population genetics of previously inaccessible intestinal microbes, providing a comprehensive strain-level genetic overview of the gut microbial diversity.
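The abstract names two quantities that a small example can make concrete: a per-sample dominant sequence variant within a marker gene, and the fraction of nucleotide variants between two strains. The sketch below is only an illustration of those ideas under simplifying assumptions, not StrainPhlAn's implementation. The function names, the toy pileup format (one string of observed bases per marker position), and the `min_depth` threshold are all hypothetical.

```python
from collections import Counter


def dominant_consensus(pileup, min_depth=2):
    """Return the majority base at each marker position for one sample.

    pileup: list of strings, one per position, holding the bases that
            reads from this sample showed at that position (toy format).
    min_depth: hypothetical coverage threshold; positions with fewer
               observed bases are reported as 'N' (unknown).
    """
    consensus = []
    for bases in pileup:
        if len(bases) < min_depth:
            consensus.append("N")
        else:
            # The most frequent base stands in for the dominant strain.
            consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)


def variant_rate(seq_a, seq_b):
    """Fraction of differing positions, ignoring positions unknown in either sequence."""
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "N" and b != "N"]
    if not compared:
        return None  # nothing comparable between the two samples
    return sum(a != b for a, b in compared) / len(compared)


# Two toy samples covering the same four marker positions.
sample_a = dominant_consensus(["AAAC", "GG", "T", "CCCA"])
sample_b = dominant_consensus(["TTTA", "GGG", "TT", "CCC"])
rate = variant_rate(sample_a, sample_b)
```

In this toy case the third position of the first sample falls below the depth threshold and is excluded from the comparison, so the rate is computed over the three remaining positions. A figure such as the 0.45% median reported for Bacteroides would correspond to the median of such pairwise rates across many strain pairs.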

Footnotes

  • [Supplemental material is available for this article.]

  • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.216242.116.

  • Freely available online through the Genome Research Open Access option.

  • Received September 19, 2016.
  • Accepted February 2, 2017.

This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
