Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution

ISME J. 2015 Jan;9(1):68-80. doi: 10.1038/ismej.2014.117. Epub 2014 Jul 11.

Abstract

The standard approach to analyzing 16S tag sequence data, which relies on clustering reads by sequence similarity into Operational Taxonomic Units (OTUs), underexploits the accuracy of modern sequencing technology. We present a clustering-free approach to multi-sample Illumina data sets that can identify independent bacterial subpopulations regardless of the similarity of their 16S tag sequences. Using published data from a longitudinal time-series study of human tongue microbiota, we resolve up to 20 distinct subpopulations within standard 97% similarity OTUs, all ecologically distinct but with 16S tags differing by as little as one nucleotide (99.2% similarity). A comparative analysis of oral communities of two cohabiting individuals reveals that most such subpopulations are shared between the two communities at 100% sequence identity, and that dynamical similarity between subpopulations in one host is strongly predictive of dynamical similarity between the same subpopulations in the other host. Our method can also be applied to samples collected in cross-sectional studies and can be used with the 454 sequencing platform. We discuss how the sub-OTU resolution of our approach can provide new insight into factors shaping community assembly.
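
To make the "one nucleotide (99.2% similarity)" figure concrete, the following minimal sketch computes percent identity between two equal-length tags. The tag length of 125 nt, the function name percent_identity, and the sequences themselves are assumptions chosen for illustration; the abstract does not state the tag length, and this is not the authors' method, only the arithmetic behind the quoted percentage.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * (1 - mismatches / len(a))


# Assumption: a 125-nt tag, which is consistent with the 99.2% figure
# but is not given in the abstract. The sequence is synthetic.
tag_a = "ACGT" * 31 + "A"                      # 125 nucleotides
tag_b = tag_a[:60] + "C" + tag_a[61:]          # position 60 is "A" in tag_a

identity = percent_identity(tag_a, tag_b)
print(f"{identity:.1f}% identity with one mismatch in {len(tag_a)} nt")
```

Under this assumed length, a single substitution leaves the two tags far above the 97% threshold, so conventional OTU clustering would merge them into one unit.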

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cluster Analysis
  • Humans
  • Metagenomics / methods*
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics*
  • Sequence Analysis, DNA

Substances

  • RNA, Ribosomal, 16S