RT Journal Article
SR Electronic
T1 SINAPS: Prediction of microbial traits from marker gene sequences
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 124156
DO 10.1101/124156
A1 Robert C. Edgar
YR 2017
UL http://biorxiv.org/content/early/2017/04/04/124156.abstract
AB Microbial communities are often studied by sequencing marker genes such as 16S ribosomal RNA. Marker gene sequences can be used to assess diversity and taxonomy, but do not directly measure functions arising from other genes in the community metagenome. Such functions can be predicted by algorithms that associate marker genes with experimentally determined traits in well-studied species. Typically, such methods use ancestral state reconstruction. Here I describe SINAPS, a new algorithm that predicts traits for marker gene sequences using a fast, simple word-counting algorithm that does not require alignments or trees. A measure of prediction confidence is obtained by bootstrapping. I tested SINAPS predictions from 16S V4 query sequences for traits including energy metabolism, Gram-positive staining, presence of a flagellum, V4 primer mismatches, and 16S copy number. Accuracy was >90% except for copy number, where a large majority of predictions were within +/−2 of the true value.
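
The abstract describes the approach only in outline: count shared words between a query and trait-annotated reference sequences, and bootstrap to get a confidence value. The Python sketch below illustrates that general idea and should not be read as the SINAPS implementation. The word length `k`, the number of words sampled per iteration (`subset_size`), the iteration count, the tie-breaking behaviour, and the names `words` and `predict_trait` are all assumptions made for illustration; none of them are given in the abstract.

```python
import random
from collections import Counter


def words(seq, k=8):
    """Return the set of distinct length-k words (k-mers) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def predict_trait(query, references, k=8, subset_size=32,
                  iterations=100, seed=0):
    """Predict a trait for `query` by word counting with bootstrapping.

    references: list of (sequence, trait) pairs with known traits.
    Returns (trait, support), where support is the fraction of
    bootstrap iterations that voted for the returned trait.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    ref_words = [(words(seq, k), trait) for seq, trait in references]
    query_words = sorted(words(query, k))  # sorted for reproducible sampling
    if not query_words or not ref_words:
        raise ValueError("need a query of length >= k and at least one reference")

    votes = Counter()
    for _ in range(iterations):
        # Bootstrap: sample query words with replacement.
        sample = [rng.choice(query_words) for _ in range(subset_size)]
        # The top hit is the reference sharing the most sampled words;
        # ties go to the earliest reference in the list.
        _, best_trait = max(
            ref_words, key=lambda rw: sum(w in rw[0] for w in sample)
        )
        votes[best_trait] += 1

    trait, count = votes.most_common(1)[0]
    return trait, count / iterations
```

Calling `predict_trait(query, [(ref_seq_1, "aerobic"), (ref_seq_2, "anaerobic")])` with hypothetical sequences and labels returns the most frequently chosen trait together with its bootstrap support. No alignment or tree is built at any point, which is the property the abstract emphasizes.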