RT Journal Article
SR Electronic
T1 K-mer similarity, networks of microbial genomes and taxonomic rank
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 125237
DO 10.1101/125237
A1 Guillaume Bernard
A1 Paul Greenfield
A1 Mark A. Ragan
A1 Cheong Xin Chan
YR 2017
UL http://biorxiv.org/content/early/2017/04/07/125237.abstract
AB Alignment-free (AF) methods have recently been adopted to infer phylogenetic trees. However, the evolutionary relationships among microbes, impacted by common phenomena such as lateral genetic transfer and rearrangement, cannot be adequately captured in a strictly tree-like structure. Bacterial and archaeal genomes consist of highly conserved regions, e.g. ribosomal RNA genes (commonly used as phylogenetic markers), more-variable regions, and extrachromosomal elements, i.e. plasmids (which contain genes critical under a selective condition, e.g. antibiotic resistance). The impact of these elements on genome-scale inference of microbial phylogeny remains poorly understood. Here, using an AF approach, we inferred phylogenomic networks of microbial life based on 2785 completely sequenced bacterial and archaeal genomes, and systematically assessed the impact of ribosomal RNA genes and plasmid sequences on these networks. Our results indicate that k-mer similarity can correlate with the taxonomic rank of microbes. Using a relational database approach, we linked the implicated k-mers to annotated genomic regions (and thus functions), and defined core functions in specific phyletic groups and genera. We found that, in most phyla, highly conserved functions are often related to Amino acid metabolism and transport, and Energy production and conversion. Our findings indicate that AF phylogenomics can be used to infer reticulate relationships in a scalable manner and provide a new perspective on microbial biology and evolution.