PT - JOURNAL ARTICLE
AU - Olm, Matthew R.
AU - Brown, Christopher T.
AU - Brooks, Brandon
AU - Banfield, Jillian F.
TI - dRep: A tool for fast and accurate genome de-replication that enables tracking of microbial genotypes and improved genome recovery from metagenomes
AID - 10.1101/108142
DP - 2017 Jan 01
TA - bioRxiv
PG - 108142
4099 - http://biorxiv.org/content/early/2017/02/13/108142.short
4100 - http://biorxiv.org/content/early/2017/02/13/108142.full
AB - The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that sequentially applies a fast, inaccurate estimation of genome distance and a slow but accurate measure of average nucleotide identity to reduce the computational time for pair-wise genome set comparisons by orders of magnitude. We demonstrate its use in a study where we separately assembled each metagenome from time series datasets. Groups of essentially identical genomes were identified with dRep, and the best genome from each set was selected. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using the typical co-assembly method. Documentation is available at http://drep.readthedocs.io/en/master/ and source code is available at https://github.com/MrOlm/drep.
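
The two-stage idea in the abstract (a cheap, approximate distance to form coarse groups, then an expensive, accurate comparison only within each group, then picking the best genome per group) can be illustrated with a minimal sketch. This is not dRep's code or API. The abstract does not say which algorithms are used, so the k-mer Jaccard distance and the position-by-position identity below are toy stand-ins, and every function name, cutoff, and quality score is a hypothetical chosen for illustration.

```python
from itertools import combinations


def kmers(seq, k=4):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fast_distance(a, b, k=4):
    """Toy stand-in for the fast, approximate step: 1 - Jaccard of k-mer sets."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def slow_identity(a, b):
    """Toy stand-in for the slow, accurate step: fraction of matching positions.

    Assumes the two sequences are already aligned position by position.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / longest


def cluster(names, links):
    """Single-linkage clustering of names given linked pairs (union-find)."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


def dereplicate(genomes, quality, fast_cutoff=0.6, identity_cutoff=0.95):
    """Return (best genome per cluster, number of slow comparisons made)."""
    names = list(genomes)
    # Stage 1: cheap all-vs-all comparison to form coarse primary clusters.
    coarse = [(a, b) for a, b in combinations(names, 2)
              if fast_distance(genomes[a], genomes[b]) <= fast_cutoff]
    winners, slow_calls = [], 0
    for group in cluster(names, coarse):
        # Stage 2: expensive comparison only inside each primary cluster.
        fine = []
        for a, b in combinations(group, 2):
            slow_calls += 1
            if slow_identity(genomes[a], genomes[b]) >= identity_cutoff:
                fine.append((a, b))
        # Keep the highest-scoring genome from each secondary cluster.
        for members in cluster(group, fine):
            winners.append(max(members, key=lambda n: quality[n]))
    return sorted(winners), slow_calls


# Hypothetical toy data: two genome families, each with a near-identical copy.
g1 = "ACGT" * 10
g3 = "GGCCTTAA" * 5
genomes = {
    "g1": g1,
    "g2": g1[:20] + "T" + g1[21:],  # one substitution relative to g1
    "g3": g3,
    "g4": g3[:20] + "C" + g3[21:],  # one substitution relative to g3
}
quality = {"g1": 0.9, "g2": 0.7, "g3": 0.5, "g4": 0.8}  # made-up scores

winners, slow_calls = dereplicate(genomes, quality)
print(winners, slow_calls)
```

In this toy run the expensive comparison is made only for pairs that share a primary cluster, rather than for all six possible pairs, which is the source of the speed-up the abstract describes. On real data the savings depend on how well the fast step separates unrelated genomes.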