RT Journal Article
SR Electronic
T1 SNP Calling via Read Colored de Bruijn Graphs
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 156174
DO 10.1101/156174
A1 Bahar Alipanahi
A1 Martin D. Muggli
A1 Musa Jundi
A1 Noelle Noyes
A1 Christina Boucher
YR 2018
UL http://biorxiv.org/content/early/2018/04/16/156174.abstract
AB Motivation: The resistome, which refers to all of the antimicrobial resistance (AMR) genes in pathogenic and non-pathogenic bacteria, is frequently studied using shotgun metagenomic data [14, 47]. Unfortunately, few existing methods are able to identify single nucleotide polymorphisms (SNPs) within metagenomic data, and to the best of our knowledge, no methods exist to detect SNPs within AMR genes within the resistome. The ability to identify SNPs in AMR genes across the resistome would represent a significant advance in understanding the dissemination and evolution of AMR, as SNP identification would enable “fingerprinting” of the resistome, which could then be used to track AMR dynamics across various settings and/or time periods. Results: We present LueVari, a reference-free SNP caller based on the read colored de Bruijn graph, an extension of the traditional de Bruijn graph that allows repeated regions longer than the k-mer length and shorter than the read length to be identified unambiguously. We demonstrate that LueVari was the only method with reliable sensitivity (between 73% and 98%), whereas the performance of competing methods varied widely. Furthermore, we show that LueVari constructs sequences containing the variation that span 93% of the gene in datasets with lower coverage (15X), and 100% of the gene in datasets with higher coverage (30X). Availability: Code and datasets are publicly available at https://github.com/baharpan/cosmo/tree/LueVari.