TY  - JOUR
T1  - Pheniqs: Fast and flexible quality-aware sequence demultiplexing
JF  - bioRxiv
DO  - 10.1101/128512
SP  - 128512
AU  - Lior Galanti
AU  - Dennis Shasha
AU  - Kristin C. Gunsalus
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/04/19/128512.abstract
N2  - Motivation: Output from high throughput sequencing instruments often exceeds what is necessary to assay a single sample. To better utilize this capacity, multiple samples are independently tagged with a unique “barcode” sequence and are then pooled, or “multiplexed”, and sequenced together. Classifying, or “demultiplexing”, the reads involves decoding the barcode sequence. Although instruments estimate the probability of incorrectly calling each nucleobase, available demultiplexers do not consult those estimates or report classification error probabilities. Results: We present Pheniqs, a fast and flexible sequence demultiplexer and quality analyzer. In addition to providing an efficient implementation of the widespread minimum distance decoder, Pheniqs introduces a novel Phred-adjusted maximum likelihood decoder that consults base calling quality scores and estimates the probability of a barcode decoding error. Setting an upper bound on the permissible error provides an intuitive way to control demultiplexing confidence and directly influence precision and recall. Pheniqs supports FASTQ and multiple Sequence Alignment/Map formats and uses auxiliary SAM tags to report both library classification and demultiplexing error probability. Evaluation on both real and semi-synthetic data indicates that Pheniqs is faster than existing demultiplexers, substantially when demultiplexing longer reads, and achieves greater accuracy by correctly reflecting quality measurements. Availability and Implementation: Implemented in multithreaded C++ and available under the terms of the AGPL-3.0 license agreement at http://github.com/biosails/pheniqs. Manual and examples are available at http://biosails.github.io/pheniqs.
ER  - 