deML: robust demultiplexing of Illumina sequences using a likelihood-based approach

Gabriel Renaud; Udo Stenzel; Tomislav Maricic; Victor Wiebe; Janet Kelso

doi:10.1093/bioinformatics/btu719

Bioinformatics. 2015 Mar 1;31(5):770-2. doi: 10.1093/bioinformatics/btu719. Epub 2014 Oct 30.

Authors

Gabriel Renaud¹, Udo Stenzel¹, Tomislav Maricic¹, Victor Wiebe¹, Janet Kelso¹

Affiliation

¹ Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Saxony D-04103, Germany.

Abstract

Motivation: Pooling multiple samples increases the efficiency and lowers the cost of DNA sequencing. One approach to multiplexing is to use short DNA indices to uniquely identify each sample. After sequencing, reads must be assigned in silico to the sample of origin, a process referred to as demultiplexing. Demultiplexing software typically identifies the sample of origin using a fixed number of mismatches between the read index and a reference index set. This approach may fail or misassign reads when the sequencing quality of the indices is poor.
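The fixed-mismatch strategy described above can be sketched as follows. This is a minimal illustration, not the code of any particular demultiplexer: the function names, the sample names, and the default threshold of one mismatch are all hypothetical choices made for the example.

```python
def hamming(a, b):
    """Count positions at which two equal-length index sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_by_mismatches(read_index, reference, max_mismatches=1):
    """Return the sample whose index is within max_mismatches of the read.

    reference maps sample name -> index sequence (hypothetical layout).
    Returns None when no sample, or more than one sample, qualifies.
    """
    hits = [name for name, seq in reference.items()
            if hamming(read_index, seq) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None
```

Because base qualities are ignored, a read index with two low-quality errors is discarded under a one-mismatch threshold even when only one sample is plausible, which is the failure mode the abstract refers to.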

Results: We introduce deML, a maximum likelihood algorithm that demultiplexes Illumina sequences. deML computes the likelihood of an observed index sequence being derived from a specified sample. For each read, it generates a quality score that reflects the probability of the assignment being correct. Using these quality scores, even very problematic datasets can be demultiplexed and an error threshold can be set.
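The idea of a likelihood-based assignment with a per-read quality score can be illustrated with a simplified sketch. It is not deML's actual implementation, and the abstract does not give the formulas. The sketch assumes that each index base carries a Phred quality Q with error probability 10^(-Q/10), that an erroneous base is equally likely to be any of the other three nucleotides, that all samples are equally likely a priori, and that the assignment score is reported on a Phred-like scale capped at 60. All function names and the cap are hypothetical.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality to an error probability.

    The value is capped at 0.75 (a uniformly random base), an assumption
    made here so that very low qualities never yield log(0).
    """
    return min(10 ** (-q / 10), 0.75)


def index_log_likelihood(observed, quals, candidate):
    """Log-likelihood of the observed index given a candidate sample index."""
    if not (len(observed) == len(quals) == len(candidate)):
        raise ValueError("index, qualities and candidate must match in length")
    ll = 0.0
    for base, q, ref in zip(observed, quals, candidate):
        e = phred_to_error(q)
        # A matching base was read correctly; a mismatch is one of 3 errors.
        ll += math.log(1 - e) if base == ref else math.log(e / 3)
    return ll


def assign_by_likelihood(observed, quals, reference, cap=60.0):
    """Return (best sample, Phred-scaled confidence in that assignment)."""
    lls = {name: index_log_likelihood(observed, quals, seq)
           for name, seq in reference.items()}
    best = max(lls, key=lls.get)
    top = lls[best]
    # Normalise relative to the best likelihood to avoid underflow.
    total = sum(math.exp(v - top) for v in lls.values())
    others = sum(math.exp(v - top) for n, v in lls.items() if n != best)
    p_wrong = others / total
    score = cap if p_wrong <= 0 else min(cap, -10 * math.log10(p_wrong))
    return best, score
```

Under these assumptions, a read would be kept only if its score exceeds a chosen cutoff, which is one way an error threshold could be applied.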

Availability and implementation: deML is freely available for use under the GPL (http://bioinf.eva.mpg.de/deml/).

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Humans
Likelihood Functions
Sequence Analysis, DNA / instrumentation*
Sequence Analysis, DNA / methods*
Software*