PT - JOURNAL ARTICLE
AU - Markus Lux
AU - Barbara Hammer
AU - Alexander Sczyrba
TI - Automated Contamination Detection in Single-Cell Sequencing
AID - 10.1101/020859
DP - 2015 Jan 01
TA - bioRxiv
PG - 020859
4099 - http://biorxiv.org/content/early/2015/06/15/020859.short
4100 - http://biorxiv.org/content/early/2015/06/15/020859.full
AB - Novel methods for the sequencing of single-cell DNA offer tremendous opportunities. However, many techniques are still in their infancy and a major obstacle is given by sample contamination with foreign DNA. In this contribution, we present a pipeline that allows for fast, automated detection of contaminated samples by the use of modern machine learning methods. First, a vectorial representation of the genomic data is obtained using oligonucleotide signatures. Using non-linear subspace projections, data is transformed to be suitable for automatic clustering. This allows for the detection of one vs. more genomes (clusters) in a sample. As clustering is an ill-posed problem, the pipeline relies on a thorough choice of all involved methods and parameters. We give an overview of the problem and evaluate techniques suitable for this task.
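
The first pipeline step named in the abstract, a vectorial representation built from oligonucleotide signatures, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `kmer_signature`, the choice of k = 4 (tetranucleotides), the normalization to relative frequencies, and the skipping of windows that contain ambiguous bases are all assumptions made for illustration. The non-linear projection and clustering steps are not shown.

```python
from itertools import product


def kmer_signature(sequence, k=4):
    """Return the relative k-mer frequency vector of a DNA sequence.

    Hypothetical helper: k=4 and the normalization are illustrative
    assumptions, not parameters taken from the cited article.
    """
    # Fixed, lexicographically ordered list of all 4**k possible k-mers.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Windows containing N or other ambiguous symbols are skipped.
        if window in counts:
            counts[window] += 1

    total = sum(counts.values())
    if total == 0:
        # No valid window (sequence too short or fully ambiguous).
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Usage: one signature vector per contig of a (made-up) sample.
contigs = ["ACGTACGTTGCA", "GGGCCCGGGCCC"]
signatures = [kmer_signature(contig) for contig in contigs]
```

Each contig becomes a point in a 256-dimensional space. Under the approach outlined in the abstract, such points would then be projected and clustered, with more than one cluster hinting at more than one genome in the sample.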