RT Journal Article
SR Electronic
T1 Training Genotype Callers with Neural Networks
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 097469
DO 10.1101/097469
A1 Remi Torracinta
A1 Fabien Campagne
YR 2016
UL http://biorxiv.org/content/early/2016/12/30/097469.abstract
AB We present an open source software toolkit for training deep learning models to call genotypes in high-throughput sequencing data. The software supports SAM, BAM, CRAM and Goby alignments and the training of models for a variety of experimental assays and analysis protocols. We evaluate this software on the Illumina Platinum whole genome datasets and find that a deep learning model trained on 80% of the genome achieves an accuracy of 0.986 on variants (genotype concordance) when trained with 10% of the data from a genome. The software is distributed at https://github.com/CampagneLaboratory/variationanalysis. The software makes it possible to train genotype calling models on consumer hardware with CPUs or GPUs. It will enable individual investigators and small laboratories to train and evaluate their own models and to make open source contributions. We welcome contributions to extend this early prototype or evaluate its performance on other gold standard datasets.