RT Journal Article
SR Electronic
T1 Rapid Genotype Refinement for Whole-Genome Sequencing Data using Multi-Variate Normal Distributions
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 031484
DO 10.1101/031484
A1 Rudy Arthur
A1 Jared O’Connell
A1 Ole Schulz-Trieglaff
A1 Anthony J. Cox
YR 2015
UL http://biorxiv.org/content/early/2015/11/12/031484.abstract
AB Whole-genome low-coverage sequencing has been combined with linkage-disequilibrium (LD) based genotype refinement to accurately and cost-effectively infer genotypes in large cohorts of individuals. Most genotype refinement methods are based on hidden Markov models, which are accurate but computationally expensive. We introduce an algorithm that models LD using a simple multivariate Gaussian distribution. The key feature of our algorithm is its speed: it is hundreds of times faster than other methods on the same data set, and its scaling behaviour is linear in the number of samples. We demonstrate the performance of the method on both low-coverage and high-coverage samples. Availability: The source code is available at https://github.com/sequencing/marvin Contact: rarthur{at}illumina.com