PT - JOURNAL ARTICLE
AU - David Porubsky
AU - Shilpa Garg
AU - Ashley D. Sanders
AU - Jan O. Korbel
AU - Victor Guryev
AU - Peter M. Lansdorp
AU - Tobias Marschall
TI - Dense and accurate whole-chromosome haplotyping of individual genomes
AID - 10.1101/126136
DP - 2017 Jan 01
TA - bioRxiv
PG - 126136
4099 - http://biorxiv.org/content/early/2017/04/10/126136.short
4100 - http://biorxiv.org/content/early/2017/04/10/126136.full
AB - The diploid nature of the genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. Many important biological phenomena such as compound heterozygosity and epistatic effects between enhancers and target genes, however, can only be studied when haplotype-resolved genomes are available. This lack of haplotype-level analyses can be explained by a dearth of methods to produce dense and accurate chromosome-length haplotypes at reasonable costs. Here we introduce an integrative phasing strategy that combines global, but sparse haplotypes obtained from strand-specific single cell sequencing (Strand-seq) with dense, yet local, haplotype information available through long-read or linked-read sequencing. Our experiments provide comprehensive guidance on favorable combinations of Strand-seq libraries and sequencing coverages to obtain complete and genome-wide haplotypes of a single individual genome (NA12878) at manageable costs. We were able to reliably assign > 95% of alleles to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different sequencing technologies represents an attractive solution to chart the unique genetic variation of diploid genomes.