%0 Journal Article
%A Richard W. Tourdot
%A Cheng-Zhong Zhang
%T Whole Chromosome Haplotype Phasing from Long-Range Sequencing
%D 2019
%R 10.1101/629337
%J bioRxiv
%P 629337
%X Haplotype phase represents the collective genetic variation between homologous chromosomes and is an essential feature of polyploid genomes. Determining the haplotype phase requires knowledge of both the genotypes at variant sites and their linkage across each homologous chromosome. Although short-read sequencing can produce accurate genotype information, it cannot resolve linkage between genotypes due to the short size (≲1 kb) of sequencing fragments. Long-read and long-range sequencing technologies can reveal linkage information across a wide range of genomic lengths (10 kb to 100 Mb), but such information is often sparse and contaminated with different sources of errors. The extent to which long-range sequencing can produce accurate long-range haplotype information remains unknown. Here we describe a general computational framework for inferring haplotype phase and assessing phasing accuracy from long-range sequencing data using a one-dimensional spin model. Building on this model, we demonstrate a two-tier phasing strategy that enables complete whole-chromosome phasing of diploid genomes by combining 60× linked-reads sequencing and 60× Hi-C sequencing. The computationally inferred haplotypes from long-range sequencing show high completeness (>95%) and accuracy (~99%) when compared to haplotypes directly determined from sequencing of single chromosomes. Our results provide a scalable solution to generating completely phased genomes from bulk sequencing and enable haplotype-resolved genome analysis at large.
%U https://www.biorxiv.org/content/biorxiv/early/2019/05/07/629337.full.pdf