PT - JOURNAL ARTICLE
AU - Olivier Delaneau
AU - Jean-François Zagury
AU - Matthew Robinson
AU - Jonathan Marchini
AU - Emmanouil Dermitzakis
TI - Integrative haplotype estimation with sub-linear complexity
AID - 10.1101/493403
DP - 2018 Jan 01
TA - bioRxiv
PG - 493403
4099 - http://biorxiv.org/content/early/2018/12/13/493403.short
4100 - http://biorxiv.org/content/early/2018/12/13/493403.full
AB - The number of human genomes being genotyped or sequenced is increasing exponentially, and efficient haplotype estimation methods able to handle this amount of data are now required. Here, we present a new method, SHAPEIT4, which substantially improves upon existing methods for processing large genotype and high-coverage sequencing datasets. It notably exhibits sub-linear scaling with sample size, provides highly accurate haplotypes, and allows the integration of external phasing information such as large reference panels of haplotypes, collections of pre-phased variants, and long sequencing reads. We provide SHAPEIT4 as open-source software at https://odelaneau.github.io/shapeit4/ and demonstrate its performance in terms of accuracy and running time on two gold-standard datasets: the UK Biobank data and the Genome in a Bottle data.