bioRxiv
New Results

Fast and accurate long-range phasing and imputation in a UK Biobank cohort

Po-Ru Loh, Pier Francesco Palamara, Alkes L Price
doi: https://doi.org/10.1101/028282
Po-Ru Loh
1Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
2Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.
Pier Francesco Palamara
1Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
2Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.
Alkes L Price
1Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
2Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.
3Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

Abstract

Recent work has leveraged the unique genealogical structure and extensive genotyping (>30%) of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here, we develop a fast and accurate LRP method, Eagle, that extends this paradigm to outbred populations by harnessing long (>4 cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N=150K samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1–2 orders of magnitude faster than existing methods while achieving exquisite phasing accuracy (switch error rate ≈0.3%, corresponding to perfect phase at the scale of >10 Mb). Moreover, we observed that Eagle imputed masked genotypes with accuracy R²>0.75 down to a minor allele frequency of 0.1%. Compared to computationally tractable alternatives, Eagle attained large improvements in phasing and imputation accuracy at N=150K and smaller improvements at smaller sample sizes, illustrating the advantages that LRP-based imputation will yield as very large reference panels become available.
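
For readers unfamiliar with the two accuracy metrics quoted above, the following minimal Python sketch shows one common way they are defined. It is an illustration only, not the evaluation code used in this work: the function names `switch_error_rate` and `imputation_r2`, the 0/1 allele encoding, and the toy inputs are all assumptions made for the example. Switch error rate is taken here as the fraction of consecutive heterozygous-site pairs at which the inferred phase flips relative to the truth, and imputation R² as the squared Pearson correlation between true genotypes and imputed dosages.

```python
from typing import Sequence


def switch_error_rate(true_h1: Sequence[int],
                      true_h2: Sequence[int],
                      inf_h1: Sequence[int]) -> float:
    """Fraction of consecutive heterozygous-site pairs with a phase switch.

    Assumes (for illustration) that alleles are coded 0/1 and that the
    inferred genotypes agree with the truth, so only one inferred
    haplotype is needed.
    """
    # Only heterozygous sites carry phase information.
    het = [i for i in range(len(true_h1)) if true_h1[i] != true_h2[i]]
    if len(het) < 2:
        return 0.0
    # Orientation: does the inferred haplotype match true haplotype 1 here?
    orient = [inf_h1[i] == true_h1[i] for i in het]
    # A switch is a change of orientation between adjacent het sites.
    switches = sum(a != b for a, b in zip(orient, orient[1:]))
    return switches / (len(het) - 1)


def imputation_r2(true_genotypes: Sequence[float],
                  imputed_dosages: Sequence[float]) -> float:
    """Squared Pearson correlation between true genotypes and dosages."""
    n = len(true_genotypes)
    mx = sum(true_genotypes) / n
    my = sum(imputed_dosages) / n
    cov = sum((x - mx) * (y - my)
              for x, y in zip(true_genotypes, imputed_dosages))
    vx = sum((x - mx) ** 2 for x in true_genotypes)
    vy = sum((y - my) ** 2 for y in imputed_dosages)
    if vx == 0 or vy == 0:
        return 0.0  # correlation undefined for a constant vector
    return cov * cov / (vx * vy)


# Toy example: five heterozygous sites, one phase switch between them.
t1 = [0, 1, 0, 1, 0, 0]
t2 = [1, 0, 1, 0, 0, 1]
g1 = [0, 1, 1, 0, 0, 1]
print(switch_error_rate(t1, t2, g1))  # 0.25
```

Under this definition, a switch error rate of about 0.3% means roughly one phase flip per several hundred heterozygous sites.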

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted October 04, 2015.
Subject Area

  • Genetics