bioRxiv
New Results

Ancestral Haplotype Reconstruction in Endogamous Populations using Identity-By-Descent

Kelly Finke, Michael Kourakos, Gabriela Brown, Yuval B. Simons, Alejandro A. Schäffer, Rachel L. Kember, Maja Bućan, Sara Mathieson
doi: https://doi.org/10.1101/2020.01.15.908459
Kelly Finke
Department of Computer Science, Swarthmore College, Swarthmore, PA; Department of Biology, Swarthmore College, Swarthmore, PA
Michael Kourakos
Department of Computer Science, Swarthmore College, Swarthmore, PA
Gabriela Brown
Department of Computer Science, Swarthmore College, Swarthmore, PA
Yuval B. Simons
Department of Genetics, Stanford University, Stanford, CA
Alejandro A. Schäffer
Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD
Rachel L. Kember
Department of Genetics, University of Pennsylvania, Philadelphia, PA
Maja Bućan
Department of Genetics, University of Pennsylvania, Philadelphia, PA
Sara Mathieson
Department of Computer Science, Haverford College, Haverford, PA
  • For correspondence: smathieson@haverford.edu

ABSTRACT

In this work we develop a novel algorithm for reconstructing the genomes of ancestral individuals, given genotype or sequence data from contemporary individuals and an extended pedigree of family relationships. A pedigree with complete genomes for every individual enables the study of allele frequency dynamics and haplotype diversity across generations, including deviations from neutrality such as transmission distortion. When studying heritable diseases, ancestral haplotypes can be used to augment genome-wide association studies or compute polygenic risk scores for the reconstructed individuals.

The building blocks of our reconstruction algorithm are segments of Identity-By-Descent (IBD) shared between two or more genotyped individuals. The method alternates between finding a source for each IBD segment and assembling IBD segments placed within each ancestral individual. After each iteration we perform conflict resolution to remove IBD segments that do not align with well-reconstructed haplotypes and upweight the probability that these segments should be placed in other individuals. We repeat this process until we are no longer successfully reconstructing additional ancestral haplotypes. Unlike previous approaches, our method is able to accommodate complex pedigree structures with hundreds of individuals genotyped at millions of SNPs.
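The alternation described above can be illustrated with a minimal toy sketch. This is not the thread implementation: the data layout (`carriers`, `start`, `end`, `hap`), the function names, the rule that a diploid ancestor may hold at most two distinct overlapping haplotypes, and the halving of a rejected placement's weight are all assumptions made here for illustration only.

```python
from collections import defaultdict

def ancestors(pedigree, person):
    """All ancestors of `person`; `pedigree` maps child -> (father, mother)."""
    found, stack = set(), [person]
    while stack:
        for parent in pedigree.get(stack.pop(), ()):
            if parent is not None and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def candidate_sources(pedigree, carriers):
    """Ancestors shared by every genotyped carrier of an IBD segment."""
    return set.intersection(*(ancestors(pedigree, c) for c in carriers))

def overlaps(a, b):
    return a["start"] < b["end"] and b["start"] < a["end"]

def creates_conflict(segments, kept, new):
    """True if adding `new` would give one ancestor three distinct
    haplotypes at a single position (toy diploid constraint)."""
    s = segments[new]
    near = [segments[k] for k in kept
            if overlaps(segments[k], s) and segments[k]["hap"] != s["hap"]]
    return any(overlaps(a, b) and a["hap"] != b["hap"]
               for i, a in enumerate(near) for b in near[i + 1:])

def reconstruct(pedigree, segments, max_iters=10):
    """Alternate source finding, assembly, and conflict resolution."""
    # One weight per (segment, candidate ancestor); all start equal.
    weights = [{a: 1.0 for a in candidate_sources(pedigree, s["carriers"])}
               for s in segments]
    placement = {}
    for _ in range(max_iters):
        # Step 1: pick the highest-weight source (ties broken by name).
        chosen = {i: max(sorted(w), key=w.get)
                  for i, w in enumerate(weights) if w}
        # Step 2: assemble the segments placed within each ancestor.
        by_ancestor = defaultdict(list)
        for i, anc in chosen.items():
            by_ancestor[anc].append(i)
        # Step 3: remove conflicting segments and downweight that source,
        # which upweights the remaining candidates in relative terms.
        for anc, idxs in by_ancestor.items():
            kept = []
            for i in idxs:
                if creates_conflict(segments, kept, i):
                    weights[i][anc] *= 0.5  # assumed penalty factor
                    del chosen[i]
                else:
                    kept.append(i)
        if chosen == placement:  # nothing new was placed: stop
            break
        placement = chosen
    return placement

# Hypothetical pedigree: x and y are first cousins with grandparents G1, G2.
pedigree = {"P1": ("G1", "G2"), "P2": ("G1", "G2"),
            "x": ("P1", "M1"), "y": ("M2", "P2")}
segments = [
    {"carriers": ["x", "y"], "start": 0, "end": 100, "hap": "h0"},
    {"carriers": ["x", "y"], "start": 50, "end": 150, "hap": "h1"},
    {"carriers": ["x", "y"], "start": 60, "end": 120, "hap": "h2"},
]
print(reconstruct(pedigree, segments))
```

In this toy example all three segments are first placed in G1. The third would give G1 three distinct overlapping haplotypes, so it is removed and, in the next iteration, placed in G2 instead.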

We apply our method to an Old Order Amish pedigree from Lancaster, Pennsylvania, whose founders came to the United States from Europe during the early 18th century. The pedigree includes 1338 individuals from the past 10 generations, 394 of whom have genotype data. The motivation for reconstruction is to understand the genetic basis of diseases segregating in the family by tracking haplotype transmission over time. Using our algorithm, thread, we are able to reconstruct an average of 230 ancestral individuals per autosome. Although thread was developed for endogamous populations, it can be applied to any extensive pedigree in which the recent generations are genotyped. We anticipate that this type of practical ancestral reconstruction will become more common and necessary for understanding rare and complex heritable diseases in extended families.

Footnotes

  • https://github.com/mathiesonlab/thread

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 16, 2020.
Subject Area

  • Bioinformatics