New Results

Approximate Likelihood Inference of Complex Population Histories and Recombination from Multiple Genomes

Champak R. Beeravolu, Michael J. Hickerson, Laurent A.F. Frantz, Konrad Lohse
doi: https://doi.org/10.1101/077958
Champak R. Beeravolu
1Biology Department, The City College of New York, New York, New York 10031, USA
Michael J. Hickerson
1Biology Department, The City College of New York, New York, New York 10031, USA
2The Graduate Center, The City University of New York, New York, New York 10016, USA
3Division of Invertebrate Zoology, American Museum of Natural History, New York, New York 10024, USA
Laurent A.F. Frantz
4Paleogenomics and Bio-Archaeology Research Network, Research Laboratory for Archeology and History of Art, University of Oxford, Oxford OX1 3QY, UK
Konrad Lohse
5Institute of Evolutionary Biology, University of Edinburgh, King’s Buildings, Edinburgh EH9 3FL, UK

Abstract

The wealth of information contained in genome-scale datasets has spurred the development of methods for inferring population histories with unprecedented resolution. Methods based on the Site Frequency Spectrum (SFS) are computationally efficient but discard information about linkage disequilibrium, while methods that make use of linkage and recombination are computationally more intensive and rely on approximations such as the Sequentially Markov Coalescent (SMC). To overcome these limitations, we introduce a novel Composite Likelihood (CL) framework which allows for the joint inference of arbitrarily complex population histories and the average genome-wide recombination rate from multiple genomes. We build upon an existing analytic approach that partitions the genome into blocks of equal (and arbitrary) size and summarizes the polymorphism and linkage information as blockwise counts of SFS types (bSFS). This statistic is a richer summary than the SFS because it retains information on the variation in genealogies contained in short-range linkage blocks across the genome. Our method, ABLE (Approximate Blockwise Likelihood Estimation), approximates the CL of arbitrary population histories via Monte Carlo simulations from the coalescent with recombination and overcomes limitations arising from analytical likelihood calculations. ABLE is first assessed by comparing it to expected analytic results for small samples and no intra-block recombination. The power of this approach is further illustrated using whole genome data from the two species of orangutan: we compare our inferences under a series of models involving divergence and various forms of continuous or pulsed admixture with previous analyses based on the SFS and the SMC. Finally, we explore the effects of sampling (different block lengths and number of individuals) and find that accurate inference of demography and recombination can be achieved with reasonable computational effort. Our approach also accommodates unphased data and fragmented assemblies, making it particularly suitable for model as well as non-model organisms.
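
To make the bSFS summary and the Monte Carlo approximation of the composite likelihood more concrete, the following is a minimal Python sketch and not the ABLE implementation. It assumes a single population, polarized biallelic sites, and a table of simulated block configurations supplied by some coalescent simulator. The function names (`block_sfs_counts`, `composite_log_likelihood`), the input layout, and the probability `floor` applied to configurations never seen in simulation are all illustrative assumptions that do not come from the paper.

```python
from collections import Counter
import math


def block_sfs_counts(positions, derived_counts, n_samples, block_len, seq_len):
    """Summarize variable sites as a bSFS-like table (hypothetical helper).

    positions      : 0-based site coordinates
    derived_counts : number of sampled lineages carrying the derived allele
    Returns a Counter mapping each blockwise configuration
    (c_1, ..., c_{n-1}) to the number of blocks that show it, where c_k is
    the number of sites in the block with derived allele count k.
    """
    n_blocks = seq_len // block_len  # any trailing partial block is ignored
    blocks = [[0] * (n_samples - 1) for _ in range(n_blocks)]
    for pos, k in zip(positions, derived_counts):
        b = pos // block_len
        # Skip sites outside the complete blocks and non-segregating sites.
        if b < n_blocks and 0 < k < n_samples:
            blocks[b][k - 1] += 1
    return Counter(tuple(b) for b in blocks)


def composite_log_likelihood(observed, simulated, floor=1e-12):
    """Monte Carlo estimate of the composite log-likelihood.

    Each configuration's probability is estimated by its relative frequency
    among simulated blocks. The floor for unseen configurations is an
    assumption of this sketch, used only to avoid log(0).
    """
    total_sim = sum(simulated.values())
    log_lik = 0.0
    for config, n_obs in observed.items():
        p_hat = simulated.get(config, 0) / total_sim
        log_lik += n_obs * math.log(max(p_hat, floor))
    return log_lik


# Toy usage: four sites, four sampled lineages, three blocks of 10 bp.
observed = block_sfs_counts([3, 7, 12, 25], [1, 2, 1, 3], 4, 10, 30)
# In practice `simulated` would be built from blocks simulated under a
# candidate demographic model; here the observed table is reused as a stand-in.
print(composite_log_likelihood(observed, observed))
```

In a setting like the one described, the simulated table would be regenerated for each candidate set of demographic and recombination parameters, and the parameters maximizing the estimated composite likelihood would be retained.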

Author Summary

In this paper, we make use of the distribution of blockwise SFS (bSFS) patterns for the inference of arbitrary population histories from multiple genome sequences. These can be whole genomes or fragmented data such as RADSeq. Our method notably allows for the simultaneous inference of demographic history and the genome-wide historical recombination rate. Additionally, we do not require phased genomes, as the bSFS approach does not distinguish the sampled lineage in which a mutation occurred. As with the Site Frequency Spectrum (SFS), we can also ignore outgroups by folding the bSFS. Our Approximate Blockwise Likelihood Estimation (ABLE) approach, implemented in C/C++ and taking advantage of parallel computing, is tailored for studying the population histories of model as well as non-model species.
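
Folding can be illustrated with a short Python sketch. It is an assumption-laden illustration for a single population rather than the authors' code: when no outgroup is available to polarize mutations, a site with derived count k among n lineages cannot be told apart from one with count n - k, so the two classes are merged. The helper names `fold_config` and `fold_bsfs` are hypothetical.

```python
from collections import Counter


def fold_config(config):
    """Fold one block configuration (c_1, ..., c_{n-1}).

    Classes k and n - k are merged; for even n the middle class k = n/2 is
    its own mirror image and is kept as is.
    """
    n = len(config) + 1  # number of sampled lineages
    folded = []
    for k in range(1, n // 2 + 1):
        mirror = n - k
        if mirror == k:
            folded.append(config[k - 1])
        else:
            folded.append(config[k - 1] + config[mirror - 1])
    return tuple(folded)


def fold_bsfs(bsfs):
    """Fold every configuration in a bSFS table, pooling block counts of
    configurations that become identical after folding."""
    out = Counter()
    for config, n_blocks in bsfs.items():
        out[fold_config(config)] += n_blocks
    return out


# Toy usage with four lineages: singletons and tripletons become one class.
unfolded = Counter({(1, 0, 0): 2, (0, 0, 1): 3, (1, 1, 0): 1})
print(fold_bsfs(unfolded))
```

Because each configuration depends only on per-site allele counts, the same tables can be built from unphased genotype calls, which is consistent with the statement that phasing is not required.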

Footnotes

  • * champak.br{at}gmail.com

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted September 28, 2016.
Subject Area

  • Evolutionary Biology