RT Journal Article
SR Electronic
T1 Accounting for Gene Flow from Unsampled ‘Ghost’ Populations while Estimating Evolutionary History under the Isolation with Migration Model
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 733600
DO 10.1101/733600
A1 Arun Sethuraman
A1 Melissa Lynch
YR 2020
UL http://biorxiv.org/content/early/2020/05/23/733600.abstract
AB Unsampled or extinct ‘ghost’ populations leave signatures on the genomes of individuals from extant, sampled populations, especially if they have exchanged genes with them over evolutionary time. This gene flow from ‘ghost’ populations can introduce biases when estimating evolutionary history from genomic data, often leading to data misinterpretation and ambiguous results. Here we assess these biases while accounting, or not accounting, for gene flow from ‘ghost’ populations under the Isolation with Migration (IM) model. We perform extensive simulations under five scenarios, ranging from no gene flow (Scenario A) to extensive gene flow to and from an unsampled ‘ghost’ population (Scenarios B, C, D, and E). Estimates of evolutionary history across all scenarios A-E (effective population sizes, divergence times, and migration rates) indicate consistent (a) under-estimation of divergence times between sampled populations, (b) over-estimation of effective population sizes of sampled populations, and (c) under-estimation of migration rates between sampled populations, with increased gene flow from the unsampled ‘ghost’ population. Without accounting for an unsampled ‘ghost’, summary statistics like FST are under-estimated, and π is over-estimated, with increased gene flow from the ‘ghost’. To show this persistent issue in empirical data, we use a 355-locus dataset from African Hunter-Gatherer populations and discuss similar biases in estimating evolutionary history while not accounting for unsampled ‘ghosts’.
Considering the large effects of gene flow from these ‘ghosts’, we propose a multi-pronged approach to account for the presence of unsampled ‘ghost’ populations in population genomics studies to reduce erroneous inferences. Competing Interest Statement: The authors have declared no competing interest.