PT - JOURNAL ARTICLE
AU - Alexandre M. Harris
AU - Michael DeGiorgio
TI - Identifying and classifying shared selective sweeps from multilocus data
AID - 10.1101/446005
DP - 2019 Jan 01
TA - bioRxiv
PG - 446005
4099 - http://biorxiv.org/content/early/2019/04/05/446005.short
4100 - http://biorxiv.org/content/early/2019/04/05/446005.full
AB - Positive selection causes beneficial alleles to rise to high frequency, resulting in a selective sweep of the diversity surrounding the selected sites. Accordingly, the signature of a selective sweep in an ancestral population may still remain in its descendants. Identifying signatures of selection in the ancestor that are shared among its descendants is important to contextualize the timing of a sweep, but few methods exist for this purpose. We introduce the statistic SS-H12, which can identify genomic regions under shared positive selection across populations. It is based on the theory of the expected haplotype homozygosity statistic H12, which detects recent hard and soft sweeps from the presence of high-frequency haplotypes. SS-H12 is distinct from other statistics that detect shared sweeps because it requires a minimum of only two populations, and it properly identifies and differentiates between independent convergent sweeps and true ancestral sweeps, with high power and robustness to a variety of demographic models. Furthermore, we can apply SS-H12 in conjunction with the ratio of a different set of expected haplotype homozygosity statistics to further classify identified shared sweeps as hard or soft. Finally, we identified both previously reported and novel shared sweep candidates from whole-genome sequences of global human populations. Previously reported candidates include the well-characterized ancestral sweeps at LCT and SLC24A5 in Indo-European populations, as well as GPHN worldwide.
Novel candidates include an ancestral sweep in sub-Saharan African populations at RGS18, a gene involved in regulating the platelet response and implicated in sudden cardiac death, and a convergent sweep at C2CD5 between European and East Asian populations that may explain their different insulin responses.

Introduction
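As background for the haplotype homozygosity statistics named in the abstract, the following is a minimal sketch of the single-population statistics H1, H12, and H2/H1 as they are commonly defined in the prior literature on H12. It is not the SS-H12 statistic itself, whose definition is not given in this record. The function names and the toy haplotype labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Return haplotype frequencies sorted from most to least common."""
    if not haplotypes:
        raise ValueError("need at least one sampled haplotype")
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return sorted((c / n for c in counts.values()), reverse=True)


def h_statistics(haplotypes):
    """Compute H1, H12, H2 and H2/H1 for one analysis window.

    H1  = sum of squared haplotype frequencies
    H12 = H1 with the two most frequent haplotypes pooled into one
    H2  = H1 without the contribution of the most frequent haplotype
    """
    freqs = haplotype_frequencies(haplotypes)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])
    h2 = h1 - p1 * p1
    return {"H1": h1, "H12": h12, "H2": h2, "H2/H1": h2 / h1}


# Toy windows (hypothetical data): each letter stands for a distinct haplotype.
hard_like = ["A"] * 8 + ["B", "C"]            # one dominant haplotype
soft_like = ["A"] * 5 + ["B"] * 4 + ["C"]     # two common haplotypes

hard_stats = h_statistics(hard_like)
soft_stats = h_statistics(soft_like)
print(hard_stats)
print(soft_stats)
```

In these two toy windows H12 is the same, while H2/H1 is small when a single haplotype dominates and larger when two haplotypes share high frequency. This is the intuition behind using a ratio of homozygosity statistics to separate hard from soft sweeps.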