TY  - JOUR
T1  - <em>D</em><sub>GEN</sub>: A Test Statistic for Detection of General Introgression Scenarios
JF  - bioRxiv
DO  - 10.1101/348649
SP  - 348649
AU  - Ryan A. Leo Elworth
AU  - Chabrielle Allen
AU  - Travis Benedict
AU  - Peter Dulworth
AU  - Luay Nakhleh
Y1  - 2018/01/01
UR  - http://biorxiv.org/content/early/2018/06/17/348649.abstract
N2  - When two species hybridize, one outcome is the integration of genetic material from one species into the genome of the other, a process known as introgression. Detecting introgression in genomic data is a very important question in evolutionary biology. However, given that hybridization occurs between closely related species, a complicating factor for introgression detection is the presence of incomplete lineage sorting, or ILS. The D-statistic, famously referred to as the “ABBA-BABA” test, was proposed for introgression detection in the presence of ILS in data sets that consist of four genomes. More recently, DFOIL—a set of statistics—was introduced to extend the D-statistic to data sets of five genomes. The major contribution of this paper is demonstrating that the invariants underlying both the D-statistic and DFOIL can be derived automatically from the probability mass functions of gene tree topologies under the null species tree model and alternative phylogenetic network model. Computational requirements aside, this automatic derivation provides a way to generalize these statistics to data sets of any size and with any scenarios of introgression. We demonstrate the accuracy of the general statistic, which we call DGEN, on simulated data sets with varying rates of introgression, and apply it to an empirical data set of mosquito genomes. We have implemented DGEN and made it available, both as a graphical user interface tool and as a command-line tool, as part of the freely available, open-source software package ALPHA (https://github.com/chilleo/ALPHA).
ER  - 