PT - JOURNAL ARTICLE
AU - Brandon Legried
AU - Erin K. Molloy
AU - Tandy Warnow
AU - Sébastien Roch
TI - Polynomial-Time Statistical Estimation of Species Trees under Gene Duplication and Loss
AID - 10.1101/821439
DP - 2020 Jan 01
TA - bioRxiv
PG - 821439
4099 - http://biorxiv.org/content/early/2020/01/21/821439.short
4100 - http://biorxiv.org/content/early/2020/01/21/821439.full
AB - Phylogenomics, the estimation of species trees from multilocus datasets, is a common step in many biological studies. However, this estimation is challenged by the fact that genes can evolve under processes, including incomplete lineage sorting (ILS) and gene duplication and loss (GDL), that make their trees different from the species tree. In this paper, we address the challenge of estimating the species tree under GDL. We show that species trees are identifiable under a standard stochastic model for GDL, and that the polynomial-time algorithm ASTRAL-multi, a recent development in the ASTRAL suite of methods, is statistically consistent under this GDL model. We also provide a simulation study evaluating ASTRAL-multi for species tree estimation under GDL. All scripts and datasets used in this study are available on the Illinois Data Bank: https://doi.org/10.13012/B2IDB-2626814_V1.