RT Journal Article
SR Electronic
T1 Improving orthologous signal and model fit in datasets addressing the root of the animal phylogeny
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.11.21.517274
DO 10.1101/2022.11.21.517274
A1 Charley GP McCarthy
A1 Peter O Mulhair
A1 Karen Siu-Ting
A1 Christopher J Creevey
A1 Mary J O’Connell
YR 2022
UL http://biorxiv.org/content/early/2022/11/22/2022.11.21.517274.abstract
AB There is conflicting evidence as to whether Porifera (sponges) or Ctenophora (comb jellies) comprise the root of the animal phylogeny. Support for either a Porifera-sister or Ctenophora-sister tree has been extensively examined in the context of model selection, taxon sampling and outgroup selection. The influence of dataset construction is comparatively understudied. We re-examine five animal phylogeny datasets that have supported either root hypothesis using an approach designed to enrich orthologous signal in phylogenomic datasets. We find that many component orthogroups in animal datasets fail to recover major animal lineages as monophyletic with the exception of Ctenophora, regardless of the supported root. Enriching these datasets to retain orthogroups recovering ≥3 major lineages reduces dataset size by up to 50% while retaining underlying phylogenetic information and taxon sampling. Site-heterogeneous phylogenomic analysis of these enriched datasets recovers both Porifera-sister and Ctenophora-sister positions, even with additional constraints on outgroup sampling. Two datasets which previously supported Ctenophora-sister support Porifera-sister upon enrichment. All enriched datasets display improved model fitness under posterior predictive analysis. While not conclusively rooting animals at either Porifera or Ctenophora, our results indicate that dataset size and construction as well as model fit influence animal root inference. Competing Interest Statement: The authors have declared no competing interest.