PT - JOURNAL ARTICLE
AU - Ajith Harish
TI - What is an archaeon and are the Archaea really unique?
AID - 10.1101/256263
DP - 2018 Jan 01
TA - bioRxiv
PG - 256263
4099 - http://biorxiv.org/content/early/2018/03/23/256263.short
4100 - http://biorxiv.org/content/early/2018/03/23/256263.full
AB - The recognition of the group Archaea as a major branch of the Tree of Life (ToL) prompted a new view of the evolution of biodiversity. The genomic representation of archaeal biodiversity has since increased significantly. In addition, advances in phylogenetic modeling of multi-locus datasets have resolved many recalcitrant branches of the ToL. Despite these technical advances and an expanded taxonomic representation, two important aspects of the origins and evolution of the Archaea remain controversial, even as we celebrate the 40th anniversary of the monumental discovery. These issues concern (i) the uniqueness (monophyly) of the Archaea, and (ii) the evolutionary relationships of the Archaea to the Bacteria and the Eukarya; both are relevant to the deep structure of the ToL. Here, to explore the causes of this persistent ambiguity, I examine multiple datasets that support contradictory conclusions. The results indicate that the uncertainty arises primarily because standard datasets (the core genes datasets) contain too little information to resolve the conflicts reliably. These conflicts can be resolved efficiently by comparing patterns of variation in the distribution of functional genomic signatures, which are less diffuse than patterns of primary sequence variation. The relatively lower heterogeneity of these distribution patterns minimizes uncertainty and thereby supports statistically robust phylogenetic inferences, especially of the earliest divergences of life. This case study further highlights the limits of primary sequence data in resolving difficult phylogenetic problems and casts doubt on evolutionary inferences drawn solely from analyses of a small set of core genes.