Abstract
The common ancestry of life is supported by an enormous body of evidence and is universally accepted within the scientific community. However, some potential sources of data that can be used to test the thesis of common ancestry have not yet been formally analyzed.
We developed a new test of common ancestry based on nucleotide sequences at amino acid invariant sites in aligned homologous protein-coding genes. We reasoned that, because nucleotide variation at amino acid invariant sites is selectively neutral and thus unlikely to be due to convergent evolution, the observation that an amino acid is consistently encoded by the same codon in different species could provide strong evidence of their common ancestry. Our method uses the observed variation in codon sequences at amino acid invariant sites as a test statistic and compares it with the variation expected under three different models of codon frequency, each assuming the alternative hypothesis of separate ancestry. We also examine hierarchical structure in the nucleotide sequences at amino acid invariant sites and quantify agreement between trees generated from amino acid sequences and those inferred from the nucleotide sequences at amino acid invariant sites.
When these tests are applied to the primate families as a test case, we find that observed nucleotide variation at amino acid invariant sites is considerably lower than that predicted by any model of codon frequency under separate ancestry. Phylogenetic trees generated from nucleotide data at amino acid invariant sites agree with those generated from protein-coding data, and there is far more hierarchical structure in the data from amino acid invariant sites than would be expected under separate ancestry.
We definitively reject the separate ancestry of the primate families and demonstrate that our tests can be applied to any group of interest to test common ancestry.
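To illustrate the logic of the first test described above, the following is a minimal sketch and not the authors' implementation. It assumes an in-frame, ungapped codon alignment and the standard genetic code. The statistic used here (the mean number of distinct codons per amino acid invariant site) and the null model (each taxon draws its codon independently and uniformly from the synonymous codons) are illustrative assumptions standing in for the test statistic and the three codon frequency models, which this abstract does not specify. All function names are hypothetical.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Map each amino acid to the list of codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def invariant_sites(seqs):
    """Return (amino_acid, codons) for sites where every taxon encodes the same
    amino acid. Stop codons, unrecognized codons, and amino acids with a single
    codon (M, W) are skipped because they carry no synonymous variation."""
    sites = []
    for i in range(0, len(seqs[0]) - 2, 3):
        codons = [s[i:i + 3].upper() for s in seqs]
        aas = {CODON_TABLE.get(c) for c in codons}
        if len(aas) == 1:
            aa = aas.pop()
            if aa not in (None, "*") and len(SYNONYMS[aa]) > 1:
                sites.append((aa, codons))
    return sites


def mean_distinct_codons(site_codons):
    """Illustrative statistic: average number of distinct codons per site."""
    return sum(len(set(c)) for c in site_codons) / len(site_codons)


def separate_ancestry_p_value(seqs, n_sim=999, seed=0):
    """Monte Carlo p-value for observing codon variation this low under the
    assumed null of independent, uniform synonymous codon choice per taxon."""
    rng = random.Random(seed)
    sites = invariant_sites(seqs)
    if not sites:
        raise ValueError("no informative amino acid invariant sites")
    observed = mean_distinct_codons([codons for _, codons in sites])
    n_taxa = len(seqs)
    hits = 0
    for _ in range(n_sim):
        simulated = [
            [rng.choice(SYNONYMS[aa]) for _ in range(n_taxa)] for aa, _ in sites
        ]
        if mean_distinct_codons(simulated) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_sim + 1)
```

Calling `separate_ancestry_p_value` on a list of aligned coding sequences, one per taxon, returns the observed statistic and the simulated p-value. A small p-value means the taxa share codons more often than independent codon choice would predict. A real analysis would replace the uniform null with empirically estimated codon frequencies.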