Genome-scale approaches to resolving incongruence in molecular phylogenies

Abstract

One of the most pervasive challenges in molecular phylogenetics is the incongruence between phylogenies obtained using different data sets, such as individual genes. To systematically investigate the degree of incongruence, and potential methods for resolving it, we screened the genome sequences of eight yeast species and selected 106 widely distributed orthologous genes for phylogenetic analyses, singly and by concatenation. Our results suggest that data sets consisting of single or a small number of concatenated genes have a significant probability of supporting conflicting topologies. By contrast, analyses of the entire data set of concatenated genes yielded a single, fully resolved species tree with maximum support. Comparable results were obtained with a concatenation of a minimum of 20 genes; substantially more genes than commonly used but a small fraction of any genome. These results have important implications for resolving branches of the tree of life.
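As a rough illustration of what "concatenation" means here, the sketch below joins per-gene alignments end to end for each species and draws a random subset of genes to concatenate. It is a minimal sketch, not the authors' pipeline: the function names (`concatenate`, `random_gene_subset`), the dictionary layout, and the toy sequences are all assumptions made for this example, and tree inference itself is left out.

```python
import random

def concatenate(gene_alignments, genes):
    """Join the aligned sequences of the chosen genes, species by species.

    gene_alignments: {gene_name: {species_name: aligned_sequence}} (assumed layout)
    genes: ordered list of gene names to concatenate
    """
    species = sorted(gene_alignments[genes[0]])
    supermatrix = {sp: "" for sp in species}
    for gene in genes:
        aln = gene_alignments[gene]
        # Every gene must be present in exactly the same set of species.
        if sorted(aln) != species:
            raise ValueError(f"{gene}: species set differs from the first gene")
        # Within one gene, all aligned sequences must have the same length.
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        for sp in species:
            supermatrix[sp] += aln[sp]
    return supermatrix

def random_gene_subset(gene_names, k, rng):
    """Draw k distinct genes at random, without replacement."""
    return rng.sample(sorted(gene_names), k)

# Toy data, invented for illustration only (three species, three genes).
toy = {
    "g1": {"A": "ACGT", "B": "ACGA", "C": "ACTA"},
    "g2": {"A": "GG", "B": "GC", "C": "CC"},
    "g3": {"A": "TTT", "B": "TTA", "C": "TAA"},
}

subset = random_gene_subset(toy, 2, random.Random(0))
matrix = concatenate(toy, subset)
```

In a study of this kind, a supermatrix like `matrix` would then be passed to a tree-building program, and the draw would be repeated for different subset sizes to see how support changes with the number of genes.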

Figure 1: Single-gene data sets generate multiple, robustly supported alternative topologies.
Figure 2: The distribution of bootstrap values for the eight prevalent branches recovered from 106 single-gene analyses highlights the pervasive conflict among single-gene analyses.
Figure 3: Extensive incongruence between trees derived from the 106 individual-gene data sets.
Figure 4: Phylogenetic analyses of the concatenated data set composed of 106 genes yield maximum support for a single tree, irrespective of method and type of character evaluated.
Figure 5: A minimum of 20 genes is required to recover >95% bootstrap values for each branch of the species tree.
Figure 6: Concatenation of genes supporting alternative branches leads to further amplification of bias.

Acknowledgements

We are grateful to P. Cliften, M. Johnston and the Washington University Genome Sequencing Center for access to genome sequence data for S. kudriavzevii, S. castellii and S. kluyveri; the staff of the Saccharomyces Genome Database (http://www.yeastgenome.org/) for access to genome sequence data for S. cerevisiae, S. paradoxus, S. mikatae and S. bayanus; and the Stanford Genome Technology Center website (http://www-sequence.stanford.edu/group/candida) for access to sequence data for C. albicans. We thank D. Baum, B. Hersh, C. Hittinger and K. Johnson for useful comments on the manuscript, D. Baum and members of the Carroll laboratory for useful discussions on phylogenetics, and D. Lautenschleger for computer support. A.R. is a Human Frontier Science Program long-term fellow, B.L.W. and N.K. are NIH post-doctoral fellows, and S.B.C. is an investigator of the Howard Hughes Medical Institute. This work was funded by the Howard Hughes Medical Institute.

Author information

Corresponding author

Correspondence to Sean B. Carroll.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

About this article

Cite this article

Rokas, A., Williams, B., King, N. et al. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 425, 798–804 (2003). https://doi.org/10.1038/nature02053

  • DOI: https://doi.org/10.1038/nature02053
