Limits of principal components analysis for producing a common trait space: implications for inferring selection, contingency, and chance in evolution

PLoS One. 2009 Nov 23;4(11):e7957. doi: 10.1371/journal.pone.0007957.

Abstract

Background: Comparing patterns of divergence among separate lineages or groups has posed an especially difficult challenge for biologists. Recently, a new, conceptually simple methodology called the "ordered-axis plot" approach was introduced for comparing patterns of diversity in a common morphospace. This technique combines principal components analysis (PCA) with linear regression. Given how commonly these statistics are used, the ordered-axis plot approach has high potential for widespread adoption. However, the approach has a number of drawbacks, most notably that the lineages with the greatest variance will largely bias interpretations from analyses involving a common morphospace. Therefore, unless a set of a priori requirements regarding data structure is met, the ordered-axis plot approach will likely produce misleading results.
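
The variance bias described above follows from a standard identity for the covariance of pooled data. The sketch below uses notation introduced here for illustration, not taken from the paper: group g has n_g specimens, mean vector \bar{x}_g, and covariance matrix S_g (computed with divisor n_g); N is the total sample size and \bar{x} the grand mean.

```latex
S_{\text{pooled}}
  = \sum_{g} \frac{n_g}{N}\, S_g
  + \sum_{g} \frac{n_g}{N}\,
    (\bar{x}_g - \bar{x})(\bar{x}_g - \bar{x})^{\mathsf{T}}
```

Because the first term is a sample-size-weighted sum of the within-group covariance matrices, a group whose S_g has much larger entries than the others will tend to determine the leading eigenvectors of S_pooled, and large differences among group means (the second term) can do the same.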

Methodology/principal findings: Morphological data sets from cichlid fishes endemic to Lakes Tanganyika, Malawi, and Victoria were used to demonstrate statistically how separate groups can contribute differently to a common morphospace produced by a PCA. Through a matrix superimposition of eigenvectors (scale-free trajectories of variation identified by PCA), we show that some groups contribute more than others to the trajectories of variation identified in a common morphospace. Furthermore, through a set of randomization tests, we show that a common morphospace model partitions variation differently than group-specific models. Finally, we demonstrate how these limitations may influence an ordered-axis plot approach by performing a comparison on data sets with known alterations in covariance structure. Using these results, we provide a set of criteria that must be met before a common morphospace can be reliably used.
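
The unequal contribution of groups to a common morphospace can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it uses synthetic two-trait data for two hypothetical groups rather than the cichlid data sets, and it replaces the matrix superimposition of eigenvectors with a simpler comparison, the angle between each group's first principal axis and the first axis of the pooled PCA. The function names, group sizes, and variances are all assumptions chosen for illustration.

```python
import numpy as np


def leading_axis(data):
    """Return the first principal axis (unit vector) of a samples-by-traits array."""
    cov = np.cov(data, rowvar=False)        # np.cov centers the data itself
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    return eigvecs[:, -1]                   # eigenvector of the largest eigenvalue


def angle_deg(u, v):
    """Angle in degrees between two axes, ignoring sign (axis direction is arbitrary)."""
    cos = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))


rng = np.random.default_rng(0)

# Hypothetical group A: large variance, mostly along trait 1.
group_a = rng.normal(size=(200, 2)) * np.array([5.0, 0.5])
# Hypothetical group B: small variance, mostly along trait 2.
group_b = rng.normal(size=(200, 2)) * np.array([0.2, 1.0])

# "Common morphospace": a single PCA on the pooled specimens.
pooled = np.vstack([group_a, group_b])

angle_a = angle_deg(leading_axis(pooled), leading_axis(group_a))
angle_b = angle_deg(leading_axis(pooled), leading_axis(group_b))

print(f"pooled PC1 vs group A PC1: {angle_a:.1f} degrees")
print(f"pooled PC1 vs group B PC1: {angle_b:.1f} degrees")
```

In this setup the pooled first axis lies almost on top of group A's first axis and is nearly orthogonal to group B's, so group B's main trajectory of variation is effectively absent from the first common axis even though both groups contribute the same number of specimens.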

Conclusions/significance: Our results suggest that a common morphospace produced by PCA will not yield biologically meaningful results unless a restrictive set of criteria is met. We therefore suggest that biologists be aware of the limitations of the ordered-axis plot approach before employing it on their own data, and that they consider other, less restrictive methods for addressing the same question.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Animals
  • Biological Evolution*
  • Biology / methods
  • Cichlids / anatomy & histology
  • Cichlids / physiology*
  • Models, Anatomic
  • Models, Biological
  • Models, Statistical
  • Multivariate Analysis
  • Phenotype
  • Population Dynamics
  • Principal Component Analysis
  • Selection, Genetic