A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution

J Mol Evol. 2004 Jul;59(1):121-32. doi: 10.1007/s00239-004-2597-8.

Abstract

The tailoring of existing genetic systems to new uses is called genetic co-option. Mechanisms of genetic co-option have been difficult to study because functionally important changes are hard to identify. One way to study genetic co-option in protein-coding genes is to identify those amino acid sites that have experienced changes in selective pressure following a genetic co-option event. In this paper we present a maximum likelihood method for measuring divergent selective pressures and identifying the amino acid sites affected by divergent selection. The method is based on a codon model of evolution and uses the nonsynonymous-to-synonymous rate ratio (omega) as a measure of selection on the protein, with omega = 1, omega < 1, and omega > 1 indicating neutral evolution, purifying selection, and positive selection, respectively. The model allows variation in omega among sites, with a fraction of sites evolving under divergent selective pressures. Divergent selection is indicated by different omega values between clades, such as between paralogous clades of a gene family. We applied the codon model to duplication followed by functional divergence of (i) the epsilon and gamma globin genes and (ii) the eosinophil cationic protein (ECP) and eosinophil-derived neurotoxin (EDN) genes. In both cases likelihood ratio tests suggested the presence of sites evolving under divergent selective pressures. Results of the epsilon and gamma globin analysis suggested that divergent selective pressures might be a consequence of a weakened relationship between fetal hemoglobin and 2,3-diphosphoglycerate. We suggest that empirical Bayesian identification of sites evolving under divergent selective pressures, combined with structural and functional information, can provide a valuable framework for identifying and studying mechanisms of genetic co-option. Limitations of the new method are discussed.
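The three quantitative ideas named in the abstract (interpreting omega, comparing nested models with a likelihood ratio test, and assigning sites to classes by empirical Bayes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the site-class proportions, and the per-class likelihoods below are hypothetical values chosen only for illustration, and the sketch assumes that log-likelihoods and per-site class likelihoods have already been obtained from some codon-model fit.

```python
def classify_omega(omega, tol=1e-9):
    """Interpret omega (nonsynonymous/synonymous rate ratio) as in the abstract."""
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    return "purifying selection" if omega < 1.0 else "positive selection"


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio test statistic: twice the log-likelihood difference
    between the alternative model and the nested null model."""
    return 2.0 * (lnl_alt - lnl_null)


def site_class_posteriors(priors, site_likelihoods):
    """Posterior probability of each site class for one site (Bayes' theorem).

    priors           -- estimated proportion of sites in each class
    site_likelihoods -- likelihood of this site's data under each class
    """
    joint = [p * lik for p, lik in zip(priors, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical numbers, not taken from the paper.
# Assumed classes: [conserved, neutral, divergent between clades].
priors = [0.6, 0.3, 0.1]
site_likelihoods = [1e-5, 2e-5, 8e-5]
posteriors = site_class_posteriors(priors, site_likelihoods)

print(classify_omega(0.2))             # omega < 1
print(lrt_statistic(-2510.0, -2500.0)) # hypothetical log-likelihoods
print(posteriors)
```

In this sketch a site would be flagged as evolving under divergent selection when the posterior for the divergent class exceeds a chosen cutoff; the cutoff, like the reference distribution used to judge the test statistic, depends on the specific models being compared and is not specified in the abstract.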

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids / genetics*
  • Bayes Theorem
  • Codon / genetics*
  • Eosinophil Cationic Protein / genetics
  • Eosinophil-Derived Neurotoxin / genetics
  • Evolution, Molecular*
  • Gene Duplication
  • Genetic Variation
  • Globins / genetics
  • Likelihood Functions
  • Models, Genetic*
  • Multigene Family / genetics*
  • Phylogeny
  • Selection, Genetic

Substances

  • Amino Acids
  • Codon
  • Globins
  • Eosinophil-Derived Neurotoxin
  • Eosinophil Cationic Protein