Positive selection on the H3 hemagglutinin gene of human influenza virus A

Mol Biol Evol. 1999 Nov;16(11):1457-65. doi: 10.1093/oxfordjournals.molbev.a026057.

Abstract

The hemagglutinin (HA) gene of influenza viruses encodes the major surface antigen against which neutralizing antibodies are produced during infection or vaccination. We examined temporal variation in the HA1 domain of HA genes of human influenza A (H3N2) viruses to identify positively selected codons. Positive selection is defined for our purposes as a significant excess of nonsilent over silent nucleotide substitutions. If past mutations at positively selected codons conferred a selective advantage on the virus, then additional changes at these positions may predict which emerging strains will predominate and cause epidemics. We previously reported that a 38% excess of mutations occurred on the terminal (tip) branches of the phylogenetic tree of 254 HA genes of influenza A (H3N2) viruses. Possible explanations for this excess include processes other than viral evolution during replication in human hosts. Of particular concern are mutations that occur during adaptation of viruses for growth in embryonated chicken eggs in the laboratory. Because the present study includes 357 HA sequences (a 40% increase), we were able to analyze separately those mutations assigned to internal branches. This allowed us to determine whether mutations on terminal and internal branches exhibit different patterns of selection at the level of individual codons. Additional improvements over our previous analysis include correction for a skew in the distribution of amino acid replacements across codons and analysis of a population of phylogenetic trees rather than a single tree. The latter improvement allowed us to ascertain whether minor variation in tree structure had a significant effect on our estimate of the codons under positive selection. The modified method also estimates that 75.6% of the nonsilent mutations are deleterious and have been removed by selection prior to sampling.
Using the larger data set and the modified methods, we confirmed a large (40%) excess of changes on the terminal branches. We also found an excess of changes on branches leading to egg-grown isolates. Furthermore, 9 of the 18 codons identified as being under positive selection when we used only mutations assigned to internal branches were not under positive selection on the terminal branches. Thus, although there is overlap between the selected codons on terminal and internal branches, the codons under positive selection on the terminal branches differ from those on the internal branches. We also observed an excess of positively selected codons associated with the receptor-binding site and with the antibody-combining sites. This association may explain why the positively selected codons are restricted in their distribution along the sequence. Our results suggest that future studies of positive selection should focus on changes assigned to the internal branches, as certain of these changes may have predictive value for identifying future successful epidemic variants.
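
The abstract defines positive selection as a significant excess of nonsilent over silent substitutions at a codon. One way such an excess could be tested is sketched below as a one-sided binomial test. This is an illustration only and is not taken from the paper: the function names, the background nonsilent fraction, and the example counts are all hypothetical, and the sketch omits the skew correction and the analysis over a population of trees that the abstract mentions.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def nonsilent_excess_pvalue(
    nonsilent: int, silent: int, background_nonsilent_fraction: float
) -> float:
    """One-sided p-value for an excess of nonsilent changes at one codon.

    Assumption (not from the paper): under the null hypothesis, each
    substitution at the codon is nonsilent with probability
    background_nonsilent_fraction, independently of the others.
    """
    total = nonsilent + silent
    if total == 0:
        # No substitutions observed, so there is no evidence of an excess.
        return 1.0
    return binomial_upper_tail(nonsilent, total, background_nonsilent_fraction)


# Hypothetical counts for a single codon, chosen only for illustration.
p_value = nonsilent_excess_pvalue(
    nonsilent=8, silent=0, background_nonsilent_fraction=0.5
)
print(f"one-sided p-value: {p_value:.4f}")
```

In a per-codon scan of this kind, the same test would be run for every codon, so some correction for multiple comparisons would normally be needed before calling any codon positively selected.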

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Substitution
  • Codon
  • Hemagglutinin Glycoproteins, Influenza Virus / genetics*
  • Influenza A virus / chemistry
  • Influenza A virus / genetics*
  • Phylogeny

Substances

  • Codon
  • Hemagglutinin Glycoproteins, Influenza Virus

Associated data

  • GENBANK/AF180564
  • GENBANK/AF180565
  • GENBANK/AF180566
  • GENBANK/AF180567
  • GENBANK/AF180568
  • GENBANK/AF180569
  • GENBANK/AF180570
  • GENBANK/AF180571
  • GENBANK/AF180572
  • GENBANK/AF180573
  • GENBANK/AF180574
  • GENBANK/AF180575
  • GENBANK/AF180576
  • GENBANK/AF180577
  • GENBANK/AF180578
  • GENBANK/AF180579
  • GENBANK/AF180580
  • GENBANK/AF180581
  • GENBANK/AF180582
  • GENBANK/AF180583
  • GENBANK/AF180584
  • GENBANK/AF180585
  • GENBANK/AF180586
  • GENBANK/AF180587
  • GENBANK/AF180588
  • GENBANK/AF180589
  • GENBANK/AF180590
  • GENBANK/AF180591
  • GENBANK/AF180592
  • GENBANK/AF180593