Mapping overlapping functional elements embedded within the protein-coding regions of RNA viruses

Nucleic Acids Res. 2014 Nov 10;42(20):12425-39. doi: 10.1093/nar/gku981. Epub 2014 Oct 17.

Abstract

Identification of the full complement of genes and other functional elements in any virus is crucial to fully understand its molecular biology and guide the development of effective control strategies. RNA viruses have compact multifunctional genomes that frequently contain overlapping genes and non-coding functional elements embedded within protein-coding sequences. Overlapping features often escape detection because it can be difficult to disentangle the multiple roles of the constituent nucleotides via mutational analyses, while high-throughput experimental techniques are often unable to distinguish functional elements from incidental features. However, RNA viruses evolve very rapidly, so that, even within a single species, substitutions rapidly accumulate at neutral or near-neutral sites, providing great potential for comparative genomics to distinguish the signature of purifying selection. Computationally identified features can then be efficiently targeted for experimental analysis. Here we analyze alignments of protein-coding virus sequences to identify regions where there is a statistically significant reduction in the degree of variability at synonymous sites, a characteristic signature of overlapping functional elements. Having previously tested this technique by experimental verification of discoveries in selected viruses, we now analyze sequence alignments for ∼700 RNA virus species to identify hundreds of such regions, many of which have not been previously described.
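
To make the idea of "reduced variability at synonymous sites" concrete, the following is a minimal Python sketch of one way such a scan could look. It is not the method used in the paper, whose statistical model is not described in this abstract. The function names (synonymous_variability, conserved_windows), the pairwise scoring scheme, the window size, and the permutation test are all assumptions chosen for illustration, and the sketch ignores phylogenetic relatedness between sequences, which a real analysis would need to account for.

```python
from itertools import combinations, product
import random

# Standard genetic code, built in TCAG order (assumption: no alternative code tables).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
# Number of codons per amino acid; Met and Trp have 1 and cannot vary synonymously.
DEGENERACY = {aa: AMINO.count(aa) for aa in set(AMINO)}


def synonymous_variability(alignment):
    """Score each codon column of an in-frame nucleotide alignment.

    The score is the fraction of sequence pairs that encode the same amino
    acid with different codons. Columns with no informative pair get None.
    """
    n_codons = len(alignment[0]) // 3
    scores = []
    for i in range(n_codons):
        codons = [seq[3 * i:3 * i + 3].upper() for seq in alignment]
        differing = informative = 0
        for a, b in combinations(codons, 2):
            aa_a, aa_b = CODON_TABLE.get(a), CODON_TABLE.get(b)
            # Skip gapped/ambiguous codons, nonsynonymous pairs, and Met/Trp.
            if aa_a is None or aa_a != aa_b or DEGENERACY[aa_a] == 1:
                continue
            informative += 1
            differing += a != b
        scores.append(differing / informative if informative else None)
    return scores


def conserved_windows(scores, window=5, n_perm=2000, alpha=0.05, seed=0):
    """Return (start, mean_score, p_value) for windows with unusually low scores.

    The p-value is a one-sided permutation estimate: how often a random set
    of scored columns has a mean at or below the observed window mean.
    """
    rng = random.Random(seed)
    usable = [s for s in scores if s is not None]
    hits = []
    for start in range(len(scores) - window + 1):
        vals = [s for s in scores[start:start + window] if s is not None]
        if len(vals) < window:  # require a fully scored window
            continue
        observed = sum(vals) / len(vals)
        at_or_below = sum(
            sum(rng.sample(usable, len(vals))) / len(vals) <= observed
            for _ in range(n_perm)
        )
        p_value = (at_or_below + 1) / (n_perm + 1)
        if p_value < alpha:
            hits.append((start, observed, p_value))
    return hits
```

In use, synonymous_variability would be applied to a codon-aligned set of coding sequences from one virus species, and conserved_windows to the resulting scores; windows it reports are only candidates for an overlapping functional element. The p-values are not corrected for the many windows tested, so a genome-wide scan would need a multiple-testing adjustment in addition to a more realistic null model.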

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Codon
  • Genetic Variation
  • Genome, Viral
  • Phylogeny
  • RNA Viruses / classification
  • RNA Viruses / genetics*
  • Recombination, Genetic
  • Sequence Alignment
  • Viral Proteins / genetics*

Substances

  • Codon
  • Viral Proteins