GiRaF: robust, computational identification of influenza reassortments via graph mining

Nucleic Acids Res. 2011 Mar;39(6):e34. doi: 10.1093/nar/gkq1232. Epub 2010 Dec 21.

Abstract

Reassortment in the influenza virus, a process in which strains exchange genetic segments, has been implicated in two of the three pandemics of the 20th century as well as the 2009 H1N1 outbreak. While advances in sequencing have led to an explosion in the number of available whole-genome sequences, an understanding of the rate and distribution of reassortments and their role in viral evolution is still lacking. An important factor is the paucity of automated tools for confident identification of reassortments from sequence data, owing to the challenges of analyzing large, uncertain viral phylogenies. We describe here a novel computational method, called GiRaF (Graph-incompatibility-based Reassortment Finder), that robustly identifies reassortments in a fully automated fashion while accounting for uncertainties in the inferred phylogenies. The algorithms behind GiRaF search large collections of Markov chain Monte Carlo (MCMC)-sampled trees for groups of incompatible splits, using a fast biclique enumeration algorithm coupled with several statistical tests to identify sets of taxa with differential phylogenetic placement. GiRaF correctly finds known reassortments in human, avian, and swine influenza populations, including the evolutionary events that led to the recent 'swine flu' outbreak. In whole-genome studies cataloging events in H5N1 and swine influenza isolates, GiRaF also identifies several previously unreported reassortments.
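
The idea of searching for groups of incompatible splits can be illustrated with a toy sketch. The code below is not GiRaF's implementation: it assumes each tree has already been reduced to a set of splits (one side of each bipartition, as a frozenset of taxon names), it omits the MCMC sampling and the statistical tests entirely, and it replaces the fast biclique enumeration with a brute-force search that is only practical for tiny inputs. The function names (compatible, incompatibility_graph, maximal_bicliques) and the five-taxon example are hypothetical.

```python
from itertools import combinations

def compatible(a, b, taxa):
    """Two splits are compatible iff at least one of the four
    pairwise intersections of their sides is empty."""
    a_c, b_c = taxa - a, taxa - b
    return any(not (x & y) for x in (a, a_c) for y in (b, b_c))

def incompatibility_graph(splits_x, splits_y, taxa):
    """Bipartite graph as {i: {j, ...}}: split i of segment X
    conflicts with split j of segment Y."""
    return {
        i: {j for j, sy in enumerate(splits_y) if not compatible(sx, sy, taxa)}
        for i, sx in enumerate(splits_x)
    }

def maximal_bicliques(graph):
    """Brute-force maximal bicliques with non-empty sides (toy sizes only;
    an assumption standing in for a fast enumeration algorithm)."""
    left = list(graph)
    found = set()
    for r in range(1, len(left) + 1):
        for subset in combinations(left, r):
            # Right side: splits of Y that conflict with every chosen split of X.
            right = set.intersection(*(graph[i] for i in subset))
            if not right:
                continue
            # Close the left side: every X split that conflicts with all of `right`.
            closed = frozenset(i for i in left if right <= graph[i])
            found.add((closed, frozenset(right)))
    return found

# Hypothetical example: taxon C groups with A and B in segment X,
# but with D in segment Y.
taxa = frozenset("ABCDE")
splits_x = [frozenset("AB"), frozenset("ABC")]
splits_y = [frozenset("AB"), frozenset("CD")]

graph = incompatibility_graph(splits_x, splits_y, taxa)
bicliques = maximal_bicliques(graph)
```

In this example the only conflict is between the split ABC|DE in segment X and the split CD|ABE in segment Y, which is the kind of differential placement (here of taxon C) that would then need statistical support before being called a reassortment.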

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Data Mining
  • Influenza A Virus, H1N1 Subtype / classification
  • Influenza A Virus, H1N1 Subtype / genetics
  • Influenza A Virus, H3N2 Subtype / classification
  • Influenza A Virus, H3N2 Subtype / genetics
  • Influenza A virus / classification*
  • Influenza A virus / genetics
  • Phylogeny*
  • Reassortant Viruses / classification*