Identification of deleterious mutations within three human genomes

  Sung Chun1 and Justin C. Fay1,2,3

  1 Computational Biology Program, Washington University, St. Louis, Missouri 63108, USA;
  2 Department of Genetics, Washington University, St. Louis, Missouri 63108, USA

    Abstract

    Each human carries a large number of deleterious mutations. Together, these mutations make a significant contribution to human disease. Identification of deleterious mutations within individual genome sequences could substantially impact an individual's health through personalized prevention and treatment of disease. Yet distinguishing deleterious mutations from the massive number of nonfunctional variants that occur within a single genome is a considerable challenge. Using a comparative genomics data set of 32 vertebrate species, we show that a likelihood ratio test (LRT) can accurately identify a subset of deleterious mutations that disrupt highly conserved amino acids within protein-coding sequences and that are likely to be unconditionally deleterious. The LRT is also able to identify known human disease alleles and performs as well as two commonly used heuristic methods, SIFT and PolyPhen. Application of the LRT to three human genomes reveals 796–837 deleterious mutations per individual, ∼40% of which are estimated to be at <5% allele frequency. However, the overlap between predictions made by the LRT, SIFT, and PolyPhen is low: 76% of predictions are unique to one of the three methods, and only 5% of predictions are shared across all three methods. Our results indicate that only a small subset of deleterious mutations can be reliably identified, but that this subset provides the raw material for personalized medicine.
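
    The overlap figures quoted in the abstract (predictions unique to one method versus shared by all three) are the kind of quantity that can be tallied from three sets of predicted variants. The following is a minimal sketch of such a tally, not the authors' pipeline: the function name `overlap_summary`, the variant identifiers, and the choice of the union of all predictions as the denominator are assumptions made for illustration only.

```python
def overlap_summary(lrt, sift, polyphen):
    """Summarize how deleterious-variant predictions from three methods overlap.

    Each argument is a collection of variant identifiers predicted
    deleterious by one method. Fractions are taken over the union of all
    predictions (an assumption; the abstract does not state the denominator).
    """
    methods = [set(lrt), set(sift), set(polyphen)]
    union = set().union(*methods)
    if not union:
        return {"unique": 0.0, "two": 0.0, "all_three": 0.0}

    # Count, for every predicted variant, how many methods flagged it.
    counts = {1: 0, 2: 0, 3: 0}
    for variant in union:
        counts[sum(variant in m for m in methods)] += 1

    total = len(union)
    return {
        "unique": counts[1] / total,      # flagged by exactly one method
        "two": counts[2] / total,         # flagged by exactly two methods
        "all_three": counts[3] / total,   # flagged by all three methods
    }


# Hypothetical toy data; these identifiers are not from the paper.
lrt = {"v1", "v2", "v3", "v4"}
sift = {"v3", "v4", "v5"}
polyphen = {"v4", "v6", "v7", "v8"}

summary = overlap_summary(lrt, sift, polyphen)
print(summary)
```

    In this toy example, six of the eight predicted variants are flagged by a single method, which mirrors the qualitative pattern the abstract reports: most predictions are method-specific and few are shared by all three.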

    Footnotes

    • 3 Corresponding author.

      E-mail jfay{at}genetics.wustl.edu; fax (314) 632-2156.

    • [Supplemental material is available online at http://www.genome.org.]

    • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.092619.109.

      • Received February 10, 2009.
      • Accepted July 10, 2009.