A critical assessment of storytelling: gene ontology categories and the importance of validating genomic scans

Mol Biol Evol. 2012 Oct;29(10):3237-48. doi: 10.1093/molbev/mss136. Epub 2012 May 21.

Abstract

In the age of whole-genome population genetics, so-called genomic scan studies often conclude with a long list of putatively selected loci. These lists are then further scrutinized to annotate the regions by gene function, corresponding biological processes, expression levels, or gene networks. Such annotations are often used to assess or verify the validity of the genome scan and of the statistical methods used to perform the analyses. Furthermore, these results are frequently considered to validate "true positives" if the identified regions make biological sense a posteriori. Here, we show that this approach can be misleading. By simulating neutral evolutionary histories, we demonstrate that it is possible not only to obtain an extremely high false-positive rate but also to make biological sense out of the false positives and construct a sensible biological narrative. Results are compared with a recent polymorphism data set from Drosophila melanogaster.
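The logic of the argument can be illustrated with a toy sketch. This is not the authors' simulation pipeline: it assumes a hypothetical genome of 2,000 genes whose scan statistics all come from one null distribution, an outlier rule that keeps the top 1% of genes, and randomly assigned, made-up annotation categories. Every name and parameter below is illustrative. Under these assumptions each candidate is a false positive by construction, yet an uncorrected enrichment test can still flag some categories as "significant".

```python
import random
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to a category."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def neutral_scan(n_genes=2000, n_categories=500, genes_per_category=50,
                 top_fraction=0.01, alpha=0.05, seed=1):
    """Toy outlier scan on purely neutral data (all parameters are assumptions)."""
    rng = random.Random(seed)

    # Every gene draws its scan statistic from the same null distribution,
    # so no gene is under selection.
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]

    # Outlier approach: keep the top fraction of genes as "candidates".
    n_top = int(n_genes * top_fraction)
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    candidates = set(ranked[:n_top])

    # Hypothetical annotation: each category is a random set of genes.
    enriched = []
    for category in range(n_categories):
        members = set(rng.sample(range(n_genes), genes_per_category))
        overlap = len(candidates & members)
        p = hypergeom_sf(overlap, n_genes, genes_per_category, n_top)
        if p < alpha:  # no multiple-testing correction, on purpose
            enriched.append((category, overlap, p))
    return candidates, enriched


if __name__ == "__main__":
    candidates, enriched = neutral_scan()
    print(f"{len(candidates)} candidate genes, all false positives by construction")
    print(f"{len(enriched)} categories pass the uncorrected enrichment test")
```

Any category that passes here does so by chance alone, which is the point: an outlier rule always returns candidates, and testing many annotation categories without correction tends to return some apparently meaningful ones.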

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Computer Simulation
  • Databases, Genetic
  • Drosophila melanogaster / genetics*
  • Genes, Insect / genetics*
  • Genetics, Population
  • Genomics*
  • Reproducibility of Results
  • Selection, Genetic