ABSTRACT
Comprehensive and efficient gene hit selection from high-throughput assays remains a critical bottleneck in realizing the potential of genome-scale studies in biology. Widely used methods such as setting cutoffs, prioritizing pathway enrichments, or incorporating predicted network interactions offer divergent solutions, yet each is associated with critical analytical trade-offs, and they are often combined in an ad hoc manner. The specific limitations of these individual approaches, the lack of a systematic way to integrate their rankings, and the inaccessibility of complex computational approaches to many researchers have contributed to unexpected variability and limited overlap in the reported results from comparable genome-wide studies. Using a set of three highly studied genome-wide datasets for HIV host factors that have been broadly cited for their limited number of shared candidates, we characterize the specific complementary contributions of commonly used analysis approaches and identify an optimal framework for integrating these methods. We describe Throughput Ranking by Iterative Analysis of Genomic Enrichment (TRIAGE), an integrated, iterative approach that uses pathway and network statistical methods and publicly available databases to optimize gene prioritization. TRIAGE is accessible as a secure, rapid, user-friendly web-based application (https://triage.niaid.nih.gov).
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Corrected the formatting of page sizes for figures. Changed the title.