PT - JOURNAL ARTICLE
AU - Mengge Zhao
AU - James M. Havrilla
AU - Li Fang
AU - Ying Chen
AU - Jacqueline Peng
AU - Cong Liu
AU - Chao Wu
AU - Mahdi Sarmady
AU - Pablo Botas
AU - Julián Isla
AU - Gholson Lyon
AU - Chunhua Weng
AU - Kai Wang
TI - Phen2Gene: Rapid Phenotype-Driven Gene Prioritization for Rare Diseases
AID - 10.1101/870527
DP - 2019 Jan 01
TA - bioRxiv
PG - 870527
4099 - http://biorxiv.org/content/early/2019/12/10/870527.short
4100 - http://biorxiv.org/content/early/2019/12/10/870527.full
AB - Human Phenotype Ontology (HPO) terms are increasingly used in diagnostic settings to aid in the characterization of patient phenotypes. The HPO annotation database is updated frequently and can provide detailed phenotype knowledge on various human diseases, and many HPO terms are now mapped to candidate causal genes with binary relationships. To further improve the genetic diagnosis of rare diseases, we incorporated these HPO annotations, gene-disease databases, and gene-gene databases in a probabilistic model to build a novel HPO-driven gene prioritization tool, Phen2Gene. Phen2Gene accesses a database built upon this information called the HPO2Gene Knowledgebase (H2GKB), which provides weighted and ranked gene lists for every HPO term. Phen2Gene is then able to access the H2GKB for patient-specific lists of HPO terms or PhenoPackets descriptions supported by GA4GH (http://phenopackets.org/), calculate a prioritized gene list based on a probabilistic model, and output gene-disease relationships with great accuracy. Phen2Gene outperforms existing gene prioritization tools in speed, and acts as a real-time phenotype-driven gene prioritization tool to aid the clinical diagnosis of rare undiagnosed diseases. In addition to a command line tool released under the MIT license (https://github.com/WGLab/Phen2Gene), we also developed a web server and web service (https://phen2gene.wglab.org/) for running the tool via web interface or RESTful API queries.
Finally, we have curated benchmarking data for phenotype-to-gene tools comprising 197 patients from 76 scientific articles and de-identified HPO term data for 85 patients from CHOP.
Abbreviations: HPO, Human Phenotype Ontology; NGS, next-generation sequencing; WES, whole exome sequencing; WGS, whole genome sequencing; CKD, chronic kidney disease; NLP, natural language processing; OMIM, Online Mendelian Inheritance in Man; EHR, Electronic Health Records; NCBO, National Center for Biomedical Ontology; H2GKB, HPO2Gene Knowledgebase
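
The abstract describes a knowledgebase that holds a weighted gene list for every HPO term, which is then combined across a patient's terms into one prioritized gene list. The following is a minimal sketch of that general idea only, not the Phen2Gene model: the dictionary `TOY_KB`, the function `prioritize`, the placeholder gene names, the weights, and the simple additive scoring rule are all assumptions made for illustration, since the abstract does not specify the actual probabilistic model.

```python
from collections import defaultdict

# Hypothetical stand-in for a per-term knowledgebase: HPO term -> {gene: weight}.
# Gene names and weights are invented placeholders, not real annotations.
TOY_KB = {
    "HP:0001250": {"GENE_A": 0.75, "GENE_B": 0.5},
    "HP:0001263": {"GENE_B": 0.5, "GENE_C": 0.25},
}


def prioritize(hpo_terms, kb=TOY_KB):
    """Sum per-term gene weights (an assumed rule) and rank genes by total."""
    scores = defaultdict(float)
    for term in hpo_terms:
        # Terms missing from the knowledgebase contribute nothing.
        for gene, weight in kb.get(term, {}).items():
            scores[gene] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranked = prioritize(["HP:0001250", "HP:0001263"])
for gene, score in ranked:
    print(gene, score)
```

In this toy setup a gene supported by several of the patient's terms accumulates a higher total than a gene supported by only one, which is the intuition behind ranking candidates from a list of phenotype terms.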