PT - JOURNAL ARTICLE
AU - Chunlei Wu
AU - Adam Mark
AU - Andrew I. Su
TI - MyGene.info: gene annotation query as a service
AID - 10.1101/009332
DP - 2014 Jan 01
TA - bioRxiv
PG - 009332
4099 - http://biorxiv.org/content/early/2014/09/17/009332.short
4100 - http://biorxiv.org/content/early/2014/09/17/009332.full
AB - Biomedical knowledge is often represented as annotations of biological entities such as genes, genetic variants, diseases, and drugs. Gene annotations are fragmented across data repositories like NCBI Entrez, Ensembl, UniProt, and hundreds (or more) of other specialized databases. While the volume and breadth of annotations are valuable, their fragmentation across many data silos is often frustrating and inefficient. Bioinformaticians everywhere must continuously and repetitively engage in data wrangling in an effort to comprehensively integrate knowledge from all these resources, and these uncoordinated efforts represent an enormous duplication of work. We previously released MyGene.info (http://mygene.info) to give bioinformatics developers programmatic access to gene annotation data through our high-performance web services. This article focuses on the updates to MyGene.info since our last paper (2013 database issue). With the completely refactored system, MyGene.info now expands its support from the original nine species to over 14K species, covering >17M genes with >50 gene-specific annotation types. Two simple web service endpoints provide high-performance query access to all these aggregated gene annotations. The infrastructure underlying MyGene.info is highly scalable, offering both high performance and high concurrency, which makes MyGene.info particularly suitable for real-time applications and analysis pipelines.
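
The abstract mentions two web service endpoints without spelling them out. As a rough illustration only, the sketch below builds request URLs for a gene query endpoint and a gene annotation endpoint. The `v3` version prefix, the `/query` and `/gene/<id>` paths, and the `q`, `fields`, `species`, and `size` parameters are assumptions drawn from the service's public documentation rather than from this record, and they may differ from the version the article describes. The helper names (`gene_query_url`, `gene_annotation_url`, `fetch_json`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the API version prefix is not stated in the abstract.
BASE_URL = "http://mygene.info/v3"


def gene_query_url(q, fields=None, species=None, size=None):
    """Build a URL for the (assumed) gene query endpoint."""
    params = {"q": q}
    if fields:
        params["fields"] = ",".join(fields)
    if species:
        params["species"] = species
    if size is not None:
        params["size"] = size
    return f"{BASE_URL}/query?{urlencode(params)}"


def gene_annotation_url(gene_id, fields=None):
    """Build a URL for the (assumed) gene annotation endpoint."""
    url = f"{BASE_URL}/gene/{gene_id}"
    if fields:
        url += "?" + urlencode({"fields": ",".join(fields)})
    return url


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With network access, `fetch_json(gene_query_url("cdk2", species="human"))` would be expected to return a JSON document of matching genes, and `fetch_json(gene_annotation_url(1017))` the annotation record for a single gene. The exact response layout is not described in this record, so it is not shown here.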