TY  - JOUR
T1  - Developing and Evaluating Mappings of ICD-10 and ICD-10-CM Codes to PheCodes
JF  - bioRxiv
DO  - 10.1101/462077
SP  - 462077
AU  - Patrick Wu
AU  - Aliya Gifford
AU  - Xiangrui Meng
AU  - Xue Li
AU  - Harry Campbell
AU  - Tim Varley
AU  - Juan Zhao
AU  - Robert Carroll
AU  - Lisa Bastarache
AU  - Joshua C Denny
AU  - Evropi Theodoratou
AU  - Wei-Qi Wei
Y1  - 2019/01/01
UR  - http://biorxiv.org/content/early/2019/07/03/462077.abstract
N2  - Background: The PheCode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) in the electronic health record (EHR). Objective: Here, we present our work on the development and evaluation of maps from ICD-10 and ICD-10-CM codes to PheCodes. Methods: We mapped ICD-10 and ICD-10-CM codes to PheCodes using a number of methods and resources, such as concept relationships and explicit mappings from the Unified Medical Language System (UMLS), Observational Health Data Sciences and Informatics (OHDSI), Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT), and National Library of Medicine (NLM). We assessed the coverage of the maps in two databases: Vanderbilt University Medical Center (VUMC) using ICD-10-CM and the UK Biobank (UKBB) using ICD-10. We assessed the fidelity of the ICD-10-CM map in comparison to the gold-standard ICD-9-CM→PheCode map by investigating phenotype reproducibility and conducting a PheWAS. Results: We mapped >75% of ICD-10-CM and ICD-10 codes to PheCodes. Of the unique codes observed in the VUMC (ICD-10-CM) and UKBB (ICD-10) cohorts, >90% were mapped to PheCodes. We observed 70-75% reproducibility for chronic diseases and <10% for an acute disease. A PheWAS with a lipoprotein(a) (LPA) genetic variant, rs10455872, using the ICD-9-CM and ICD-10-CM maps replicated two genotype-phenotype associations with similar effect sizes: coronary atherosclerosis (ICD-9-CM: P < .001, OR = 1.60 vs. ICD-10-CM: P < .001, OR = 1.60) and chronic ischemic heart disease (ICD-9-CM: P < .001, OR = 1.5 vs. ICD-10-CM: P < .001, OR = 1.47). Conclusions: This study introduces the initial “beta” versions of ICD-10 and ICD-10-CM to PheCode maps that will enable researchers to leverage accumulated ICD-10 and ICD-10-CM data for high-throughput PheWAS in the EHR. Abbreviations: EHR, electronic health record; ICD, International Classification of Diseases; AHRQ, Agency for Healthcare Research and Quality; CCS, Clinical Classification Software; PheWAS, phenome-wide association studies; CM, Clinical Modification; WHO, World Health Organization; NCHS, National Center for Health Statistics; UMLS, Unified Medical Language System; GEM, General Equivalence Mapping; SNOMED CT, Systematized Nomenclature of Medicine Clinical Terms; CUI, Concept Unique Identifier; OHDSI, Observational Health Data Sciences and Informatics; CDM, Common Data Model; NLM, National Library of Medicine; VUMC, Vanderbilt University Medical Center; UKBB, UK Biobank; OR, odds ratio; LPA, lipoprotein(a); SNP, single nucleotide polymorphism; M:1, many to one; SD, standard deviation.
ER  -