RT Journal Article SR Electronic T1 Developing and Evaluating Mappings of ICD-10 and ICD-10-CM Codes to PheCodes JF bioRxiv FD Cold Spring Harbor Laboratory SP 462077 DO 10.1101/462077 A1 Patrick Wu A1 Aliya Gifford A1 Xiangrui Meng A1 Xue Li A1 Harry Campbell A1 Tim Varley A1 Juan Zhao A1 Robert Carroll A1 Lisa Bastarache A1 Joshua C Denny A1 Evropi Theodoratou A1 Wei-Qi Wei YR 2019 UL http://biorxiv.org/content/early/2019/07/03/462077.abstract AB Background The PheCode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) in the electronic health record (EHR). Objective Here, we present our work on the development and evaluation of maps from ICD-10 and ICD-10-CM codes to PheCodes. Methods We mapped ICD-10 and ICD-10-CM codes to PheCodes using a number of methods and resources, such as concept relationships and explicit mappings from the Unified Medical Language System (UMLS), Observational Health Data Sciences and Informatics (OHDSI), Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT), and National Library of Medicine (NLM). We assessed the coverage of the maps in two databases: Vanderbilt University Medical Center (VUMC) using ICD-10-CM and the UK Biobank (UKBB) using ICD-10. We assessed the fidelity of the ICD-10-CM map in comparison to the gold-standard ICD-9-CM→PheCode map by investigating phenotype reproducibility and conducting a PheWAS. Results We mapped >75% of ICD-10-CM and ICD-10 codes to PheCodes. Of the unique codes observed in the VUMC (ICD-10-CM) and UKBB (ICD-10) cohorts, >90% were mapped to PheCodes. We observed 70-75% reproducibility for chronic diseases and <10% for an acute disease. A PheWAS with a lipoprotein(a) (LPA) genetic variant, rs10455872, using the ICD-9-CM and ICD-10-CM maps replicated two genotype-phenotype associations with similar effect sizes: coronary atherosclerosis (ICD-9-CM: P < .001, OR = 1.60 vs. 
ICD-10-CM: P < .001, OR = 1.60) and chronic ischemic heart disease (ICD-9-CM: P < .001, OR = 1.5 vs. ICD-10-CM: P < .001, OR = 1.47). Conclusions This study introduces the initial “beta” versions of ICD-10 and ICD-10-CM to PheCode maps that will enable researchers to leverage accumulated ICD-10 and ICD-10-CM data for high-throughput PheWAS in the EHR. Abbreviations EHR: electronic health record; ICD: International Classification of Diseases; AHRQ: Agency for Healthcare Research and Quality; CCS: Clinical Classification Software; PheWAS: phenome-wide association studies; CM: Clinical Modification; WHO: World Health Organization; NCHS: National Center for Health Statistics; UMLS: Unified Medical Language System; GEM: General Equivalence Mapping; SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms; CUI: Concept Unique Identifier; OHDSI: Observational Health Data Sciences and Informatics; CDM: Common Data Model; NLM: National Library of Medicine; VUMC: Vanderbilt University Medical Center; UKBB: UK Biobank; OR: odds ratio; LPA: lipoprotein(a); SNP: single nucleotide polymorphism; M:1: many to one; SD: standard deviation