ABSTRACT
Objective Genetic screening is the gold standard for determining biogeographical ancestry (i.e. race), but this information is often unavailable to those developing research studies. We assessed the agreement of census- and electronic health record (EHR)-derived demographic data with genetic ancestry to determine whether these sources could support selection of diverse cohorts.
Materials and Methods We identified a population of 4,837 genotyped patients and determined the concordance between genetic measures of ancestry and race derived from the decennial nationwide census, EHRs, and self-report.
Results Concordance between the EHR-derived data and genetic ancestry was 90% or greater. Census data had high concordance (97%) with genetic and self-reported data for patients of European ancestry but low concordance (64%) for patients of African ancestry.
Discussion and Conclusions The high concordance between EHR-derived race and genetic ancestry suggests that EHR-derived information could be an effective proxy for race when recruiting for diverse research cohorts.