PT - JOURNAL ARTICLE
AU - Justin Wagner
AU - Nathan D Olson
AU - Lindsay Harris
AU - Jennifer McDaniel
AU - Haoyu Cheng
AU - Arkarachai Fungtammasan
AU - Yih-Chii Hwang
AU - Richa Gupta
AU - Aaron M Wenger
AU - William J Rowell
AU - Ziad M Khan
AU - Jesse Farek
AU - Yiming Zhu
AU - Aishwarya Pisupati
AU - Medhat Mahmoud
AU - Chunlin Xiao
AU - Byunggil Yoo
AU - Sayed Mohammad Ebrahim Sahraeian
AU - Danny E. Miller
AU - David Jáspez
AU - José M. Lorenzo-Salazar
AU - Adrián Muñoz-Barrera
AU - Luis A. Rubio-Rodríguez
AU - Carlos Flores
AU - Giuseppe Narzisi
AU - Uday Shanker Evani
AU - Wayne E. Clarke
AU - Joyce Lee
AU - Christopher E. Mason
AU - Stephen E. Lincoln
AU - Karen H. Miga
AU - Mark T. W. Ebbert
AU - Alaina Shumate
AU - Heng Li
AU - Chen-Shan Chin
AU - Justin M Zook
AU - Fritz J Sedlazeck
TI - Towards a Comprehensive Variation Benchmark for Challenging Medically-Relevant Autosomal Genes
AID - 10.1101/2021.06.07.444885
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.06.07.444885
4099 - http://biorxiv.org/content/early/2021/06/07/2021.06.07.444885.short
4100 - http://biorxiv.org/content/early/2021/06/07/2021.06.07.444885.full
AB - The repetitive nature and complexity of multiple medically important genes make them intractable to accurate analysis, despite the maturity of short-read sequencing, resulting in a gap in clinical applications of genome sequencing. The Genome in a Bottle Consortium has provided benchmark variant sets, but these excluded some medically relevant genes due to their repetitiveness or polymorphic complexity. In this study, we characterize 273 of these 395 challenging autosomal genes that have multiple implications for medical sequencing. This extended, curated benchmark reports over 17,000 SNVs, 3,600 INDELs, and 200 SVs each for GRCh37 and GRCh38.
We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically important genes including CBS, CRYAA, and KCNE1. Our proposed solution improves variant recall in these genes from 8% to 100%. This benchmark will significantly improve the comprehensive characterization of these medically relevant genes and guide new method development.
Competing Interest Statement: AMW and WJR are employees and shareholders of Pacific Biosciences. AF and CSC are employees and shareholders of DNAnexus. SMES is an employee of Roche. JL is an employee of Bionano Genomics. FJS has sponsored travel from Pacific Biosciences and Oxford Nanopore.