PT - JOURNAL ARTICLE
AU - O. L. Rodriguez
AU - W. S. Gibson
AU - T. Parks
AU - M. Emery
AU - J. Powell
AU - M. Strahl
AU - G. Deikus
AU - K. Auckland
AU - E. E. Eichler
AU - W. A. Marasco
AU - R. Sebra
AU - A. J. Sharp
AU - M. L. Smith
AU - A. Bashir
AU - C. T. Watson
TI - A novel framework for characterizing genomic haplotype diversity in the human immunoglobulin heavy chain locus
AID - 10.1101/2020.04.19.049270
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.04.19.049270
4099 - http://biorxiv.org/content/early/2020/05/07/2020.04.19.049270.short
4100 - http://biorxiv.org/content/early/2020/05/07/2020.04.19.049270.full
AB - An incomplete ascertainment of genetic variation within the highly polymorphic immunoglobulin heavy chain locus (IGH) has hindered our ability to define genetic factors that influence antibody and B cell mediated processes. To date, methods for locus-wide genotyping of all IGH variant types do not exist. Here, we combine targeted long-read sequencing with a novel bioinformatics tool, IGenotyper, to fully characterize genetic variation within IGH in a haplotype-specific manner. We apply this approach to eight human samples, including a haploid cell line and two mother-father-child trios, and demonstrate the ability to generate high-quality assemblies (>98% complete and >99% accurate), genotypes, and gene annotations, including 2 novel structural variants and 16 novel gene alleles. We show that multiplexing allows for scaling of the approach without impacting data quality, and that our genotype call sets are more accurate than short-read (>35% increase in true positives and >97% decrease in false-positives) and array/imputation-based datasets. This framework establishes a foundation for leveraging IG genomic data to study population-level variation in the antibody response. Competing Interest Statement: E.E.E. is on the scientific advisory board (SAB) of DNAnexus, Inc.