A high-quality reference panel reveals the complexity and distribution of structural genome changes in a human population

Abstract
Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, most studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation.
Here, we analyse whole-genome sequencing data of 769 individuals from 250 Dutch families and provide a haplotype-resolved map of 1.9 million genome variants across 9 variant classes, including novel forms of complex indels and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants between 21 and 100 bp in size. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show that 191 known, trait-associated SNPs are in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals. Our findings are essential for genome-wide association studies.