The first sheep graph-based pan-genome reveals the spectrum of structural variations and their effects on tail phenotypes

Abstract
Structural variations (SVs) are a major contributor to genetic diversity and phenotypic variation, but their prevalence and functions in domestic animals are largely unexplored. Here, we generated 26 haplotype-resolved genome assemblies from 13 genetically diverse sheep using PacBio HiFi sequencing. We constructed a graph-based ovine pan-genome and discovered 142,422 biallelic insertions and deletions, 7,028 divergent alleles and 13,419 multiallelic variations. We then used a graph-based approach to genotype the biallelic SVs in 684 individuals from 45 domestic breeds and two wild species. Integration with RNA-seq data allowed us to identify candidate expression-associated SVs. We demonstrate a direct link between SVs and phenotypes by localizing the putative causative insertion in the HOXB13 gene responsible for the long-tail trait and by identifying multiple large SVs associated with the fat-tail trait. Beyond generating a benchmark resource for ovine structural variants, our study highlights that animal genetic research will greatly benefit from using a pan-genome graph rather than a single reference genome.
Competing Interest Statement
The authors have declared no competing interest.