bioRxiv
New Results

Assembly of 43 diverse human Y chromosomes reveals extensive complexity and variation

Pille Hallast, Peter Ebert, Mark Loftus, Feyza Yilmaz, Peter A. Audano, Glennis A. Logsdon, Marc Jan Bonder, Weichen Zhou, Wolfram Höps, Kwondo Kim, Chong Li, Philip Dishuck, David Porubsky, Fotios Tsetsos, Jee Young Kwon, Qihui Zhu, Katherine M. Munson, Patrick Hasenfeld, William T. Harvey, Alexandra P. Lewis, Jennifer Kordosky, Kendra Hoekzema, The Human Genome Structural Variation Consortium (HGSVC), Jan O. Korbel, Chris Tyler-Smith, Evan E. Eichler, Xinghua Shi, Christine R. Beck, Tobias Marschall, Miriam K. Konkel, Charles Lee
doi: https://doi.org/10.1101/2022.12.01.518658
Pille Hallast
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Peter Ebert
2Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
3Core Unit Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
Mark Loftus
4Clemson University, Department of Genetics & Biochemistry, Clemson, SC, USA
Feyza Yilmaz
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Peter A. Audano
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Glennis A. Logsdon
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
Marc Jan Bonder
6German Cancer Research Center (DKFZ), Division of Computational Genomics and Systems Genetics, Heidelberg, Germany
Weichen Zhou
7University of Michigan Medical School, Department of Computational Medicine and Bioinformatics, Ann Arbor, MI, USA
Wolfram Höps
8European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
Kwondo Kim
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Chong Li
9Temple University, Department of Computer and Information Sciences, Philadelphia, PA, USA
Philip Dishuck
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
David Porubsky
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
Fotios Tsetsos
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Jee Young Kwon
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Qihui Zhu
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Katherine M. Munson
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
Patrick Hasenfeld
8European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
William T. Harvey
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
Alexandra P. Lewis
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
Jennifer Kordosky
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
Kendra Hoekzema
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
Jan O. Korbel
8European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
Chris Tyler-Smith
10Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Evan E. Eichler
5University of Washington School of Medicine, Department of Genome Sciences, Seattle, WA, USA
11Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
Xinghua Shi
9Temple University, Department of Computer and Information Sciences, Philadelphia, PA, USA
Christine R. Beck
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
12The University of Connecticut Health Center, Farmington, CT, USA
Tobias Marschall
2Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
Miriam K. Konkel
4Clemson University, Department of Genetics & Biochemistry, Clemson, SC, USA
Charles Lee
1The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  • For correspondence: charles.lee@jax.org

Abstract

The prevalence of highly repetitive sequences within the human Y chromosome has led to its incomplete assembly and systematic omission from genomic analyses. Here, we present long-read de novo assemblies of 43 diverse Y chromosomes, three of them contiguously assembled, including two from deep-rooted African Y lineages. Examination of the full extent of genetic variation between Y chromosomes across 180,000 years of human evolution reveals remarkable complexity and diversity in size and structure, in contrast with a low level of base substitution variation. The size of the Y chromosome assemblies varies extensively, from 45.2 to 84.9 Mbp, with individual repeat arrays showing up to a 6.7-fold difference in length across samples. Half of the male-specific euchromatic region is subject to large (up to 5.94 Mbp) inversions with a >2-fold higher recurrence rate compared to the rest of the human genome. The Y centromere, composed of 171 bp α-satellite monomer units, appears to have evolved from tandem arrays of a 36-mer ancestral higher-order repeat (HOR), which has been predominantly replaced by a 34-mer HOR, and reveals a pattern of higher sequence variation towards the short-arm side. The Yq12 heterochromatic region is ubiquitously flanked by approximately 649 kbp and 472 kbp inversions that maintain the alternating arrays of DYZ1 and DYZ2 repeat units in between. While the sizes and the distribution of the DYZ1 and DYZ2 arrays vary considerably, primarily due to local expansions and contractions, the copy number ratio between the DYZ1 and DYZ2 monomer repeat units remains consistently close to 1:1. In addition, we have identified on average 65 kbp of novel sequence per Y chromosome. The availability of sequence-resolved Y chromosomes from multiple samples provides a basis for identifying new associations of specific traits with the Y chromosome and garnering novel evolutionary insights.
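
As a rough illustration of the figures quoted in the abstract, the sketch below computes the nominal length of one centromeric HOR unit and the fold difference between the smallest and largest assemblies. It assumes every α-satellite monomer is exactly 171 bp, which is a simplification since real monomers vary slightly in length. The function names are hypothetical and are not taken from the preprint or its pipelines.

```python
# Back-of-the-envelope arithmetic using only numbers stated in the abstract.
# Assumption: each alpha-satellite monomer is exactly 171 bp (nominal value).

MONOMER_BP = 171  # alpha-satellite monomer length from the abstract


def nominal_hor_length(monomers_per_hor: int, monomer_bp: int = MONOMER_BP) -> int:
    """Nominal length in bp of one higher-order repeat (HOR) unit."""
    if monomers_per_hor <= 0 or monomer_bp <= 0:
        raise ValueError("monomer count and monomer length must be positive")
    return monomers_per_hor * monomer_bp


def fold_difference(smallest: float, largest: float) -> float:
    """Ratio of the largest to the smallest value (both in the same unit)."""
    if smallest <= 0:
        raise ValueError("smallest value must be positive")
    return largest / smallest


if __name__ == "__main__":
    # Ancestral 36-mer HOR versus the now-predominant 34-mer HOR.
    print("36-mer HOR:", nominal_hor_length(36), "bp")
    print("34-mer HOR:", nominal_hor_length(34), "bp")
    # Assembly sizes reported in the abstract: 45.2 to 84.9 Mbp.
    print("Assembly size fold difference:", round(fold_difference(45.2, 84.9), 2))
```

Under these assumptions, replacing the 36-mer HOR with the 34-mer HOR shortens each nominal repeat unit by two monomers (342 bp), and the largest assembly is a little under twice the size of the smallest.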

Competing Interest Statement

E.E.E. is a scientific advisory board (SAB) member of Variant Bio. C.L. is an SAB member of Nabsys and Genome Insight. The following authors have previously disclosed a patent application (no. EP19169090) relevant to Strand-seq: J.O.K., T.M., and D.P.; the other authors declare no competing interests.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted December 01, 2022.
Subject Area

  • Genomics