bioRxiv
New Results

Concerning the eXclusion in human genomics: The choice of sex chromosome representation in the human genome drastically affects number of identified variants

Brendan J. Pinto, Brian O’Connor, Michael C. Schatz, Samantha Zarate, Melissa A. Wilson
doi: https://doi.org/10.1101/2023.02.22.529542
Brendan J. Pinto
1School of Life Sciences, Arizona State University, Tempe AZ 85282 USA
2Center for Evolution and Medicine, Arizona State University, Tempe AZ 85282 USA
3Department of Zoology, Milwaukee Public Museum, Milwaukee, WI 53233 USA
  • For correspondence: brendan.pinto@asu.edu mwilsons@asu.edu
Brian O’Connor
4Sage Bionetworks, Seattle WA 98121 USA
Michael C. Schatz
5Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218 USA
Samantha Zarate
5Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218 USA
Melissa A. Wilson
1School of Life Sciences, Arizona State University, Tempe AZ 85282 USA
2Center for Evolution and Medicine, Arizona State University, Tempe AZ 85282 USA
6The Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe AZ 85282 USA

Abstract

Over the past 30 years, a community of scientists has pieced together every base pair of the human reference genome, from telomere to telomere. Yet most human genomics studies omit more than 5% of the genome from their analyses. Under ‘normal’ circumstances, omitting any chromosome(s) from analysis of the human genome would be cause for concern; the sex chromosomes are the exception. Sex chromosomes in eutherians share an evolutionary origin as an ancestral pair of autosomes. In humans, they share three regions of high sequence identity (~98-100%), which, along with the unique transmission patterns of the sex chromosomes, introduce technical artifacts into genomic analyses. However, the human X chromosome bears numerous important genes, including more “immune response” genes than any other chromosome, which makes its exclusion irresponsible given that sex differences across human diseases are widespread. To better characterize the effect that including or excluding the X chromosome may have on the variants called, we conducted a pilot study on the Terra cloud platform to replicate a subset of standard genomic practices using both the CHM13 reference genome and a sex chromosome complement-aware (SCC-aware) reference genome. We compared the quality of variant calling, expression quantification, and allele-specific expression using these two reference genome versions across 50 human samples, annotated as female, from the Genotype-Tissue Expression (GTEx) consortium. We found that, after correction, the whole X chromosome (100%) can generate reliable variant calls, allowing the whole genome to be included in human genomics analyses, a departure from the status quo of omitting the sex chromosomes from empirical and clinical genomics studies.
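
The abstract does not describe how the SCC-aware reference is constructed. As an illustration only, the sketch below assumes one approach used elsewhere in the literature for samples without a Y chromosome: hard-masking the Y sequence with N so that reads from X cannot mis-align to its homologous regions. The sequence name `chrY`, the helper names `read_fasta` and `mask_for_xx`, and the toy sequences are all hypothetical and are not taken from the preprint.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def mask_for_xx(
    records: Iterable[Tuple[str, str]], y_name: str = "chrY"
) -> Iterator[Tuple[str, str]]:
    """Assumed masking rule: replace every base of the Y sequence with N."""
    for name, seq in records:
        yield name, ("N" * len(seq) if name == y_name else seq)


# Toy reference; real chromosomes are far too large to inline like this.
toy = [">chr1", "ACGT", ">chrX", "GGCC", "AA", ">chrY", "TTGA"]
masked = dict(mask_for_xx(read_fasta(toy)))
```

On a real assembly the same effect is usually achieved with dedicated tools and a BED file of regions to mask, rather than by loading whole chromosomes into memory.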

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 22, 2023.
Subject Area

  • Genomics