bioRxiv
New Results

CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data

Arda Söylev, Sevim Seda Çokoglu, Dilek Koptekin, Can Alkan, Mehmet Somel
doi: https://doi.org/10.1101/2021.12.17.473150
Arda Söylev
1Department of Computer Engineering, Konya Food and Agriculture University, Konya, 42080, Turkey
  • For correspondence: asoylev@gmail.com, msomel@metu.edu.tr
Sevim Seda Çokoglu
2Department of Biology, Middle East Technical University, Ankara, 06800, Turkey
Dilek Koptekin
3Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey
Can Alkan
4Department of Computer Engineering, Bilkent University, Ankara, 06800, Turkey
Mehmet Somel
2Department of Biology, Middle East Technical University, Ankara, 06800, Turkey
  • For correspondence: asoylev@gmail.com, msomel@metu.edu.tr

Abstract

To date, ancient genome analyses have been largely confined to the study of single nucleotide polymorphisms (SNPs). Copy number variants (CNVs) are a major contributor to disease and to evolutionary adaptation, but identifying CNVs in ancient shotgun-sequenced genomes is hampered by typically low coverage (<1×) and short fragments (<80 bp), which preclude the effective application of standard CNV detection software to ancient genomes. Here we present CONGA, tailored for genotyping CNVs at low coverage. Simulations and down-sampling experiments suggest that CONGA can genotype deletions >1 kbp with F-scores >0.75 at ≥1× coverage, and can distinguish between heterozygous and homozygous states. We applied CONGA to genotype 10,002 outgroup-ascertained deletions across a heterogeneous set of 71 ancient human genomes spanning the last 50,000 years, produced using variable experimental protocols. A fraction of these genomes (21/71) display divergent deletion profiles unrelated to their population origin but attributable to technical factors such as coverage and read length. The majority of the sample (50/71), despite originating from nine different laboratories and having coverages of 0.44×-26× (median 4×) and read lengths of 52-121 bp (median 69 bp), exhibit coherent deletion frequencies. Across these 50 genomes, inter-individual genetic diversity estimates based on SNPs and on CONGA-genotyped deletions are strongly correlated. CONGA-genotyped deletions also display signatures of purifying selection, as expected. CONGA thus paves the way for systematic CNV analyses in ancient genomes, despite the technical challenges posed by low and variable genome coverage.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • Results updated

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted August 03, 2022.

Subject Area

  • Genomics