bioRxiv
New Results

Omics-informed CNV calls reduce false positive rate and improve power for CNV-trait associations

Maarja Lepamets, Chiara Auwerx, Margit Nõukas, Annique Claringbould, Eleonora Porcu, Mart Kals, Tuuli Jürgenson, Estonian Biobank Research Team, Andrew Paul Morris, Urmo Võsa, Murielle Bochud, Silvia Stringhini, Cisca Wijmenga, Lude Franke, Hedi Peterson, Jaak Vilo, Kaido Lepik, Reedik Mägi, Zoltán Kutalik
doi: https://doi.org/10.1101/2022.02.07.479374
Maarja Lepamets
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
2Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
  • For correspondence: maarja.lepamets@ut.ee reedik.magi@ut.ee zoltan.kutalik@unil.ch
Chiara Auwerx
3Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
4Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
5Swiss Institute of Bioinformatics, Lausanne, Switzerland
6Center for Primary Care and Public Health (Unisanté), Department of epidemiology and health systems, University of Lausanne, Lausanne, Switzerland
Margit Nõukas
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
2Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
Annique Claringbould
7Structural and Computational Biology Unit, EMBL, Heidelberg, Germany
Eleonora Porcu
3Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
5Swiss Institute of Bioinformatics, Lausanne, Switzerland
6Center for Primary Care and Public Health (Unisanté), Department of epidemiology and health systems, University of Lausanne, Lausanne, Switzerland
Mart Kals
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
8Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
Tuuli Jürgenson
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
9Institute of Mathematics and Statistics, University of Tartu, Tartu, Estonia
Estonian Biobank Research Team
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
Andrew Paul Morris
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
10Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Division of Musculoskeletal and Dermatological Sciences, The University of Manchester, Manchester, UK
Urmo Võsa
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
Murielle Bochud
6Center for Primary Care and Public Health (Unisanté), Department of epidemiology and health systems, University of Lausanne, Lausanne, Switzerland
Silvia Stringhini
11Unit of Population Epidemiology, Division of Primary Care, Geneva, Switzerland
Cisca Wijmenga
12University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
Lude Franke
12University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
13Oncode Institute, Utrecht, The Netherlands
Hedi Peterson
14Institute of Computer Science, University of Tartu, Tartu, Estonia
Jaak Vilo
14Institute of Computer Science, University of Tartu, Tartu, Estonia
Kaido Lepik
4Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
5Swiss Institute of Bioinformatics, Lausanne, Switzerland
6Center for Primary Care and Public Health (Unisanté), Department of epidemiology and health systems, University of Lausanne, Lausanne, Switzerland
14Institute of Computer Science, University of Tartu, Tartu, Estonia
Reedik Mägi
1Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
Zoltán Kutalik
4Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
5Swiss Institute of Bioinformatics, Lausanne, Switzerland
6Center for Primary Care and Public Health (Unisanté), Department of epidemiology and health systems, University of Lausanne, Lausanne, Switzerland

Abstract

Copy number variations (CNVs) are believed to play an important role in a wide range of complex traits, but discovering such associations remains challenging. Whilst whole genome sequencing (WGS) is the gold standard approach for CNV detection, there are several orders of magnitude more samples with available genotyping microarray data. Such array data can be exploited for CNV detection using dedicated software (e.g., PennCNV); however, these calls suffer from elevated false positive and false negative rates. In this study, we developed a CNV quality score that weights PennCNV calls (pCNVs) by their likelihood of being true positives. First, we established a measure of pCNV reliability by leveraging evidence from multiple omics data (WGS, transcriptomics and methylomics) obtained from the same samples. Next, we built a predictor of omics-confirmed pCNVs, termed the omics-informed quality score (OQS), using only PennCNV software output parameters. Promisingly, the OQS assigned to pCNVs detected in close family members was up to 35% higher than the OQS of pCNVs not carried by other relatives (P < 3.0 × 10^−90), outperforming other scores. Finally, in an association study of four anthropometric traits in 89,516 Estonian Biobank samples, the use of OQS led to a relative increase of up to 34% in the trait variance explained by CNVs, compared to raw pCNVs or previous quality scores. Overall, we put forward a flexible framework for improving any CNV detection method by leveraging multi-omics evidence, applied it to PennCNV calls, and demonstrated its utility by increasing the statistical power of downstream association analyses.
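As an illustration only, the idea of weighting calls by a quality score before association testing can be sketched on toy data. Everything below is an assumption made for the example: the eight samples, the score values, and the helper `r_squared` are hypothetical and are not taken from the study, and the sketch does not reproduce how OQS is actually computed. It only shows why down-weighting likely false positive calls can raise the share of trait variance explained in a simple linear regression.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between x and y, i.e. the share of the
    variance in y explained by x in a simple linear regression."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical toy data (NOT from the study): 8 samples.
# The first three samples are true carriers; samples 4 and 5 are false positive calls.
raw_call = [1, 1, 1, 1, 1, 0, 0, 0]
# Hypothetical per-call quality scores in [0, 1]; low for the two false positives.
quality = [0.9, 0.8, 0.95, 0.1, 0.2, 0.0, 0.0, 0.0]
# Hypothetical trait values, shifted by about 2 units in the true carriers only.
trait = [2.1, 1.9, 2.0, 0.1, -0.1, 0.0, 0.2, -0.2]

# Replace the binary call with a quality-weighted call.
weighted_call = [c * q for c, q in zip(raw_call, quality)]

r2_raw = r_squared(raw_call, trait)
r2_weighted = r_squared(weighted_call, trait)

# Relative increase in variance explained, the kind of quantity the abstract reports.
relative_gain = (r2_weighted - r2_raw) / r2_raw

print(f"R^2 raw calls:      {r2_raw:.3f}")
print(f"R^2 weighted calls: {r2_weighted:.3f}")
print(f"relative increase:  {relative_gain:.1%}")
```

In this toy setting the weighted calls explain more trait variance because the two false positives contribute almost nothing to the predictor; the size of the gain here is an artefact of the made-up numbers and says nothing about the magnitudes reported in the abstract.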

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted February 10, 2022.
Subject Area

  • Bioinformatics