bioRxiv
New Results

User-guided local and global copy-number segmentation for tumor sequencing data

Zubair Lalani, Gillian Chu, Silas Hsu, Simone Zaccaria, Mohammed El-Kebir
doi: https://doi.org/10.1101/2022.01.15.476457
Zubair Lalani
1Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Gillian Chu
1Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Silas Hsu
1Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
Simone Zaccaria
2Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
3Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
  • For correspondence: s.zaccaria@ucl.ac.uk
Mohammed El-Kebir
1Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
4Cancer Center at Illinois, University of Illinois Urbana-Champaign, Urbana, IL, USA
  • For correspondence: melkebir@illinois.edu

Abstract

Motivation: Copy-number aberrations (CNAs) are genetic alterations that increase or decrease the number of copies of large genomic segments. Although they are ubiquitous in cancer and consequently a critical area of current cancer research, identifying CNAs from DNA sequencing data is challenging because it requires partitioning the genome into complex segments that may not be contiguous. Existing segmentation algorithms address these challenges either by leveraging the local information among neighboring genomic regions, or by globally grouping genomic regions that are affected by similar CNAs across the entire genome. However, both approaches have limitations: overclustering in the case of local segmentation, or the omission of clusters corresponding to focal CNAs in the case of global segmentation. Importantly, inaccurate segmentation leads to inaccurate identification of important CNAs.
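
To make the distinction between the two strategies concrete, here is a minimal Python sketch on toy data. It is not the CNAViz implementation, nor any of the existing algorithms referred to above: the function names, the tolerance parameter, and the read-depth values are all hypothetical, and real methods work on richer signals than a single value per genomic bin.

```python
from typing import List


def local_segments(values: List[float], tol: float) -> List[int]:
    """Local view (hypothetical helper): walk along the genome and start a
    new segment whenever a bin differs from the running mean of the current
    segment by more than tol. Segments are always contiguous."""
    labels: List[int] = []
    seg_id, seg_sum, seg_len = -1, 0.0, 0
    for v in values:
        if seg_len == 0 or abs(v - seg_sum / seg_len) > tol:
            seg_id += 1
            seg_sum, seg_len = 0.0, 0
        seg_sum += v
        seg_len += 1
        labels.append(seg_id)
    return labels


def global_clusters(values: List[float], tol: float) -> List[int]:
    """Global view (hypothetical helper): ignore genomic position, sort bins
    by value, and start a new cluster whenever the gap between consecutive
    sorted values exceeds tol. Clusters may be non-contiguous."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    cluster = 0
    for rank, i in enumerate(order):
        if rank > 0 and values[i] - values[order[rank - 1]] > tol:
            cluster += 1
        labels[i] = cluster
    return labels


# Toy read-depth ratios for nine consecutive bins (made-up numbers):
# two copy-number states that alternate along the genome.
rdr = [1.0, 1.02, 0.98, 1.5, 1.52, 1.0, 0.99, 1.49, 1.51]

local = local_segments(rdr, tol=0.2)
global_ = global_clusters(rdr, tol=0.2)

print(local)    # one label per bin, contiguous segments
print(global_)  # one label per bin, position-independent clusters
```

On this toy input the local pass reports four contiguous segments for what are only two underlying states, while the global pass recovers the two states by grouping non-adjacent bins. A very short focal event, by contrast, could be absorbed into a larger cluster by a global method that limits the number of clusters, which is the second limitation described above.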

Results: We introduce CNAViz, a web-based tool that enables the user to perform local and global segmentation simultaneously, thus overcoming the limitations of each approach. Using simulated data, we demonstrate that, by several metrics, CNAViz yields more accurate segmentations than existing local and global segmentation methods. Moreover, we analyze six bulk DNA sequencing samples from three breast cancer patients. By validating against parallel single-cell DNA sequencing data from the same samples, we show that CNAViz’s more accurate segmentation improves the accuracy of downstream copy-number calling.

Availability and implementation: https://github.com/elkebir-group/cnaviz

Contact: s.zaccaria@ucl.ac.uk, melkebir@illinois.edu

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • † Joint first authorship.

  • https://github.com/elkebir-group/cnaviz

  • https://github.com/elkebir-group/cnaviz-paper

  • 1 https://github.com/d3/d3

  • 2 https://github.com/d3fc/d3fc

  • 3 https://github.com/crossfilter/crossfilter

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted January 17, 2022.
Subject Area

  • Genomics