New Results

LEVIATHAN: efficient discovery of large structural variants by leveraging long-range information from Linked-Reads data

Pierre Morisse, Fabrice Legeai, Claire Lemaitre
doi: https://doi.org/10.1101/2021.03.25.437002
Pierre Morisse
1Univ Rennes, Inria, CNRS, IRISA, 35000, Rennes, France
  • For correspondence: pierre.morisse@inria.fr
Fabrice Legeai
1Univ Rennes, Inria, CNRS, IRISA, 35000, Rennes, France
2INRAE, UMR 1349 INRAE/Agrocampus Ouest/Université Rennes 1, Institut de Génétique, Environnement et Protection des Plantes (IGEPP), F-35653 Le Rheu, France
Claire Lemaitre
1Univ Rennes, Inria, CNRS, IRISA, 35000, Rennes, France

Abstract

Linked-Reads technologies, popularized by 10x Genomics, combine the high quality and low cost of short-read sequencing with long-range information by adding barcodes that tag reads originating from the same long DNA fragment. Thanks to their high quality and long-range information, such reads are particularly useful for applications such as genome scaffolding and structural variant calling. As a result, multiple structural variant calling methods have been developed within the last few years. However, these methods were mainly tested on human data and do not run well on non-human organisms, for which reference genomes are highly fragmented or sequencing data display high levels of heterozygosity. Moreover, even on human data, most tools still require large amounts of computing resources. We present LEVIATHAN, a new structural variant calling tool that aims to address these issues, and in particular to scale better and apply to a wide variety of organisms. Our method relies on a barcode index that allows the similarity of all possible pairs of regions to be compared quickly in terms of the number of barcodes they share. Region pairs sharing a sufficient number of barcodes are then considered as potential structural variants, and complementary, classical short-read methods are applied to further refine the breakpoint coordinates. Our experiments on simulated data show that our method compares well to the state of the art, in terms of recall and precision as well as resource consumption. Moreover, LEVIATHAN was successfully applied to a real dataset from a non-model organism, while all other tools either failed to run or required unreasonable amounts of resources. LEVIATHAN is implemented in C++, supported on Linux platforms, and available under the AGPL-3.0 License at https://github.com/morispi/LEVIATHAN.
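
The barcode-sharing idea described in the abstract can be illustrated with a minimal Python sketch. This is not LEVIATHAN's implementation (which is in C++) and makes several assumptions for illustration only: regions are fixed-size windows, alignments are given as hypothetical (chromosome, position, barcode) tuples, and the names build_barcode_index, candidate_pairs, region_size and min_shared are invented here. It also does not exclude adjacent regions, which would naturally share barcodes, nor does it perform any breakpoint refinement.

```python
from collections import defaultdict
from itertools import combinations


def build_barcode_index(alignments, region_size):
    """Map each (chromosome, window number) region to the set of barcodes seen in it.

    `alignments` is an iterable of (chromosome, position, barcode) tuples;
    this input format is an assumption made for the sketch.
    """
    index = defaultdict(set)
    for chrom, pos, barcode in alignments:
        index[(chrom, pos // region_size)].add(barcode)
    return index


def candidate_pairs(index, min_shared):
    """Return region pairs sharing at least `min_shared` barcodes, with their counts."""
    # Invert the index so that only regions sharing a barcode are ever paired,
    # rather than intersecting the barcode sets of every possible region pair.
    regions_by_barcode = defaultdict(set)
    for region, barcodes in index.items():
        for barcode in barcodes:
            regions_by_barcode[barcode].add(region)

    shared = defaultdict(int)
    for regions in regions_by_barcode.values():
        for a, b in combinations(sorted(regions), 2):
            shared[(a, b)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


if __name__ == "__main__":
    # Toy data: barcodes A and B both appear in two distant windows of chr1.
    toy = [
        ("chr1", 100, "A"), ("chr1", 150, "B"), ("chr1", 120, "C"),
        ("chr1", 9100, "A"), ("chr1", 9200, "B"),
        ("chr2", 50, "C"),
    ]
    index = build_barcode_index(toy, region_size=1000)
    print(candidate_pairs(index, min_shared=2))
```

On the toy data, only the two distant chr1 windows share two barcodes, so they form the single candidate pair; the chr1/chr2 pair shares one barcode and falls below the threshold.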

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted March 25, 2021.

Subject Area

  • Bioinformatics