bioRxiv
New Results

Mapping short reads, faithfully

Eduard Valera Zorita, Ruggero Cortini, Guillaume J. Filion
doi: https://doi.org/10.1101/2020.02.10.942599
Eduard Valera Zorita
1Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
Ruggero Cortini
1Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
Guillaume J. Filion
1Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
2University Pompeu Fabra (UPF), Barcelona, Spain
3Department of Biological Sciences, University of Toronto Scarborough, Toronto, ON, Canada
  • For correspondence: guillaume.filion@gmail.com

Abstract

Mapping is the process of finding the original location of a DNA read in a reference sequence, typically a genome. Short read mappers are software tools used in most applications that involve high-throughput sequencing. As such, they must be continuously improved to keep up with increasing needs. Modern mappers rely on seeding heuristics, making them fast but inexact. For lack of a method to compute the reliability of their own output, mappers have so far used approximations of variable quality. Here we focus on faithfulness, the capacity to provide accurate mapping confidence, and we devise a strategy to map short reads faithfully. The key is to estimate the repetitiveness of the target reference, which is the dominant factor for the reliability of the mapping process. This approach highlights the existence of a class of reads that can be mapped with unprecedented confidence. We exploit this strategy in a prototype mapper that is competitive with state-of-the-art mappers BWA-MEM and Bowtie2, with the benefit of faithfulness. The software is open-source and available for download at https://github.com/gui11aume/mmp.
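One way to make the notion of faithfulness concrete is to compare the error probability a mapper claims with the error rate actually observed on reads whose true origin is known, for example simulated reads. The sketch below assumes the usual SAM convention that mapping quality is Phred-scaled (MAPQ = -10 log10 of the probability that the reported location is wrong). The function names and the toy records are hypothetical illustrations; they are not taken from the preprint or from the mmp software.

```python
from collections import defaultdict


def mapq_to_error_prob(mapq):
    """Convert a Phred-scaled mapping quality to the claimed probability
    that the reported location is wrong (SAM convention)."""
    return 10 ** (-mapq / 10)


def calibration_table(records):
    """Group reads by reported MAPQ and compare claimed vs. observed error.

    records: iterable of (mapq, is_correct) pairs, where is_correct says
    whether the read was mapped to its true location (known in a simulation).
    Returns {mapq: (n_reads, claimed_error, observed_error)}.
    """
    counts = defaultdict(lambda: [0, 0])  # mapq -> [total reads, wrong reads]
    for mapq, is_correct in records:
        counts[mapq][0] += 1
        if not is_correct:
            counts[mapq][1] += 1
    return {
        q: (n, mapq_to_error_prob(q), wrong / n)
        for q, (n, wrong) in sorted(counts.items())
    }


# Toy data (hypothetical): 100 reads reported at MAPQ 10, 100 at MAPQ 20.
records = (
    [(10, True)] * 90 + [(10, False)] * 10
    + [(20, True)] * 95 + [(20, False)] * 5
)

table = calibration_table(records)
for q, (n, claimed, observed) in table.items():
    print(f"MAPQ {q}: n={n} claimed={claimed:.3f} observed={observed:.3f}")
```

In this toy data set the MAPQ 10 reads are well calibrated (claimed and observed error both 0.1), whereas the MAPQ 20 reads are overconfident (claimed 0.01, observed 0.05). In the sense used in the abstract, a faithful mapper would keep the two columns close at every quality level.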

Copyright 
The copyright holder has placed this preprint in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, remix, or adapt this material for any purpose without crediting the original authors.
Posted February 11, 2020.
Subject Area

  • Bioinformatics