bioRxiv
New Results

AStarix: Fast and Optimal Sequence-to-Graph Alignment

Pesho Ivanov, Benjamin Bichsel, Harun Mustafa, André Kahles, Gunnar Rätsch, Martin Vechev
doi: https://doi.org/10.1101/2020.01.22.915496
Pesho Ivanov
Department of Computer Science, ETH Zurich, Switzerland
  • For correspondence: pesho@inf.ethz.ch
Benjamin Bichsel
Department of Computer Science, ETH Zurich, Switzerland
Harun Mustafa
Department of Computer Science, ETH Zurich, Switzerland
André Kahles
Department of Computer Science, ETH Zurich, Switzerland
Gunnar Rätsch
Department of Computer Science, ETH Zurich, Switzerland
Martin Vechev
Department of Computer Science, ETH Zurich, Switzerland

Abstract

We present an algorithm for the optimal alignment of sequences to genome graphs. It works by phrasing the edit distance minimization task as finding a shortest path on an implicit alignment graph. To find a shortest path, we instantiate the A⋆ paradigm with a novel domain-specific heuristic function that accounts for the upcoming subsequence in the query to be aligned, resulting in a provably optimal alignment algorithm called AStarix.
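The shortest-path formulation can be illustrated with a minimal Python sketch. It is not the AStarix implementation: the graph encoding (a dict of labeled edges), the unit edit costs, the function names `align` and `unseen_letters_heuristic`, and in particular the toy heuristic are all assumptions made for illustration. The heuristic here only counts upcoming query letters that label no edge in the graph, which is far weaker than the heuristic described in the paper, but it is admissible, so the returned cost is still optimal.

```python
import heapq

def align(edges, query, heuristic=None):
    """Minimum edit cost of aligning `query` to any path in the graph.

    edges: dict mapping node -> list of (next_node, label) pairs.
    A search state (u, i) means: standing at node u with i query letters aligned.
    The alignment graph is implicit; states are generated only when reached.
    """
    n = len(query)
    h = heuristic or (lambda node, i: 0)  # h = 0 reduces A* to Dijkstra
    nodes = set(edges) | {v for outs in edges.values() for v, _ in outs}
    best = {}
    heap = []
    for u in sorted(nodes):  # the alignment may start at any node
        best[(u, 0)] = 0
        heapq.heappush(heap, (h(u, 0), 0, u, 0))
    while heap:
        _, g, u, i = heapq.heappop(heap)
        if g > best.get((u, i), float("inf")):
            continue  # stale queue entry
        if i == n:
            return g  # whole query aligned: optimal cost
        steps = [(u, i + 1, 1)]  # insertion: consume a query letter, stay at u
        for v, label in edges.get(u, ()):
            steps.append((v, i + 1, 0 if label == query[i] else 1))  # match / substitution
            steps.append((v, i, 1))  # deletion: follow an edge, consume nothing
        for v, j, cost in steps:
            ng = g + cost
            if ng < best.get((v, j), float("inf")):
                best[(v, j)] = ng
                heapq.heappush(heap, (ng + h(v, j), ng, v, j))
    return None  # graph has no nodes

def unseen_letters_heuristic(edges, query):
    """Toy admissible heuristic (an assumption, not the paper's heuristic):
    every remaining query letter that labels no edge costs at least 1."""
    labels = {label for outs in edges.values() for _, label in outs}
    suffix = [0] * (len(query) + 1)
    for i in range(len(query) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + (query[i] not in labels)
    return lambda node, i: suffix[i]

# Tiny graph with one variant site: 0 -A-> 1 -C or T-> 2 -G-> 3
graph = {0: [(1, "A")], 1: [(2, "C"), (2, "T")], 2: [(3, "G")]}
cost_exact = align(graph, "ATG")
cost_one_edit = align(graph, "ANG", unseen_letters_heuristic(graph, "ANG"))
```

Because the heuristic never overestimates the remaining cost, the first goal state taken from the queue carries the optimal edit cost; a more informed heuristic changes how many states are explored, not the result.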

Experimental evaluation of AStarix shows that it is 1–2 orders of magnitude faster than state-of-the-art optimal algorithms on the task of aligning Illumina reads to reference genome graphs. Implementations and evaluations are available at https://github.com/eth-sri/astarix.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • pesho.ivanov@inf.ethz.ch, benjamin.bichsel@inf.ethz.ch, harun.mustafa@inf.ethz.ch, andre.kahles@inf.ethz.ch, gunnar.ratsch@inf.ethz.ch, martin.vechev@inf.ethz.ch

  • The optimal algorithm from the GraphAligner tool is referred to as BitParallel.

  • https://github.com/eth-sri/astarix

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted June 11, 2020.
AStarix: Fast and Optimal Sequence-to-Graph Alignment
Pesho Ivanov, Benjamin Bichsel, Harun Mustafa, André Kahles, Gunnar Rätsch, Martin Vechev
bioRxiv 2020.01.22.915496; doi: https://doi.org/10.1101/2020.01.22.915496