New Results

Optimal gap-affine alignment in O(s) space

Santiago Marco-Sola, Jordan M. Eizenga, Andrea Guarracino, Benedict Paten, Erik Garrison, Miquel Moreto
doi: https://doi.org/10.1101/2022.04.14.488380
Santiago Marco-Sola
1Computer Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain
2Departament d’Arquitectura de Computadors i Sistemes Operatius, Universitat Autònoma de Barcelona, Barcelona, 08193, Spain
  • For correspondence: santiagomsola@gmail.com
Jordan M. Eizenga
3Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
Andrea Guarracino
4Genomics Research Centre, Human Technopole, Viale Rita Levi-Montalcini 1, Milan, 20157, Italy
Benedict Paten
3Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
Erik Garrison
5Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
Miquel Moreto
1Computer Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain
6Departament d’Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, 08034, Spain

Abstract

Motivation Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and more scalable algorithms that can cope with ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) performs exact gap-affine alignment in O(ns) time, where s is the optimal score and n is the sequence length. Despite this time bound, WFA's O(s^2) memory requirements become impractical for genome-scale alignments, leading to a need for further improvement.

Results In this paper, we present the bidirectional WFA algorithm (BiWFA), the first gap-affine algorithm capable of computing optimal alignments in O(s) memory while retaining WFA's time complexity of O(ns). As a result, this work improves on the lowest previously known memory bound, O(n), for computing gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs when aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long, while maintaining competitive execution times.
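
To give a feel for why wavefront-based methods can work in memory proportional to the score, here is a minimal Python sketch. It is not the BiWFA algorithm and not the authors' implementation: it assumes unit-cost edit distance instead of gap-affine penalties, and it computes only the optimal score, not the alignment. The function name `wavefront_edit_distance` and its internals are hypothetical, chosen for illustration. The sketch keeps only the most recent wavefront, which holds at most 2s+1 diagonals at score s.

```python
def wavefront_edit_distance(a: str, b: str) -> int:
    """Score-only wavefront computation of unit-cost edit distance.

    Illustrative assumption: edit distance stands in for gap-affine scoring.
    Diagonals are indexed by k = j - i, where i indexes a and j indexes b.
    """
    n, m = len(a), len(b)
    target_diag = m - n  # diagonal that contains the end cell (n, m)

    def extend(k: int, j: int) -> int:
        # Follow exact matches along diagonal k for free.
        i = j - k
        while i < n and j < m and a[i] == b[j]:
            i += 1
            j += 1
        return j

    # wavefront[k] = furthest offset j reached on diagonal k at the current score
    wavefront = {0: extend(0, 0)}
    score = 0
    while wavefront.get(target_diag, -1) < m:
        score += 1
        nxt = {}
        for k in range(-score, score + 1):
            candidates = []
            if k in wavefront:       # mismatch: stay on diagonal k, advance j
                candidates.append(wavefront[k] + 1)
            if k - 1 in wavefront:   # insertion: consume one character of b
                candidates.append(wavefront[k - 1] + 1)
            if k + 1 in wavefront:   # deletion: consume one character of a
                candidates.append(wavefront[k + 1])
            # Keep only candidates that stay inside the alignment matrix.
            valid = [j for j in candidates if j <= m and 0 <= j - k <= n]
            if valid:
                nxt[k] = extend(k, max(valid))
        wavefront = nxt  # earlier wavefronts are discarded
    return score
```

For example, `wavefront_edit_distance("kitten", "sitting")` returns 3. Discarding earlier wavefronts is what keeps the memory footprint small here, but it also means this sketch cannot trace back the alignment path; recovering the alignment within O(s) memory is the problem the abstract says BiWFA addresses.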

Availability All code is publicly available at https://github.com/smarco/BiWFA-paper

Contact santiagomsola{at}gmail.com

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • Added an extended experimental evaluation and a BiWFA example diagram.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 17, 2022.
Subject Area

  • Bioinformatics