Reduced space sequence alignment

J A Grice; R Hughey; D Speck

doi:10.1093/bioinformatics/13.1.45

Comput Appl Biosci. 1997 Feb;13(1):45-53. doi: 10.1093/bioinformatics/13.1.45.

Authors

J A Grice¹, R Hughey, D Speck

Affiliation

¹ University of California, Santa Cruz 95064, USA.

PMID: 9088708

Abstract

Motivation: Sequence alignment is the problem of finding the optimal character-by-character correspondence between two sequences. It can be readily solved in $O(n^2)$ time and $O(n^2)$ space on a serial machine, or in $O(n)$ time with $O(n)$ space per $O(n)$ processing elements on a parallel machine. Hirschberg's divide-and-conquer approach for finding the single best path reduces space use by a factor of $n$ while inducing only a small constant slowdown to the serial version.
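As a point of reference, the quadratic-time, quadratic-space serial solution mentioned above can be sketched as follows. This is a minimal illustration, assuming global alignment with a linear gap penalty; the function name `align_full` and the scoring values are hypothetical and are not taken from the paper.

```python
def align_full(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment that keeps the whole (len(a)+1) x (len(b)+1) score matrix.

    Scoring values are illustrative assumptions, not the paper's parameters.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill every cell: O(n * m) time and O(n * m) space.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best path.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))
```

The traceback is what forces the full matrix to be kept here; computing the score alone needs only two rows.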

Results: This paper presents a family of methods for computing sequence alignments with reduced memory that are well suited to serial or parallel implementation. Unlike the divide-and-conquer approach, they can be used in the forward-backward (Baum-Welch) training of linear hidden Markov models, and they avoid data-dependent repartitioning, making them easier to parallelize. The algorithms feature, for an arbitrary integer $L$, a factor proportional to $L$ slowdown in exchange for reducing space requirement from $O(n^2)$ to $O(n\sqrt[L]{n})$. A single best path member of this algorithm family matches the quadratic time and linear space of the divide-and-conquer algorithm. Experimentally, the $O(n^{1.5})$-space member of the family is 15-40% faster than the $O(n)$-space divide-and-conquer algorithm.
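The space-for-time trade described above can be illustrated with a checkpointing sketch for the two-level case. The abstract does not spell out the partitioning scheme, so the following rests on assumptions: checkpoints are whole rows of the score matrix saved roughly every square root of n rows, and each block of rows is recomputed once during traceback. The names `next_row` and `align_checkpointed` and the scoring values are hypothetical.

```python
import math


def next_row(prev, ai, b, match, mismatch, gap):
    """Compute one row of the score matrix from the row above it."""
    row = [prev[0] + gap]
    for j in range(1, len(b) + 1):
        sub = match if ai == b[j - 1] else mismatch
        row.append(max(prev[j - 1] + sub, prev[j] + gap, row[j - 1] + gap))
    return row


def align_checkpointed(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment storing about sqrt(n) checkpoint rows plus one block.

    Assumed scheme (not confirmed by the source): row checkpoints every k rows.
    """
    n, m = len(a), len(b)
    k = max(1, math.isqrt(n))  # checkpoint spacing, roughly sqrt(n)

    # Forward pass: keep only every k-th row.
    row = [j * gap for j in range(m + 1)]
    checkpoints = {0: row}
    for i in range(1, n + 1):
        row = next_row(row, a[i - 1], b, match, mismatch, gap)
        if i % k == 0:
            checkpoints[i] = row
    final_score = row[m]

    # Backward pass: rebuild one block of at most k rows at a time.
    top, bottom = [], []
    i, j = n, m
    while i > 0:
        start = ((i - 1) // k) * k  # nearest checkpoint strictly above row i
        block = [checkpoints[start]]
        for r in range(start + 1, i + 1):
            block.append(next_row(block[-1], a[r - 1], b, match, mismatch, gap))
        while i > start:
            cur, prev = block[i - start], block[i - start - 1]
            if j > 0 and cur[j] == prev[j - 1] + (
                    match if a[i - 1] == b[j - 1] else mismatch):
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
            elif cur[j] == prev[j] + gap:
                top.append(a[i - 1])
                bottom.append("-")
                i -= 1
            else:
                top.append("-")
                bottom.append(b[j - 1])
                j -= 1

    # Row 0 reached: any remaining characters of b align to gaps.
    while j > 0:
        top.append("-")
        bottom.append(b[j - 1])
        j -= 1
    return final_score, "".join(reversed(top)), "".join(reversed(bottom))
```

In this sketch every cell is computed at most twice, once in the forward pass and once when its block is rebuilt, while memory holds about 2·sqrt(n) rows instead of n. The checkpoint rows do not depend on where the best path runs, which is one way to read the abstract's point about avoiding data-dependent repartitioning.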

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Computer Systems
Computers*
Evaluation Studies as Topic
Markov Chains
Sequence Alignment / methods*
Sequence Alignment / statistics & numerical data
Software