Using traveling salesman problem algorithms for evolutionary tree construction

Bioinformatics. 2000 Jul;16(7):619-27. doi: 10.1093/bioinformatics/16.7.619.

Abstract

Motivation: The construction of evolutionary trees is one of the major problems in computational biology, mainly due to its complexity.

Results: We present a new tree construction method that constructs a tree with minimum score for a given set of sequences, where the score is the amount of evolution measured in PAM distances. To do this, the problem of tree construction is reduced to the Traveling Salesman Problem (TSP). The inputs to the TSP algorithm are the pairwise distances of the sequences, and the output is a circular tour through the optimal, unknown tree plus the minimum score of the tree. The circular order and the score can be used to construct the topology of the optimal tree. Our method can be used for any scoring function that correlates with the amount of change along the branches of an evolutionary tree. For instance, it could also be used for parsimony scores, but it cannot be used for a least squares fit of distances. A TSP solution reduces the space of all possible trees to 2n. Using this order, we can guarantee that we reconstruct a correct evolutionary tree if the absolute value of the error for each distance measurement is smaller than a bound expressed in terms of the length of the shortest edge in the tree. For data sets with large errors, a dynamic programming approach is used to reconstruct the tree. Finally, simulations and experiments with real data are shown.
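
The link between the circular tour and the tree score can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses a brute-force search in place of a real TSP solver, and the four-leaf distance matrix is a made-up example derived from a hypothetical tree ((A,B),(C,D)) with leaf edges 1, 2, 3, 4 and an internal edge of 5. It relies on a standard property of exactly additive distances: a tour that visits the leaves in circular order traverses every edge twice, so the tree score is half the tour length.

```python
from itertools import permutations


def shortest_circular_tour(dist):
    """Brute-force TSP: return (order, length) of the shortest circular tour.

    Only practical for a handful of sequences; a real TSP solver would be
    needed for larger inputs.
    """
    n = len(dist)
    best_order, best_length = None, float("inf")
    # Fix leaf 0 as the start so rotations of the same tour are not repeated.
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        length = sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))
        if length < best_length:
            best_order, best_length = order, length
    return best_order, best_length


# Hypothetical additive distances for leaves A, B, C, D (indices 0..3),
# computed from the example tree described above (total edge length 15).
DIST = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]

order, tour_length = shortest_circular_tour(DIST)
# Each edge is crossed twice by the circular tour, so halve the tour length.
tree_score = tour_length / 2
```

Here the shortest tour keeps A next to B and C next to D, and the halved tour length recovers the total edge length of the example tree. With noisy distances this relation holds only approximately.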

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Cytochrome P-450 CYP1A1 / classification*
  • Cytochrome P-450 CYP1A1 / genetics
  • Cytochrome P-450 CYP1A2 / classification*
  • Cytochrome P-450 CYP1A2 / genetics
  • Evolution, Molecular*
  • Glycated Hemoglobin / classification*
  • Glycated Hemoglobin / genetics
  • Molecular Sequence Data
  • Phylogeny*
  • Sequence Analysis / methods

Substances

  • Glycated Hemoglobin A
  • Cytochrome P-450 CYP1A1
  • Cytochrome P-450 CYP1A2