Fast optimal leaf ordering for hierarchical clustering

Bioinformatics. 2001:17 Suppl 1:S22-9. doi: 10.1093/bioinformatics/17.suppl_1.s22.

Abstract

We present the first practical algorithm for the optimal linear leaf ordering of trees that are generated by hierarchical clustering. Hierarchical clustering has been extensively used to analyze gene expression data, and we show how optimal leaf ordering can reveal biological structure that is not observed with an existing heuristic ordering method. For a tree with n leaves, there are 2^(n-1) linear orderings consistent with the structure of the tree. Our optimal leaf ordering algorithm runs in time O(n^4), and we present further improvements that make the running time of our algorithm practical.
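
To make the counting argument and the optimization concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a binary tree written as nested tuples with integer leaves, a symmetric distance matrix `dist`, and an objective equal to the sum of distances between adjacent leaves; the function names and the toy data are hypothetical. Each internal node can be flipped independently, which gives the 2^(n-1) consistent orderings, and a naive dynamic program over (leftmost leaf, rightmost leaf) pairs finds the best one without enumerating them. The improvements mentioned in the abstract are not reproduced here.

```python
from itertools import product


def leaves(tree):
    """Return the leaves of a nested-tuple binary tree, left to right."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def consistent_orderings(tree):
    """Enumerate every leaf ordering reachable by flipping internal nodes."""
    if not isinstance(tree, tuple):
        return [[tree]]
    result = []
    for a, b in product(consistent_orderings(tree[0]),
                        consistent_orderings(tree[1])):
        result.append(a + b)  # children in their given order
        result.append(b + a)  # children flipped
    return result


def optimal_leaf_ordering(tree, dist):
    """Return (cost, ordering) minimizing the sum of adjacent-leaf distances."""
    def solve(node):
        # table[(u, w)] = (cost, ordering) of the cheapest ordering of this
        # subtree that starts at leaf u and ends at leaf w.
        if not isinstance(node, tuple):
            return {(node, node): (0.0, [node])}
        left, right = solve(node[0]), solve(node[1])
        table = {}
        # Try both orientations of the two child subtrees.
        for first, second in ((left, right), (right, left)):
            for (u, m), (c1, o1) in first.items():
                for (k, w), (c2, o2) in second.items():
                    # Leaves m and k become adjacent where the subtrees meet.
                    cost = c1 + dist[m][k] + c2
                    if (u, w) not in table or cost < table[(u, w)][0]:
                        table[(u, w)] = (cost, o1 + o2)
        return table

    return min(solve(tree).values(), key=lambda entry: entry[0])


# Toy data (assumed for illustration): four items placed on a line.
positions = [0.0, 4.0, 5.0, 9.0]
dist = [[abs(a - b) for b in positions] for a in positions]
tree = ((1, 0), (2, 3))

n = len(leaves(tree))
best_cost, best_order = optimal_leaf_ordering(tree, dist)
print(len(consistent_orderings(tree)), 2 ** (n - 1))
print(best_cost, best_order)
```

In this toy case the tree's given order 1, 0, 2, 3 costs 13, while flipping the first internal node gives an ordering of cost 9, which the dynamic program finds. Storing whole orderings in the table keeps the sketch short but copies lists; a more careful version would store back-pointers instead.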

MeSH terms

  • Algorithms*
  • Cell Cycle / genetics
  • Cluster Analysis*
  • Computational Biology*
  • Databases, Genetic
  • Gene Expression Profiling / statistics & numerical data
  • Genes, Fungal
  • Multigene Family
  • Saccharomyces cerevisiae / cytology
  • Saccharomyces cerevisiae / genetics