Whole-Genome Discovery of Transcription Factor Binding Sites by Network-Level Conservation

Moshe Pritsker(1), Yir-Chung Liu(1,2), Michael A. Beer(1,2), and Saeed Tavazoie(1,2,3)

  (1) Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544, USA
  (2) The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA

Abstract

Comprehensive identification of DNA cis-regulatory elements is crucial for a predictive understanding of transcriptional network dynamics. Strong evidence suggests that these DNA sequence motifs are highly conserved between related species, reflecting strong selection on the network of regulatory interactions that underlie common cellular behavior. Here, we exploit a systems-level aspect of this conservation—the network-level topology of these interactions—to map transcription factor (TF) binding sites on a genomic scale. Using network-level conservation as a constraint, our algorithm finds 71% of known TF binding sites in the yeast Saccharomyces cerevisiae, using only 12% of the sequence of a phylogenetic neighbor. Most of the novel predicted motifs show strong features of known TF binding sites, such as functional category and/or expression profile coherence of their corresponding genes. Network-level conservation should provide a powerful constraint for the systematic mapping of TF binding sites in the larger genomes of higher eukaryotes.

Footnotes

  • [Supplemental material is available online at www.genome.org.]

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1739204. Article published online before print in December 2003.

  • 3 Corresponding author. E-MAIL tavazoie{at}princeton.edu; FAX (609) 258-1701.

  • Received July 9, 2003.

  • Accepted October 20, 2003.