Evolution of regulatory sequences in 12 Drosophila species

PLoS Genet. 2009 Jan;5(1):e1000330. doi: 10.1371/journal.pgen.1000330. Epub 2009 Jan 9.

Abstract

Characterization of the evolutionary constraints acting on cis-regulatory sequences is crucial to comparative genomics and provides key insights into the evolution of organismal diversity. We study the relationships among orthologous cis-regulatory modules (CRMs) in 12 Drosophila species, especially with respect to the evolution of transcription factor binding sites, and report statistical evidence in favor of key evolutionary hypotheses. Binding sites are found to have position-specific substitution rates. However, the selective forces at different positions of a site do not act independently, and the evidence suggests that constraints on sites are often based on their exact binding affinities. Binding site loss is seen to conform to a molecular clock hypothesis. The rate of site loss is transcription factor-specific and depends on the strength of binding and, in some cases, the presence of other binding sites in close proximity. Our analysis is based on a novel computational method for aligning orthologous CRMs on a tree, which rigorously accounts for alignment uncertainties and exploits binding site predictions through a unified probabilistic framework. Finally, we report weak purifying selection on short deletions, providing important clues about overall spatial constraints on CRMs. Our results present a complex picture of regulatory sequence evolution, with substantial plasticity that depends on a number of factors. The insights gained in this study will help us to understand the combinatorial control of gene regulation and how it evolves. They will pave the way for theoretical models that are cognizant of the important determinants of regulatory sequence evolution and will be critical in genome-wide identification of non-coding sequences under purifying or positive selection.
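As a rough illustration of what a molecular clock for binding site loss implies, the sketch below assumes the simplest possible model: sites are lost at a constant rate, so the fraction of sites retained after divergence time t is exp(-rate * t). This is not the method used in the paper. The function names, the exponential model, the through-origin least-squares fit, and the numbers are all assumptions chosen for illustration.

```python
import math


def estimate_loss_rate(divergence_times, fractions_retained):
    """Fit a constant loss rate (hypothetical helper, not from the paper).

    Under the assumed clock model, -ln(fraction retained) = rate * t,
    so the rate is the least-squares slope of a line through the origin.
    """
    num = 0.0
    den = 0.0
    for t, f in zip(divergence_times, fractions_retained):
        if not 0.0 < f <= 1.0:
            raise ValueError("retained fraction must be in (0, 1]")
        y = -math.log(f)
        num += t * y
        den += t * t
    if den == 0.0:
        raise ValueError("need at least one nonzero divergence time")
    return num / den


def expected_retained(rate, t):
    """Expected fraction of sites still present after divergence time t."""
    return math.exp(-rate * t)


# Synthetic values generated from the assumed model, not data from the study.
times = [0.1, 0.5, 1.0, 2.0]
true_rate = 0.3
fractions = [math.exp(-true_rate * t) for t in times]

rate = estimate_loss_rate(times, fractions)
print(f"estimated loss rate: {rate:.3f}")
```

Under this simplification, a transcription factor-specific loss rate would correspond to fitting a separate rate for each factor. Systematic departures from a straight line in the -ln(fraction retained) versus time plot would count as evidence against a clock-like process.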

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Binding Sites
  • Cluster Analysis
  • Conserved Sequence
  • Drosophila / genetics*
  • Evolution, Molecular
  • Gene Deletion
  • Gene Expression Regulation
  • Genomics / methods*
  • Linear Models
  • Mutagenesis, Insertional
  • Regulatory Sequences, Nucleic Acid / genetics*
  • Transcription Factors / genetics

Substances

  • Transcription Factors