An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions

Nat Genet. 2013 Aug;45(8):891-8. doi: 10.1038/ng.2684. Epub 2013 Jun 30.

Abstract

Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica, Sisymbrium irio and Aethionema arabicum) and their joint analysis with six previously sequenced crucifer genomes. Conservation across orthologous bases suggests that at least 17% of the Arabidopsis thaliana genome is under selection, with nearly one-quarter of the sequence under selection lying outside of coding regions. Much of this sequence can be localized to approximately 90,000 conserved noncoding sequences (CNSs) that show evidence of transcriptional and post-transcriptional regulation. Population genomics analyses of two crucifer species, A. thaliana and Capsella grandiflora, confirm that most of the identified CNSs are evolving under medium to strong purifying selection. Overall, these CNSs highlight both similarities and several key differences between the regulatory DNA of plants and other species.
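
As a rough check on the figures quoted above, the two proportions can be multiplied to estimate the share of the whole A. thaliana genome that is both under selection and noncoding. This is only an illustrative sketch: the abstract gives "at least 17%" (a lower bound) and "nearly one-quarter" (slightly under 0.25), so the inputs below are rounded assumptions, and the function name is hypothetical rather than anything from the paper.

```python
def noncoding_selected_fraction(selected_fraction: float, noncoding_share: float) -> float:
    """Fraction of the whole genome that is under selection AND noncoding.

    selected_fraction: fraction of the genome under selection
    noncoding_share:   fraction of that selected sequence lying outside coding regions
    """
    if not (0.0 <= selected_fraction <= 1.0 and 0.0 <= noncoding_share <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return selected_fraction * noncoding_share


# Assumed rounded inputs taken from the abstract's wording:
# "at least 17%" -> 0.17, "nearly one-quarter" -> 0.25
frac = noncoding_selected_fraction(0.17, 0.25)
print(f"Selected noncoding sequence: roughly {frac:.2%} of the genome")
```

With these rounded inputs the estimate comes to a little over 4% of the genome. Because one input is a lower bound and the other is rounded up, the result is an order-of-magnitude guide and not a value reported by the study.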

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics
  • Brassicaceae / classification
  • Brassicaceae / genetics*
  • Cluster Analysis
  • Computational Biology
  • Conserved Sequence*
  • Evolution, Molecular
  • Gene Deletion
  • Gene Duplication
  • Gene Expression Regulation, Plant
  • Genome, Plant
  • Genomics
  • High-Throughput Nucleotide Sequencing
  • Molecular Sequence Annotation
  • Nucleotide Motifs
  • Phylogeny
  • Regulatory Sequences, Nucleic Acid*
  • Selection, Genetic