The transcription start site landscape of C. elegans

  1. Shinichi Morishita1,8
  1. 1Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa 277-0882, Japan;
  2. 2Department of Laboratory Medicine, Faculty of Medicine, Kanazawa University, Kanazawa, 920-8641 Japan;
  3. 3Department of Pathology, School of Medicine, Stanford University, Stanford, California 94305-5324, USA;
  4. 4Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado 80309-0347, USA;
  5. 5Departments of Pathology and Genetics, School of Medicine, Stanford University, Stanford, California 94305-5324, USA
    • Present addresses: 6Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ 08854, USA; 7Division of Medical Oncology, Department of Medicine, University of Colorado School of Medicine, Aurora, CO 80045, USA.

    Abstract

    More than half of Caenorhabditis elegans pre-mRNAs lose their original 5′ ends in a process termed “trans-splicing,” in which the RNA extending from the transcription start site (TSS) to the site of trans-splicing of the primary transcript, termed the “outron,” is replaced with a 22-nt spliced leader. This complicates the mapping of TSSs, leading to a lack of available TSS mapping data for these genes. We used growth at low temperature and nuclear isolation to enrich for transcripts still containing outrons, applying a modified SAGE capture procedure and high-throughput sequencing to characterize 5′ termini in this transcript population. We report from these data both a landscape of 5′-end utilization for C. elegans and a representative collection of TSSs for 7351 trans-spliced genes. TSS distributions for individual genes were often dispersed, with a greater average number of TSSs for trans-spliced genes, suggesting that trans-splicing may remove selective pressure for a single TSS. Upstream of newly defined TSSs, we observed well-known motifs (including TATAA-box and SP1) as well as novel motifs. Several of these motifs showed association with tissue-specific expression and/or conservation among six worm species. Comparing TSS features between trans-spliced and non-trans-spliced genes, we found stronger signals among outron TSSs for preferential positioning of flanking nucleosomes and for downstream Pol II enrichment. Our data provide an enabling resource for both experimental and theoretical analysis of gene structure and function in C. elegans.

    Footnotes

    • 8 Corresponding authors

      E-mail afire{at}stanford.edu

      E-mail moris{at}cb.k.u-tokyo.ac.jp

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.151571.112.

      Freely available online through the Genome Research Open Access option.

    • Received November 2, 2012.
    • Accepted April 18, 2013.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), as described at http://creativecommons.org/licenses/by-nc/3.0/.
