RNA editing in the human ENCODE RNA-seq data

  Ali Mortazavi (1, 2, 5)

  1. Department of Developmental and Cell Biology, University of California Irvine, Irvine, California 92697, USA;
  2. Center for Complex Biological Systems, University of California Irvine, Irvine, California 92697, USA;
  3. Division of Biology, California Institute of Technology, Pasadena, California 91125, USA;
  4. Beckman Institute, California Institute of Technology, Pasadena, California 91125, USA

    Abstract

    RNA-seq data can be mined for sequence differences relative to the reference genome to identify both genomic SNPs and RNA editing events. We analyzed the long, polyA-selected, unstranded, deeply sequenced RNA-seq data from the ENCODE Project across 14 human cell lines for candidate RNA editing events. On average, 43% of the RNA sequencing variants that are not in dbSNP and are within gene boundaries are A-to-G(I) RNA editing candidates. The vast majority of A-to-G(I) edits are located in introns and 3′ UTRs, with only 123 located in protein-coding sequence. In contrast, the majority of non–A-to-G variants (60%–80%) map near exon boundaries and have the characteristics of splice-mapping artifacts. After filtering out all candidates with evidence of private genomic variation using genome resequencing or ChIP-seq data, we find that up to 85% of the high-confidence RNA variants are A-to-G(I) editing candidates. Genes with A-to-G(I) edits are enriched in Gene Ontology terms involving cell division, viral defense, and translation. The distribution and character of the remaining non–A-to-G variants closely resemble known SNPs. Unlike RNA editing in the brain, we find no A-to-G(I) edits resulting in nonsynonymous substitutions that are reproducible across all three lymphoblastoid cell lines in our study. That only a fraction of sites are reproducibly edited in multiple cell lines, together with the stronger association we find between editing and specific genes, suggests that the editing of the transcript is more important than the editing of any individual site.
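
The classification step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes variants have already been called against the reference genome, that known SNP positions (for example from dbSNP) are available as a set, and that each variant carries the strand of its overlapping gene from an annotation. Because the data are unstranded, an A-to-G(I) edit in a minus-strand transcript appears as T-to-C in forward-strand coordinates. The names `Variant`, `transcript_change`, `is_a_to_g`, and `a_to_g_fraction` are hypothetical.

```python
from typing import Iterable, NamedTuple, Set, Tuple

# Base complements, used to express a forward-strand mismatch
# in the orientation of a minus-strand transcript.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str          # reference base, genome forward strand
    alt: str          # base observed in RNA-seq reads, forward strand
    gene_strand: str  # "+" or "-", taken from gene annotation (assumption)


def transcript_change(v: Variant) -> Tuple[str, str]:
    """Return the (ref, alt) pair in transcript orientation."""
    if v.gene_strand == "-":
        return COMPLEMENT[v.ref], COMPLEMENT[v.alt]
    return v.ref, v.alt


def is_a_to_g(v: Variant) -> bool:
    """True if the mismatch reads as A-to-G on the transcribed strand."""
    return transcript_change(v) == ("A", "G")


def a_to_g_fraction(variants: Iterable[Variant],
                    known_snps: Set[Tuple[str, int]]) -> float:
    """Fraction of variants absent from known_snps that are A-to-G candidates."""
    novel = [v for v in variants if (v.chrom, v.pos) not in known_snps]
    if not novel:
        return 0.0
    return sum(is_a_to_g(v) for v in novel) / len(novel)
```

A real analysis would also need the further filters the abstract mentions, such as removing variants near exon boundaries and those with evidence of private genomic variation; this sketch covers only the known-SNP exclusion and the strand-aware A-to-G test.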

    Footnotes

    • 5 Corresponding author

      E-mail ali.mortazavi{at}uci.edu

    • [Supplemental material is available for this article.]

    • Article and supplemental material are at http://www.genome.org/cgi/doi/10.1101/gr.134957.111.

      Freely available online through the Genome Research Open Access option.

    • Received November 16, 2011.
    • Accepted May 1, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
