Theory on the coupled stochastic dynamics of transcription and splice-site recognition

PLoS Comput Biol. 2012;8(11):e1002747. doi: 10.1371/journal.pcbi.1002747. Epub 2012 Nov 1.

Abstract

Eukaryotic genes are typically split into exons that need to be spliced together to form the mature mRNA. The splicing process depends on the dynamics and interactions among transcription by the RNA polymerase II complex (RNAPII) and the spliceosomal complex consisting of multiple small nuclear ribonucleoproteins (snRNPs). Here we propose a biophysically plausible initial theory of splicing that aims to explain the effects of the stochastic dynamics of snRNPs on the splicing patterns of eukaryotic genes. We consider two different ways to model the dynamics of snRNPs: pure three-dimensional diffusion and a combination of three- and one-dimensional diffusion along the emerging pre-mRNA. Our theoretical analysis shows that there exists an optimum position of the splice sites on the growing pre-mRNA at which the time required for snRNPs to find the 5' donor site is minimized. The minimization of the overall search time is achieved mainly via the increase in non-specific interactions between the snRNPs and the growing pre-mRNA. The theory further predicts that there exists an optimum transcript length that maximizes the probabilities for exons to interact with the snRNPs. We evaluate these theoretical predictions by considering human and mouse exon microarray data as well as RNAseq data from multiple different tissues. We observe that there is a broad optimum position of splice sites on the growing pre-mRNA and an optimum transcript length, which are roughly consistent with the theoretical predictions. The theoretical and experimental analyses suggest that there is a strong interaction between the dynamics of RNAPII and the stochastic nature of snRNP search for 5' donor splicing sites.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Computational Biology / methods*
  • Computer Simulation
  • Gene Expression Profiling
  • Humans
  • Introns
  • Mice
  • Models, Genetic*
  • Oligonucleotide Array Sequence Analysis
  • RNA Precursors / genetics
  • RNA Splice Sites*
  • RNA Splicing
  • Reproducibility of Results
  • Ribonucleoproteins, Small Nuclear / genetics
  • Stochastic Processes
  • Transcription, Genetic*

Substances

  • RNA Precursors
  • RNA Splice Sites
  • Ribonucleoproteins, Small Nuclear