TY  - JOUR
T1  - Bayesian inference and comparison of transcription elongation stochastic models
JF  - bioRxiv
DO  - 10.1101/499277
SP  - 499277
AU  - Jordan Douglas
AU  - Richard Kingston
AU  - Alexei J. Drummond
Y1  - 2018/01/01
UR  - http://biorxiv.org/content/early/2018/12/17/499277.abstract
N2  - Transcription elongation can be modelled as a three-step process, involving polymerase translocation, NTP binding, and nucleotide incorporation into the nascent mRNA. This cycle of events can be simulated at the single-molecule level as a continuous-time Markov process using parameters derived from single-molecule experiments. Previously developed models differ in the way they are parameterised, and in their incorporation of partial equilibrium approximations. We have formulated a hierarchical network comprising 12 sequence-dependent transcription elongation models. The simplest model has two parameters and assumes that both translocation and NTP binding can be modelled as equilibrium processes. The most complex model has six parameters and makes no partial equilibrium assumptions. We systematically compared the ability of these models to explain published force-velocity data, using approximate Bayesian computation. This analysis was performed using data for the RNA polymerases of E. coli, S. cerevisiae and bacteriophage T7. Our analysis indicates that the polymerases differ significantly in their translocation rates, with the rates in T7 pol being fast compared to E. coli RNAP and S. cerevisiae pol II. Different models are applicable in different cases. We also show that all three RNA polymerases have an energetic preference for the posttranslocated state over the pretranslocated state. A Bayesian inference and model selection framework, like the one presented in this publication, should be routinely applicable to the interrogation of single-molecule datasets. Author summary: Transcription is a critical biological process which occurs in all living organisms. It involves copying the organism’s genetic material into messenger RNA (mRNA), which directs protein synthesis on the ribosome. Transcription is performed by RNA polymerases, which have been extensively studied using both ensemble and single-molecule techniques (see reviews: [1, 2]). Single-molecule data provide unique insights into the molecular behaviour of RNA polymerase. Transcription at the single-molecule level can be computationally simulated as a continuous-time Markov process and the model outputs compared with experimental data. In this study we use Bayesian techniques to perform a systematic comparison of 12 stochastic models of transcriptional elongation. We demonstrate how equilibrium approximations can strengthen or weaken the model, and show how Bayesian techniques can identify necessary or unnecessary model parameters. We describe publicly accessible and open-source software that can a) simulate, b) perform inference on, and c) compare models of transcription elongation.
ER  - 