Reparametrizing the Sigmoid Model of Gene Regulation for Bayesian Inference

doi: https://doi.org/10.1101/352070
Martin Modrák
Institute of Microbiology of the Czech Academy of Sciences, Prague, Czech Republic
  • For correspondence: martin.modrak@biomed.cas.cz

Abstract

This poster describes a novel reparametrization of a frequently used non-linear ordinary differential equation (ODE) model of gene regulation. We show that in its commonly used form, the model cannot reliably distinguish between parameter combinations that differ both quantitatively and qualitatively. The proposed reparametrization makes inference over the model stable and amenable to fully Bayesian treatment with state-of-the-art Hamiltonian Monte Carlo methods.

Complete source code and a more detailed explanation of the model are available at https://github.com/cas-bioinf/genexpi-stan.

1 Introduction

Transcriptional regulation has historically been treated with a vast range of models, from Boolean networks to full Michaelis-Menten simulation of individual chemical reactions. A middle ground is occupied by linear and non-linear ODE models that approximate Michaelis-Menten kinetics. In our attempts to reimplement a Bayesian version of a popular sigmoid ODE model [1], we noticed that very different parameter values, including different signs of the regulatory effect, may result in almost identical behavior. This introduces computational difficulties and calls into question the interpretation of previous results achieved with the model. Here we describe a novel reparametrization of the model, implemented in the Stan probabilistic programming language [2], that mitigates these problems.

2 The Model

The ODE model we use is based on [3]. For a bacterial regulon with a single regulator y and targets x1…xn the model takes the form

\frac{dx_i}{dt} = \frac{s_i}{1 + e^{-(w_i y + b_i)}} - d_i x_i \qquad (1)

where si, wi, bi and di are parameters to be fit. While bacterial regulons are our primary interest, the model also works with multiple regulators. The model assumes that both the regulator and the targets are measured at multiple time points, letting us solve the ODE. When using microarray data we assume normal observation noise whose standard deviation has two components: one constant and one proportional to the expression. The regulator is fit using B-splines, with the spline coefficients also treated as parameters of the model. This lets us both handle uncertainty in the regulator measurements and obtain the regulator at the arbitrarily fine time resolution required for solving the ODE numerically.

We note that the model is only weakly identifiable in its direct form (1), as very different parameter sets can result in similar solutions for xi. Weak identifiability poses computational problems for Bayesian inference, reduces the stability of maximum-likelihood estimates and limits the interpretability of the model results. Critically, the sign of wi (whether the regulator is an activator or a repressor) is not well determined for certain target profiles (Fig. 1a), and it may be impossible to determine whether wi ≃ 0 (Fig. 1b). A negligible effect on xi can also be observed under linear transformations of (wi, bi) when |wi y + bi| ≫ 0 (Fig. 1c) and under approximately linear transformations of (si, di) (Fig. 1d).
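The saturation effect described for (wi, bi) can be illustrated with a minimal Python sketch. It is not the Stan implementation from the repository: the regulator profile, the decay d = 0.5, the noise constants and the fixed-step Runge-Kutta integrator are all assumptions made for illustration, and the noise standard deviation is assumed to be the sum of a constant and a proportional term. Only the (w, b) pairs and s = 1 are taken from the Fig. 1c caption.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def regulator(t):
    """Hypothetical regulator profile y(t), oscillating between 2 and 3."""
    return 2.5 + 0.5 * math.sin(t)

def rhs(t, x, w, b, s, d):
    """Right-hand side of model (1): sigmoid synthesis minus linear decay."""
    return s * sigmoid(w * regulator(t) + b) - d * x

def solve(w, b, s, d, x0=0.0, t_end=10.0, dt=0.01):
    """Integrate model (1) with classical RK4; returns x at every time step."""
    n_steps = int(round(t_end / dt))
    x = x0
    xs = [x0]
    for i in range(n_steps):
        t = i * dt
        k1 = rhs(t, x, w, b, s, d)
        k2 = rhs(t + dt / 2, x + dt / 2 * k1, w, b, s, d)
        k3 = rhs(t + dt / 2, x + dt / 2 * k2, w, b, s, d)
        k4 = rhs(t + dt, x + dt * k3, w, b, s, d)
        x += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        xs.append(x)
    return xs

def log_lik(observed, predicted, sigma_abs=0.1, sigma_rel=0.1):
    """Normal log-likelihood; sd = constant part + part proportional to expression
    (the additive combination and both constants are assumptions)."""
    total = 0.0
    for obs, pred in zip(observed, predicted):
        sigma = sigma_abs + sigma_rel * pred
        z = (obs - pred) / sigma
        total += -math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
    return total

S, D = 1.0, 0.5  # s from the Fig. 1c caption; d is a hypothetical value
traj_a = solve(5.0, -3.0, S, D)
traj_b = solve(10.0, -6.0, S, D)
traj_c = solve(50.0, -30.0, S, D)

# Largest pointwise distance between the three solutions.
max_gap = max(max(v) - min(v) for v in zip(traj_a, traj_b, traj_c))

# Noise-free "measurements" taken from the first solution once per time unit.
observed = traj_a[::100]
ll_a = log_lik(observed, traj_a[::100])
ll_c = log_lik(observed, traj_c[::100])

print("max gap between solutions:", max_gap)
print("log-likelihood difference:", ll_a - ll_c)
```

With this regulator the input wi y + bi never drops below 7, so the sigmoid stays saturated and a tenfold change in (w, b) leaves both the solution and the likelihood almost unchanged. A regulator profile that crosses the point where wi y + bi = 0 would separate the three parameter sets.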

Fig. 1.

Simulated examples of weakly identified parameters in the sigmoid model. Expression of the regulator (red, solid line) and multiple significantly different parameter values that give rise to similar solutions of the ODE (dashed lines) are shown together with possible measured target profiles (dots) that are insufficient to distinguish between the solutions. a) (w, b, s, d) ∈ {(5, −1, 3, 2), (−5, 10, 2.5, 1.4)}; b) (w, b, s, d) ∈ {(5, −1, 3, 3), (0, 0, 6, 3.4), (0, 0, 16, 10)}; c) s = 1, [Embedded Image] and (w, b) ∈ {(5, −3), (10, −6), (50, −30)}; d) w = 1, b = −1 and (s, d) ∈ {(10, 5), (19, 10), (180, 100)}.

The aforementioned identifiability problems can be mitigated by a) fixing Ii = sgn(wi) and running separate fits when both signs have to be tested and b) replacing wi, bi and si with [Embedded Image], the mean of the regulatory input, [Embedded Image], the standard deviation of the regulatory input, and ai, the normalized ratio of si to di. The original parameters can then be recovered as: [Embedded Image] [Embedded Image] [Embedded Image] where E and sd denote the sample mean and standard deviation and [Embedded Image] is the observed expression of gene i.
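The mapping between (wi, bi) and the mean and standard deviation of the regulatory input can be sketched in Python from the definitions alone. This is an illustration under stated assumptions, not the poster's formulas: the regulatory input is assumed to be wi y + bi evaluated at the measured regulator values, the function names and the regulator values are hypothetical, and the recovery of si from ai is omitted because its normalization is not given in the text.

```python
import statistics

def to_regulatory_input(w, b, y):
    """Map (w, b) to the sign of w and the sample mean and sd of w*y + b."""
    inputs = [w * v + b for v in y]
    sign = 1 if w >= 0 else -1
    return sign, statistics.mean(inputs), statistics.stdev(inputs)

def to_original(sign, mean_input, sd_input, y):
    """Recover (w, b) from the fixed sign and the mean and sd of the input.

    Uses sd(w*y + b) = |w| * sd(y) and E(w*y + b) = w * E(y) + b.
    """
    w = sign * sd_input / statistics.stdev(y)
    b = mean_input - w * statistics.mean(y)
    return w, b

# Hypothetical regulator measurements.
y_observed = [0.2, 0.5, 1.1, 1.8, 2.4, 2.0, 1.3]

# Round trip for a repressor with (w, b) = (-5, 10).
sign, mu, sigma = to_regulatory_input(-5.0, 10.0, y_observed)
w_back, b_back = to_original(sign, mu, sigma, y_observed)
print(sign, mu, sigma, w_back, b_back)
```

Because the sign is fixed separately, the standard deviation of the input only has to encode the magnitude of wi, which is why separate fits are needed when both signs have to be tested.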

To our knowledge, this is the first time these identifiability issues have been reported and handled for the sigmoid ODE model. Caution should therefore be exercised when interpreting the parameter values found in previous works using this model.

3 Workflow and Remarks

A model fit is in itself insufficient to determine whether a regulation is plausible. To assess plausibility, we test whether the model improves over two baselines: a) constant synthesis (wi = 0) and b) a model where the regulator spline is not constrained by the regulator measurements. We use the LOO-IC criterion [4] to compare the models.
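The comparison step can be sketched in Python as follows, assuming the pointwise leave-one-out log predictive densities have already been estimated for each model (for example by the loo package). The sketch does not implement that estimation, the numbers are made up for illustration, and the function names are hypothetical. It uses the usual conventions that LOO-IC is −2 times the summed pointwise values and that the standard error of a difference is computed from the pointwise differences.

```python
import math
import statistics

def looic(pointwise_elpd):
    """LOO information criterion: -2 times the summed pointwise elpd."""
    return -2.0 * sum(pointwise_elpd)

def compare_elpd(pointwise_a, pointwise_b):
    """Difference in expected log predictive density (a minus b) and its SE."""
    diffs = [a - b for a, b in zip(pointwise_a, pointwise_b)]
    elpd_diff = sum(diffs)
    se_diff = math.sqrt(len(diffs) * statistics.variance(diffs))
    return elpd_diff, se_diff

# Made-up pointwise values for four observations of one target gene.
regulated = [-1.0, -1.2, -0.8, -1.1]   # model with the regulator
constant = [-1.5, -1.9, -1.0, -1.6]    # baseline a): constant synthesis, w = 0

elpd_diff, se_diff = compare_elpd(regulated, constant)
print("LOO-IC regulated:", looic(regulated))
print("LOO-IC constant: ", looic(constant))
print("elpd difference:", elpd_diff, "+/-", se_diff)
```

A lower LOO-IC (equivalently, a positive difference here) favors the regulated model. How large the difference must be relative to its standard error before a regulation is called plausible is a judgment the sketch leaves open.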

When some regulations are already known, the model can be trained by fitting those known regulations first. The posterior estimates of the regulator expression can then be used to decrease uncertainty when fitting putative novel targets. Performance-wise, the training phase takes several minutes, and fitting 881 putative targets individually on an 8-core machine takes 1 hour, which promises faster performance than [1].

Footnotes

  • * Supported by C4Sys research infrastructure (MEYS project No: LM20150055).

References

  1. Titsias, M. et al.: Identifying targets of multiple co-regulating transcription factors from expression time-series by Bayesian model comparison. BMC Systems Biology 6:53 (2012).
  2. Carpenter, B. et al.: Stan: A probabilistic programming language. Journal of Statistical Software 76(1) (2017).
  3. Vohradský, J.: Neural Model of the Genetic Network. Journal of Biological Chemistry 276(39), 36168–36173 (2001).
  4. Vehtari, A. et al.: loo: Efficient leave-one-out cross-validation and WAIC for Bayesian models, https://cran.r-project.org/package=loo (2018).
Posted June 22, 2018.
bioRxiv 352070; doi: https://doi.org/10.1101/352070