Theory on the looping mediated stochastic propulsion of transcription factors along DNA

R. Murugan

doi:10.1101/418947

ABSTRACT

We demonstrate that DNA-loops can stochastically propel the site-specifically bound transcription factors towards the promoters. The gradual release of elastic energy stored on the DNA-loops is the source of propulsion. The speed of looping mediated interaction of transcription factors with promoters is several times faster than the sliding mode. Elastic and entropic energy barriers associated with the looping of DNA actually shape up the distribution of distances between transcription factor binding sites and promoters. The commonly observed multiprotein binding in gene regulation is acquired through evolution to overcome the looping energy barrier. Presence of nucleosomes on the genomic DNA of eukaryotes is required to reduce the entropy barriers associated with the looping.

INTRODUCTION

Site-specific binding of transcription factors (TFs) at their cis-regulatory motifs (CRMs) on the genomic DNA in the presence of an enormous number of nonspecific binding sites is essential for the activation and regulation of several genes across prokaryotes and eukaryotes (1–3). Binding of TFs with their CRMs was initially thought of as a single-step three-dimensional (3D) diffusion-controlled collision process. Kinetic experiments on the lac-repressor-Operator system revealed a bimolecular rate on the order of ~10⁹-10¹⁰ M⁻¹s⁻¹, which is ~10-10² times faster than the Smoluchowski type 3D diffusion-controlled rate limit. Berg et al. (4, 5) successfully explained this inconsistency using a two-step mechanism by establishing the key concept that TFs first bind with DNA in a nonspecific manner via 3D diffusion and then search for their cognate sites via various one-dimensional (1D) facilitating processes such as sliding, hopping and intersegmental transfers. Here 1D diffusion of TFs with a unit base-pair step-size is called sliding, a step-size of a few base-pairs (bp, 1 bp = l_d ~ 3.4 × 10⁻¹⁰ m) is called hopping and a step-size of a few hundred to a thousand bp is called intersegmental transfer. Intersegmental transfers occur whenever two distal segments of the same DNA polymer come close together in 3D space via ring closure events (6–8).

Specific binding of TFs with DNA is affected by several factors (8) viz. a) conformational state of DNA (8, 9) b) spatial organization of various functionally related combinatorial CRMs along the genomic DNA (10, 11), c) presence of similar or other dynamic roadblock proteins (12) and semi-stationary roadblocks such as nucleosomes in eukaryotes (13–17), d) naturally occurring sequence mediated kinetic traps on DNA (18, 19), e) conformational fluctuations in the DNA binding domains of TFs (20–22) and f) the nonspecific electrostatic attractive forces and the counteracting shielding effects of other solvent ions and water molecules acting at the DNA-protein interface (23). Several theoretical models (7, 8, 18, 21, 24), computational (25–28) and experimental studies have been carried out to understand the effects of factors a-f on the kinetics of site-specific DNA-protein interactions.

In general, the searching efficiency of TFs depends on the relative amount of times spent by them on the 3D and 1D diffusions (10, 21). Clearly, neither pure 1D nor 3D diffusion is an efficient mode of searching (8, 10). Under ideal situation, maximum searching efficiency can be achieved only when TFs spend equal amount of times in both 1D and 3D diffusions (7, 10). This trade off balance between the times spent on different modes of diffusions will be modulated by the factors a-f. For example, presence of nucleosome roadblocks warrants more dissociations and 3D excursions of TFs rather than 1D sliding (29). Sequence specific fast conformational switching of DNA binding domains between stationary and mobile states helps TFs to overcome the sequence traps (29). Relaxed conformational state of DNA enhances more sliding rather than hopping and intersegmental transfers and so on (8). Conformational dynamics of DNA also modulates the speed of gene activation and regulation. In this context, looping of DNA is critical for the activation and expression of various genes across prokaryotes to eukaryotes (3, 30–34). Combinatorial binding of TFs with their specific CRMs on the genomic DNA activates the downstream promoters of genes via looping of the intervening DNA segment to form a synaptosome type complex (1, 35). In most of the molecular biological processes, DNA-loops are warranted for the precise protein-protein interactions which are the prerequisites for transcription and recombination (36).

The statistical mechanics of looping and cyclization of linear DNA has been studied extensively in the literature (33, 37, 38). However, it is still not clear why DNA-loops have evolved as an integral part of the activation and repression of transcription and recombination although such underlying site-specific protein-protein and protein-DNA interactions can also be catered straightforwardly via a combination of 1D and 3D diffusions of TFs (4, 5, 21, 39). That is to say, upon arrival at the CRMs, TFs can directly slide or hop along the DNA polymer to reach the promoters. Schleif (31) had argued that the looping of DNA can simplify the evolution of the genomic architecture of eukaryotes by not imposing strict conditions on the spacing between the TF binding sites and the promoters. This is logical since a given set of TFs need to regulate several different genes across the genome. Therefore, placement of TF binding sites near a specific gene can be a disadvantage for other genes along the genomic evolution. Similarly, placement of TF binding sites near every gene is not an efficient genome design. The DNA loops also play critical roles in the transcription bursting (40) and memory (41). It is not clear how exactly the DNA-loop is formed between the CRMs and promoters via TFs though Rippe et.al., (32) had already taken several snapshots of the looping intermediates. In this paper, we will show that the DNA-looping combined with an asymmetric binding energy profile can stochastically propel TFs towards the promoters along DNA. Using computational tools, we further demonstrate that the looping mediated propulsion or tethered sliding of TFs along DNA can actually help in finding the direction of the promoter region and also shape up the genomic architecture.

THEORETICAL FORMULATION

Let us first list out the basic facts observed on the mechanism of distal action of the CRMs-TFs system on the downstream promoters in the process of transcription activation. Firstly, both theoretical investigations (4, 5, 7, 8, 21) and experimental observations (42, 43) suggest that TFs recognize their CRMs via a combination of 1D and 3D diffusions. The key idea here is that TFs scan a random piece of DNA via 1D diffusion after each of the 3D diffusion mediated nonspecific collisions (4, 8, 21). On the contrary, the reacting molecules dissociate immediately upon each of their unfruitful collisions in the standard Smoluchowski model. When the dynamics of TFs is confined within the Onsager radius of the DNA-protein interface, it is categorized as 1D diffusion. When TFs escape out of the Onsager radius and perform free 3D excursions, we classify it as 3D diffusion (8). The Onsager radius connected with the DNA-protein interface is defined (8) as the distance between the positively charged DNA binding domains of TFs and the negatively charged phosphate backbone of DNA at which the overall electrostatic energy is the same as the background thermal energy (equal to ~1 k_BT) (Section 1, Supporting Materials). Secondly, transcription activation is achieved upon the distal communication of the CRMs-TFs complex with the RNAP-promoter complex (1–3). Thirdly, binding of TFs at CRMs locally bends the DNA and the DNA-loops connecting CRMs-TFs with the promoters are observed in most of the transcriptionally active genes of eukaryotes (3, 44).

Clearly, TFs activate transcription via two sequential steps viz. they bind their CRMs in the first step and then distally communicate with the promoter-RNAP complex in the second step to initiate the transcription event. To understand the role of DNA-loops in the transcription activation, we consider two possible scenarios viz. looping mediated versus a hypothetical pure 3D1D diffusion mediated distal communication between the CRMs-TFs and the promoters. In both these scenarios, TFs locate their respective CRMs via a combination of 1D and 3D diffusion in the first step. They differ only in the second step where TFs dissociate from their CRMs and communicate with the promoters via a combination of 1D and 3D diffusions in the second case whereas the distal communication will be through the DNA-loops in the first case. We denote the search time required by TFs to locate their CRMs in the first step of transcription activation as τ_S. Clearly, those factors a-f listed out in the introduction section significantly modulate this quantity. We will not recalculate this here since enormous amount of literature already exists (see Section 1 of the Supporting Materials) on the derivation of this quantity under various conditions (8, 18, 21, 45). In the following sections, we will compute the mean time required by CRMs-TFs complex to communicate with the promoter via DNA-loops in the second step of the transcription activation.

Preliminary assumptions

Upon observing the open synaptic complexes of transcriptionally active genes of eukaryotes with DNA loops, one can conclude that TFs which activate transcription via DNA-loops have at least two different DNA binding domains (DBDs) viz. one corresponds to the CRM (DBD1) (Fig. 1) and the another corresponds to the binding site that is located proximal to the promoter (DBD2) region. For example, the tetrameric Lac I complex binds two different Operator regions that induces looping of DNA (2, 3). However, in this case the tetramers of repressor molecules bound at these two different binding sites communicate via protein-protein interactions among them. The DNA-loop is stabilized by an octamer form of the Lac I repressor protein. Such mechanisms are common in case of multiprotein mediated DNA-looping and transcription activation. We further assume that TF reaches its specific binding site in the first step via a combination of 3D and 1D diffusions (4, 5, 7, 21, 46) in line with two-step DNA-protein interaction model and subsequently bends the DNA upon site-specifically binding their CRMs (32, 33).

FIGURE 1.

A. Looping mediated stochastic propulsion of TF with radius of gyration of r_P along DNA. Here TF has two binding sites corresponding to viz. its cis-regulatory module (DBD1) located between S1 and S2 and the promoter (DBD2). Binding of TF with its specific site (that spans for a length of X₀ from S1 (X = 0) to S2 (X=X₀)) bends the DNA segment into a loop around it such that X₀ = 2πr_P. The bending energy stored in the site-specific complex will be incrementally released via bulging of DNA around the TF. B. When the binding energy near S1 is stronger than S2, then the TF can be stochastically propelled towards the promoter (P) that is located at L. Upon reaching there, DBD2 of TF interacts with the promoter to form a specific synaptic complex. C. DNA-loop configuration utilized for gene silencing. D. Synaptosome where TF is bound with both its specific binding site and the promoter via DNA-loop.

Energetics of the site-specific binding of TFs and bending of DNA

Let us assume that the radius of gyration of the TF of interest is r_P. Upon binding its cognate stretch of DNA with size of X₀ bp located in between S1 and S2, the TF bends the DNA segment into a circle around its spherical solvent shell surface such that X₀ = 2πr_P as shown in Fig. 1A. We set X = 0 at S1 and X = X₀ at S2 where X is the current location of the DBD2 of TF on DNA. S1 is the specific site for DBD1 and P is the specific site for DBD2 by definition. Here the DNA under consideration spans over the range (0, L) as in Fig. 1B and X is the current loop-length. The total energy required to bend a linear DNA will be the sum E_bend = E_elastic + E_entropy. For the radius of curvature r_P, one finds that E_elastic ≃ aX₀/(2r_P²) (measured in k_BT units) where a is the persistence length of DNA (37, 47). Clearly, E_elastic required to bend the DNA segment of length X into a circle will be E_elastic ≃ 2π²a/X. This energy has to be derived either solely from the site-specific binding energy of TFs or via an external energy input in the form of ATP hydrolysis (48). Noting that E_entropy ≃ (3/2) ln(πX/6) (Eq. A1 of Appendix A) one finally arrives at the following expression for the overall bending energy.

$$E_{\mathrm{bend}} \simeq \frac{2\pi^2 a}{X} + \frac{3}{2}\ln\left(\frac{\pi X}{6}\right) \tag{1}$$

Clearly, E_bend attains a minimum value E_min = (3/2)[1 + ln(2π³a/9)] at X_C = 4π²a/3. In the later sections, we will show that this non-monotonic behavior of the bending energy profile will restrict the possible distances between the CRMs and their corresponding promoters.
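As a quick numerical check of the bending energy profile E_bend ≃ 2π²a/X + (3/2) ln(πX/6), the following Python sketch evaluates its two components and the location of the minimum. The function names and the default a = 150 bp are illustrative choices made here and are not part of the original analysis.

```python
import math

def elastic_energy(x_bp, a_bp=150.0):
    """Elastic energy (in k_BT) needed to bend x_bp of DNA into a circle."""
    return 2.0 * math.pi ** 2 * a_bp / x_bp

def entropic_energy(x_bp):
    """Entropic cost (in k_BT) of closing a loop of x_bp."""
    return 1.5 * math.log(math.pi * x_bp / 6.0)

def bending_energy(x_bp, a_bp=150.0):
    """Total bending energy E_bend = E_elastic + E_entropy (in k_BT)."""
    return elastic_energy(x_bp, a_bp) + entropic_energy(x_bp)

def critical_loop_length(a_bp=150.0):
    """Loop length X_C = 4*pi^2*a/3 at which dE_bend/dX = 0."""
    return 4.0 * math.pi ** 2 * a_bp / 3.0

x_c = critical_loop_length()
e_min = bending_energy(x_c)
print(f"X_C = {x_c:.0f} bp, E_min = {e_min:.1f} k_BT")
```

For a = 150 bp this gives X_C ≈ 1974 bp and E_min ≈ 11.9 k_BT. The elastic term dominates below X_C and the entropic term dominates above it.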

Looping mediated communication between CRMs-TFs and promoters

When TFs bind their CRMs in the first step of transcription activation, the site-specific binding energy (E_bind) released at the DNA-TF interface dissipates partially as the elastic energy required to bend the DNA chain (E_elastic), partially to form specific non-covalent bonds (E_bond, the enthalpic component) and partially as the energy required to compensate the chain entropy loss (E_entropy) at the specific binding site. Clearly, E_bind = E_bond + E_bend where E_bend = E_elastic + E_entropy. Therefore, the overall free energy stored by the site-specific CRM-TF complex is given by E ≃ E_bond + E_elastic. This is the overall potential energy barrier which acts on any kind of distortion or dissociation of the site-specific CRMs-TFs complex. Conversely, E_bend is the potential energy barrier that resists the formation of loops out of linear DNA.

The free-energy stored in the site-specific DNA-TF complex (E) can undergo four different modes of dissipation viz. 1) thermally induced physical dissociation of TF from DNA in which both bonding and elastic energies dissipate into the heat bath along with an increase in the chain entropy, 2) physical dissociation of only DBD2 from S2 and its re-association somewhere via looping over 3D space (which is resisted by the loop-length dependent potential energy barrier E_bend) while S1-DBD1 is still intact as modelled by Shvets and Kolomeisky (49), 3) stochastic propulsion of TF on DNA via sliding of DBD2 towards the promoter which can be achieved by a gradual increase in the value of X from X₀ towards L and 4) tethered sliding of DBD2 with intact DNA-loop and DBD1-S1 interactions. In the propulsion mechanism, it is mainly the elastic energy that dissipates, which causes bulging of the DNA-loop around TF. The chain entropy does not increase much here since the intervening DNA is still under loop conformation. This is similar to the sliding of nucleosomes via bulge induced reptation dynamics of DNA (29, 50, 51). The probability associated with the spontaneous dissociation will be inversely correlated with E_bond and positively correlated with E_elastic. Generally, dissociation will be an endothermic process since E_bond > E_elastic. Clearly, physical dissociation will not be the most probable route of dissipation of the energy stored in the site-specific DNA-TF complex.

With this background, the DBD2 of TF needs to distally interact with the promoter in the second step and activate the transcription via looping of the intervening DNA segment that connects the CRMs and the promoter. There are two different possibilities viz. tethered sliding of DBD2 of TF with intact DBD1-S1 and a stochastic propulsion of TFs with intact DBD1-S1. Shvets and Kolomeisky (49) have recently studied another interesting model with repeated binding-unbinding of DBD2 with intact DBD1-S1. However, in their model sliding of DBD2 of TF was not allowed. All the symbols used in this paper are listed in Table S1 of the Supporting Material. In the following sections, we will develop our stochastic propulsion and tethered sliding models in detail.

Stochastic propulsion model

When the binding energy profile of TF is such that the bonding energy near S1 is much higher than that near S2, then the bending energy stored in the site-specific TF-DNA complex can be gradually released via bulging of the DNA-loop around TF which in turn stochastically propels the sliding DBD2 of TF towards the promoter located at L as shown in Fig. 1B. There is no direct experimental evidence for this model. However, one can construe this idea indirectly from various other experimental studies. Particularly, Rippe et al. (32) have studied the NtrC (Nitrogen regulatory protein C) system using scanning force microscopy. In this study, they took snapshots of various intermediary states along the process of transcription activation from the closed to the open promoter complex. In their model system, binding of NtrC at its specific site (CRM) activates the downstream closed complex of glnA promoter-RNAP-σ⁵⁴ via looping out of the intervening DNA segment. They have shown that the transition from the inactive-closed form to an active-open promoter complex involved a gradual increase in the bending angle of the intervening DNA. This in turn is positively correlated with an increase in the radius of curvature of the intervening DNA segment which is represented as bulging of the DNA-loop in our propulsion model. Therefore, our assumption that TFs are propelled via an increase in the radius of curvature of the bent DNA is a logical one. Here the asymmetric binding energy profile is essential to break the symmetry of the stochastic force acting on the sliding TFs (52). This is also a logical assumption since S1-DBD1 is a strong site-specific interaction and S2-DBD2 is approximately a nonspecific interaction by definition. Fig. 1C shows another possibility in the formation of DNA-loop which is common in case of the silencing mode of TFs. Based on these, the dynamical position X of TF on DNA obeys the following Langevin type stochastic differential equation (53–55).

$$\frac{dX}{dt} = D_c F(X) + \sqrt{2 D_c}\,\Gamma_t; \quad \langle \Gamma_t \rangle = 0; \quad \langle \Gamma_t \Gamma_{t'} \rangle = \delta(t - t') \tag{2}$$

In Eq. 2, F(X) = −dE/dX = 2π²a/X² (bp⁻¹) is the force acting on TF that is generated by the bending potential E ~ E_elastic + E_bond upon bulging of the DNA-loop, Γ_t is the δ-correlated Gaussian white noise and D_c (bp²/s) is the 1D diffusion coefficient associated with the sliding of TF. The energy involved in the bonding interactions is a constant so that it does not contribute to the force term. Here we ignore the energy dissipation via chain entropy of the bulging DNA-loop mainly because binding of TFs at their specific sites attenuates the conformational fluctuations at the DNA-TF interface (7, 20, 21). The Fokker-Planck equation describing the probability of observing a given X at time t with the condition that X = X₀ at t = t₀ can be written as follows (53, 54).

$$\frac{\partial P(X, t\,|\,X_0, t_0)}{\partial t} = -D_c\frac{\partial}{\partial X}\left[F(X)\,P(X, t\,|\,X_0, t_0)\right] + D_c\frac{\partial^2 P(X, t\,|\,X_0, t_0)}{\partial X^2} \tag{3}$$

The form of F(X) suggests that it can propel the DBD2 of TF only for short distances since lim_X→∞ F(X) = 0, although such a limit will be meaningless for X > 2π²a where E_elastic will be close to the background thermal energy. The initial condition for Eq. 3 will be P(X, t₀ | X₀, t₀) = δ(X − X₀) where X₀ = 2πr_P and the boundary conditions are given as follows.

$$\left[-D_c F(X)\,P + D_c\frac{\partial P}{\partial X}\right]_{X = X_0} = 0; \quad P(L, t\,|\,X_0, t_0) = 0 \tag{4}$$

Here X₀ acts as a reflecting boundary for a given size of TF and L is the absorbing boundary where the promoter is located. The asymmetric energy profile with respect to S1 and S2 is required for the validity of the reflecting boundary condition at X₀. Upon reaching the promoter via loop-expansion of the intervening DNA segment, TFs subsequently activate the transcription. The mean first passage time (MFPT) T_B(X) associated with the DBD2 of TF to reach the promoter location L starting from an arbitrary X ∈ (X₀, L) obeys the following backward type Fokker-Planck equation along with the appropriate boundary conditions (6, 7).

$$D_c\left[\frac{d^2 T_B(X)}{dX^2} + F(X)\frac{dT_B(X)}{dX}\right] = -1; \quad \left.\frac{dT_B(X)}{dX}\right|_{X = X_0} = 0; \quad T_B(L) = 0 \tag{5}$$

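The integral solution of Eqs. 5 can be expressed as follows.

$$T_B(X) = \frac{1}{D_c}\int_X^L \exp\left(\frac{2\pi^2 a}{y}\right)\left[\int_{X_0}^{y}\exp\left(-\frac{2\pi^2 a}{z}\right)dz\right]dy \tag{6}$$

Here the inner integral can be expressed in terms of the exponential integral (56) and interestingly lim_L→∞ T_B(X) = T_N(X). Here T_N(X) is the mean first passage time required by the DBD2 of TF to reach L via pure 1D sliding in the absence of DBD1, which is a solution of the following differential equation (6, 7, 21).

$$D_c\frac{d^2 T_N(X)}{dX^2} = -1; \quad \left.\frac{dT_N(X)}{dX}\right|_{X = X_0} = 0; \quad T_N(L) = 0; \quad T_N(X) = \frac{(L - X_0)^2 - (X - X_0)^2}{2 D_c} \tag{7}$$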

To obtain the target finding time, one needs to set X = X₀ in Eqs. 6 and 7. One can define the number of times the target finding rate of TF can be accelerated by the looping mediated propulsion of TF over 1D sliding as η_P = [T_N(X₀)/T_B(X₀)] (here the subscript ‘P’ denotes the propulsion model) which is clearly independent of D_c of TF and solely depends on (L, a, and X₀). Explicitly one can write it as,

$$\eta_P = \frac{T_N(X_0)}{T_B(X_0)} = \frac{(L - X_0)^2}{2\int_{X_0}^{L}\exp\left(\frac{2\pi^2 a}{y}\right)\left[\int_{X_0}^{y}\exp\left(-\frac{2\pi^2 a}{z}\right)dz\right]dy} \tag{8}$$

Detailed numerical analysis (see Section 2 of the Supporting Material) suggests that there exists a maximum of η_P at which ∂η_P/∂L = 0 with L = L_opt and clearly, we have lim_L→∞ η_P = 1 (Figs. 2A and B). This is logical since when L > L_opt then η_P → 1 and when L < L_opt then the stored energy is not completely utilized to propel the DBD2 of TF. Further, lim_L→X₀ η_P = 0 since its numerator part goes to zero much faster than the denominator (Fig. S1). The total time required by the TFs to form a synaptosome complex via propulsion mechanism will be τ_P = τ_S + T_B(X).
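The dependence of η_P on L can be explored with a short numerical sketch. It assumes that T_B(X₀) is given by the double-integral solution of the backward equation with a reflecting boundary at X₀ and an absorbing boundary at L, and that T_N(X₀) = (L − X₀)²/(2D_c). The function name, the trapezoidal rule and the grid spacing h are choices made here for illustration only.

```python
import math

def eta_profile(x0, a, l_max, h=0.01):
    """Return a list of (L, eta_P) pairs for L on a grid in (x0, l_max].

    eta_P = T_N(x0) / T_B(x0); D_c cancels in the ratio, so it is set to 1.
    Both integrals are accumulated with the trapezoidal rule in one pass,
    because T_B(x0) for an absorbing boundary at L is a running integral in L.
    """
    c = 2.0 * math.pi ** 2 * a            # F(X) = c / X^2, E(X) = c / X
    n = int(round((l_max - x0) / h))
    inner = 0.0                           # int_{x0}^{y} exp(-c/z) dz
    t_b = 0.0                             # int_{x0}^{y} exp(c/s) * inner(s) ds
    w_prev = math.exp(-c / x0)
    g_prev = 0.0                          # exp(c/y) * inner at the previous node
    profile = []
    for k in range(1, n + 1):
        y = x0 + k * h
        w = math.exp(-c / y)
        inner += 0.5 * h * (w_prev + w)
        g = inner / w                     # equals exp(c/y) * inner
        t_b += 0.5 * h * (g_prev + g)
        t_n = 0.5 * (y - x0) ** 2         # pure 1D sliding, reflecting at x0
        profile.append((y, t_n / t_b))
        w_prev, g_prev = w, g
    return profile

profile = eta_profile(x0=50.0, a=150.0, l_max=1000.0)
l_opt, eta_max = max(profile, key=lambda point: point[1])
print(f"L_opt ~ {l_opt:.1f} bp, maximum eta_P ~ {eta_max:.1f}")
```

With a = 150 bp and X₀ = 50 bp the maximum is expected near L ≈ 140-145 bp, in line with the value of L_opt quoted in the caption of Fig. 2. If the noise is neglected altogether, the transit time reduces to (L³ − X₀³)/(6π²aD_c) and the maximum of the ratio falls at L = (1 + √3)X₀ ≈ 2.7X₀, which suggests why L_opt depends only weakly on a.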

FIGURE 2.

A. Relative efficiency of looping mediated stochastic propulsion of TFs versus normal 1D sliding along DNA. T_N(X₀) is the mean first passage time that is required by TFs to reach the promoter that is located at L, starting from X₀ via 1D sliding. T_B(X₀) is the mean first passage time required by TFs to reach L starting from X₀ via looping mediated stochastic propulsion mechanism. X₀ was iterated as (25, 50, 75, 100, 125, 150, 200) along the arrow while iterating L from X₀ to 1000. The efficiency of looping mediated sliding is strongly dependent on the persistence length of DNA (a), L and X₀ and it is a maximum at L_opt ~ 3X₀. B. Plot of dη_P/dL with respect to L. Here the settings are a ~ 150 bp and X₀ ~ 50 bp and L was iterated from 50 to 1000 bp. Upon solving dη_P/dL = 0 for L numerically one finds that L_opt ~ 142.2 bp. C. Variation of L_opt with respect to X₀. Clearly L_opt ~ 3X₀, is slightly dependent on the persistent length a. Here we have iterated X₀ from 50 to 100 bp and a = (150, 250) bp. The solution for L was searched within the interval (50, 1000) bp. D. Variation of L_opt with respect to changes in a. Here we have iterated a from 100 to 200 bp and X₀ = (125, 150, 200) bp. The solution for L was searched within the interval (50, 1000) bp. The error in the approximation L_opt ~ 3X₀ seems to be < 10% over wide range of a values.

Predictions of the propulsion model

The persistence length of DNA under in vitro conditions is a ~ 150 bp and the radius of gyration of most of the eukaryotic TFs will be in the range r_P ~ 10-15 bp. Therefore, one can set the initial loop length as X₀ = 2πr_P ~ 50-100 bp (57, 58). Numerical evaluation (Fig. 2A) of the expression for η_P (Eq. 8) at different values of X₀, with L iterated from X₀ to 10⁵, suggested that L_opt ~ 3X₀ (see Figs. 2C and 2D). When a ~ 150 bp and X₀ ~ 50-100 bp, then L_opt ~ 150-300 bp. Remarkably, this is the most probable range of the distances between the CRMs and promoters of various genes observed across several genomes (59). The efficiency of the stochastic propulsion will be maximum at L_opt. Although L_opt is not much affected by a, the maximum of η_P is positively correlated with a. This is logical since the stored elastic energy is directly proportional to the persistence length of the polymer. Remarkably, at the optimum L_opt the speed of interactions of the CRM-TFs complex with the promoters will be ~10-25 times faster than the normal 1D sliding.
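The speed-up quoted above can also be checked by simulating the propulsion dynamics directly. The sketch below integrates the Langevin equation dX/dt = D_c F(X) + √(2D_c) Γ_t with F(X) = 2π²a/X² using the Euler-Maruyama scheme, with a reflecting boundary at X₀ and an absorbing boundary at L. The time step, the number of trajectories, the seed and D_c = 1 bp²/s are arbitrary illustrative settings, and the function name is hypothetical.

```python
import math
import random

def mean_first_passage_time(x0, l, a, d_c=1.0, dt=0.05, n_traj=200, seed=1):
    """Euler-Maruyama estimate of the mean time for X to travel from x0 to l."""
    rng = random.Random(seed)
    c = 2.0 * math.pi ** 2 * a             # F(X) = c / X^2
    sigma = math.sqrt(2.0 * d_c * dt)      # noise amplitude per time step
    total = 0.0
    for _ in range(n_traj):
        x, t = x0, 0.0
        while x < l:                       # absorbing boundary at l
            x += d_c * c / (x * x) * dt + sigma * rng.gauss(0.0, 1.0)
            if x < x0:                     # reflecting boundary at x0
                x = 2.0 * x0 - x
            t += dt
        total += t
    return total / n_traj

X0, L, A, D_C = 50.0, 150.0, 150.0, 1.0
t_b = mean_first_passage_time(X0, L, A, D_C)
t_n = (L - X0) ** 2 / (2.0 * D_C)          # pure 1D sliding, reflecting at X0
print(f"T_B ~ {t_b:.0f} s, T_N = {t_n:.0f} s, speed-up ~ {t_n / t_b:.1f}")
```

Only the ratio T_N/T_B is meaningful here, since the absolute times scale with 1/D_c. The estimate carries a sampling error of a few percent for 200 trajectories and a small bias from the finite time step.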

Tethered sliding model

In this model, the tethered DBD2 of TF searches for the promoter region with intact site-specific bonding interactions at DBD1-S1. Actually, DBD2-S2 is a nonspecific type binding interaction by definition and the corresponding specific interactions occur whenever DBD2 finds the promoter region (P) and forms the site-specific DBD2-P complex. Here the tethered random walker (DBD2, which is actually tied to the DNA thread at DBD1-S1) wanders over 3D space and randomly forms nonspecific contacts with other segments of the same DNA polymer, analogous to the ring-closure events of intersegmental transfers. Before dissociation, there is always a possibility for the DBD2 to scan a DNA stretch of random length for the presence of its specific site P. When the length of DNA connecting DBD1 and DBD2 is X for an arbitrary nonspecific contact of DBD2, then the potential energy barrier acting on such random scanning will be E ≃ (2π²a/X) + (3/2) ln(πX/6). Interestingly, this potential energy barrier attains a minimum E_min = [3/2](1 + ln(2π³a/9)) at X_C = 4π²a/3. Forward and reverse movements of such a tethered random walker drive X to X + 1 or X − 1. In contrast to the propulsion model, here we have not ignored the entropy component of the potential E since the interconnecting DNA segment is in free loop form. The force generated by such a potential will be F(X) = 2π²a/X² − 3/(2X). Upon inserting this force term into Eq. 5 one finally obtains the following result.

$$T_U(X) = \frac{1}{D_c\sqrt{2\pi a}}\int_X^L y^{3/2}\exp\left(\frac{2\pi^2 a}{y}\right)\left[\mathrm{erf}\left(\pi\sqrt{\frac{2a}{X_0}}\right) - \mathrm{erf}\left(\pi\sqrt{\frac{2a}{y}}\right)\right]dy \tag{9}$$

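Here erf(Y) = (2/√π) ∫₀^Y exp(−s²) ds is the error function integral (56), and T_U(X) is the MFPT required by a tethered random walker to find its specific site located at L starting from X (this is the initial loop length) anywhere within (X₀, L) where X₀ is a reflecting boundary and L is an absorbing boundary. Since the potential function has a minimum at X_C, one can consider the following two different limiting regimes.

$$E(X) \simeq \frac{2\pi^2 a}{X},\; F(X) \simeq \frac{2\pi^2 a}{X^2} \quad (X \ll X_C); \qquad E(X) \simeq \frac{3}{2}\ln\left(\frac{\pi X}{6}\right),\; F(X) \simeq -\frac{3}{2X} \quad (X \gg X_C)$$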

One can define the number of times the target finding rate of TF can be accelerated by the tethered sliding of TF as η_S = T_N(X)/T_U(X) (here the subscript ‘S’ denotes the tethered sliding model) which is clearly independent of D_c of TF and solely depends on (L, a, and X₀). In contrast to the propulsion model, one finds that lim_L→∞ η_S = 0. In these calculations we have not included the looping mediated nonspecific association time required by the DBD2 of TF. This in fact further increases the overall MFPT of the tethered sliding model. The rate associated with the formation of the initial (nonspecific contact) loop with length X can be written as k_NL ≃ k_t exp(−E) where k_t (s⁻¹) is the maximum achievable rate under zero potential. Clearly, k_NL will be a maximum at X_C which is the most probable initial landing position of the tethered DBD2 via DNA-looping. The total time required by the CRMs-TFs system to form the synaptosome complex in this model will be τ_TS = τ_S + 1/k_NL + T_U(X) which will attain a local minimum approximately at X = X_C. One can also define η_NL = k_NL/k_t = exp(−E) which will attain the maximum value η_NL ~ 6.7 × 10⁻⁶ at X_C for a ~ 150 bp.

Predictions of the tethered sliding model

The tethered sliding model predicts the most probable distance of the CRMs of TFs, i.e. S1, from the transcription start sites (TSS) as X_C. At this distance, the rate of looping mediated synaptosome complex formation of TFs will be at its maximum. Upon setting X = X_C in η_S and numerically iterating L from 3000 to 10000 bp with a ~ 150 bp and the left reflecting boundary at X₀, one finds a critical distance L_C such that η_S > 1 when L < L_C and η_S < 1 when L > L_C. Particularly when X₀ < 100 bp, one can define the critical distance of TSS from CRM in the tethered sliding model as L_C ~ 3X_C. This critical distance decreases with an increase in X₀. These numerical results are demonstrated in Fig. S3 of the Supporting Materials.
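The critical distance L_C can be estimated with a short numerical sketch. It assumes the potential E(X) = 2π²a/X + (3/2) ln(πX/6), the double-integral form of the MFPT with a reflecting boundary at X₀, a start at X = X_C, and T_N(X) = [(L − X₀)² − (X − X₀)²]/(2D_c) for pure sliding. The function names, the trapezoidal rule and the grid spacing are illustrative choices.

```python
import math

def boltzmann_weight(x, c):
    """exp(-E) for E = c/x + (3/2) ln(pi*x/6), energies in k_BT."""
    return math.exp(-c / x) * (math.pi * x / 6.0) ** -1.5

def critical_distance(x0=50.0, a=150.0, l_max=10000.0, h=0.5):
    """Smallest L (bp) at which eta_S = T_N(X_C)/T_U(X_C) drops below 1."""
    c = 2.0 * math.pi ** 2 * a
    x_c = 2.0 * c / 3.0                        # X_C = 4*pi^2*a/3
    # Inner integral of exp(-E) from X0 up to the starting point X_C.
    n0 = int(math.ceil((x_c - x0) / h))
    h0 = (x_c - x0) / n0
    inner = 0.0
    w_prev = boltzmann_weight(x0, c)
    for k in range(1, n0 + 1):
        w = boltzmann_weight(x0 + k * h0, c)
        inner += 0.5 * h0 * (w_prev + w)
        w_prev = w
    # Outer integral from X_C to L, accumulated while L grows (D_c = 1).
    t_u = 0.0
    g_prev = inner / w_prev                    # exp(E) * inner at X_C
    n = int((l_max - x_c) / h)
    for k in range(1, n + 1):
        y = x_c + k * h
        w = boltzmann_weight(y, c)
        inner += 0.5 * h * (w_prev + w)
        g = inner / w
        t_u += 0.5 * h * (g_prev + g)
        t_n = 0.5 * ((y - x0) ** 2 - (x_c - x0) ** 2)
        if t_n < t_u:                          # eta_S has fallen below 1
            return y
        w_prev, g_prev = w, g
    return None

l_c = critical_distance()
x_c = 4.0 * math.pi ** 2 * 150.0 / 3.0
print(f"L_C ~ {l_c:.0f} bp, L_C / X_C ~ {l_c / x_c:.1f}")
```

For X₀ = 50 bp and a = 150 bp the crossing is expected at roughly three times X_C, and repeating the call with a larger X₀ should give a smaller L_C.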

COMPUTATIONAL ANALYSIS

The core assumptions of the propulsion model are 1) TFs have two different DNA binding domains (DBD1 and DBD2), 2) correspondingly there should be two different binding sites (S1 and S2) in the upstream region (CRMs) of the transcription start site (TSS), 3) out of which one that is closer to TSS should be weaker in binding strength than the one that is far away from TSS. This in turn creates the required asymmetry in the binding energy profile of TFs with the CRMs. The main prediction of the propulsion model is 4) that L_opt ~ 3X₀ where L_opt is the optimum distance between the CRMs and the promoters and X₀ is the distance between the two different binding-sites of TFs (S1 and S2) within the cis-regulatory module. 5) Tethered sliding model predicted the most probable distance of the CRMs (S1 corresponding to DBD1) of TFs from the transcription start site as X_C = 4π²a/3 ~ 2000 bp for a ~ 150 bp.

Datasets and analysis

To check whether such TFs-CRMs systems with properties 1) to 5) exist, we analyzed the upstream 5000 bp sequences of various genes of the human and mouse genomes. The upstream 5000 bp sequences were obtained from the UCSC genome database (February 2009 assembly, hg19 version for the human genome and December 2011 assembly, mm10 version for the mouse genome) and the position weight matrices (PWMs) (60, 61) of various TFs of mouse and human were obtained from the publicly available JASPAR database (62, 63). There were 21929 sequences from the mouse genome and 28824 sequences from the human genome. Using the PWMs of the available TFs we scanned the upstream sequences of all the genes in the respective genome and generated the score table based on the following equation (60).

$$S_{v,i} = \sum_{w=0}^{q-1}\ln\left(\frac{f_{b,w}}{f_b}\right)$$

In this equation S_v,i is the score value of the PWM at the i^th position upstream of the transcription start site on the v^th sequence, q is the length of the binding stretch of the corresponding TF, f_b is the background probability of observing base b in the corresponding genome, and f_{b, w} is the probability of observing base b at position w of the specific binding sites of TFs, where b denotes the base found at position i + w of the v^th sequence. Here f_b was calculated from the random sequences of the given genome available with the UCSC database. We considered only those TFs showing two different putative binding sites upstream of the promoters of various genes. This will prove our second assumption. The binding site close to TSS is S2 and the one away from TSS is S1 by definition. The distance between these sites is X₀. We also constructed the distribution of the distances of S1 and S2 from the transcription start site. There is a strong positive correlation between the score value and the binding energy of TFs (60). Therefore, the sign of the difference in the score values of these two putative binding sites of a given TF will give the information regarding the direction of the asymmetry of the binding energy profile that is required to prove our third assumption. Here the absolute distance between these binding sites will be X₀ and their distance from the transcription start site will be the L of our model. Checking for the relationship L_opt ~ 3X₀ will prove the fourth proposition of our model. Computing the distribution of the distances of S1 from the TSS will confirm the validity of the fifth proposition predicted by the tethered sliding model.
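The scanning step can be illustrated with a toy example. The sketch below assumes a natural-log odds score summed over the q positions of the PWM, and it takes L as the distance of S1 from the TSS with the TSS placed at the end of the upstream string. The 4 bp motif, the probabilities, the toy sequence and the cutoff are invented for illustration and do not come from JASPAR or UCSC.

```python
import math

def pwm_scores(seq, pwm, background):
    """Log-odds score for every window of len(pwm) bases along seq."""
    q = len(pwm)
    return [
        sum(math.log(pwm[w][seq[i + w]] / background[seq[i + w]]) for w in range(q))
        for i in range(len(seq) - q + 1)
    ]

def putative_sites(scores, cutoff):
    """Start positions of all windows scoring at or above the cutoff."""
    return [i for i, s in enumerate(scores) if s >= cutoff]

# Hypothetical PWM for the consensus ACGT (0.85 for the consensus base).
consensus = "ACGT"
pwm = [{b: (0.85 if b == c else 0.05) for b in "ACGT"} for c in consensus]
background = {b: 0.25 for b in "ACGT"}

# Toy upstream sequence; the TSS is assumed to sit just after the last base.
upstream = "GG" + "ACGT" + "G" * 10 + "ACGA" + "G" * 6
scores = pwm_scores(upstream, pwm, background)
sites = putative_sites(scores, cutoff=2.0)

s1, s2 = sites[0], sites[-1]            # S1 is far from the TSS, S2 is closer
x0 = s2 - s1                            # spacing between the two sites
l_tss = len(upstream) - s1              # distance of S1 from the TSS
asymmetry = scores[s1] - scores[s2]     # positive if S1 is the stronger site
print(sites, x0, l_tss, round(asymmetry, 2))
```

A positive asymmetry corresponds to the binding energy profile required by the propulsion model, and the pair (x0, l_tss) is what would be compared against L_opt ~ 3X₀.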

In parallel, we also generated score table for random sequences using the same PWM from which we obtained the score distribution and the cutoff score value for the given weight matrix corresponding to a given p-value. In our calculations, we have set the p-value < 10⁻⁶ for defining the putative specific binding sites of TFs. We used the random sequences associated with each genome that is available at UCSC database to compute the probability of occurrence of putative binding sites by chance. We considered random sequences of size 5 × 10⁶ bps and fragmented it into 10³ number of sequences with length of 5000 bps. Then we scanned each random sequence with the same PWM and obtained the number of putative CRMs (false positives). The probability of observing a CRM site by chance will be calculated as p_NF = number of false positives / 1000.
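The cutoff selection and the chance probability p_NF can be sketched as follows. This is one simple way to derive an empirical cutoff from random-sequence scores, and the two function names and the toy numbers are hypothetical. The tie handling (a hit must score strictly above the cutoff) is a choice made here.

```python
def empirical_cutoff(random_scores, p_value):
    """Score such that at most a fraction p_value of random scores exceed it."""
    ordered = sorted(random_scores, reverse=True)
    k = int(p_value * len(ordered))          # number of chance hits tolerated
    return ordered[min(k, len(ordered) - 1)]

def false_positive_probability(fragment_scores, cutoff):
    """p_NF = (windows scoring above the cutoff) / (number of random fragments)."""
    hits = sum(1 for fragment in fragment_scores for s in fragment if s > cutoff)
    return hits / len(fragment_scores)

# Toy score distribution standing in for the scan of random sequences.
random_scores = [i / 1000 for i in range(1000)]
cutoff = empirical_cutoff(random_scores, p_value=0.125)

# Toy scores for four random fragments (the paper uses 1000 fragments of 5000 bp).
fragments = [[0.1, 0.9], [0.2, 0.3], [0.95, 0.99], [0.0, 0.1]]
p_nf = false_positive_probability(fragments, cutoff)
print(cutoff, p_nf)
```

With a p-value as small as 10⁻⁶, a random sample of 5 × 10⁶ bp leaves only a handful of windows above the cutoff, so the estimate of p_NF is sensitive to the size of the random sample.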

RESULTS AND DISCUSSION

The main limitation of the propulsion model is the requirement of a huge energy input for the initial bending of DNA around the TF of interest. This needs to be derived either in the form of ATP hydrolysis or in the form of binding energy derived from the combinatorial multiprotein TFs. For example, bending of a linear DNA with size of 50-100 bp into a loop requires the hydrolysis of at least 3-5 ATPs (using E_bend = E_elastic + E_entropy, 1 ATP ~ 12 k_BT). Investment of such energy input is required by the CRM-TF system to actively slide in a directional dependent manner towards the promoter. On the other hand, tethered sliding of TFs does not require such a huge energy input since there is no restriction on the initial loop length. As a result, directional dependent movement of TFs is not possible in the tethered sliding model. However, the probability density function associated with the initial loop length will be dictated by the bending energy profile. Actually, E_bend will be a minimum at X_C ≃ 4π²a/3 where the average search time required to form the synaptosome complex will be at minimum (49). When X < X_C then E_bend ∝ X⁻¹. When X > X_C then E_bend ∝ ln(X). When a ~ 150 bp and X_C ~ 2000 bp then the minimum of E_bend ~ 12 k_BT which requires the hydrolysis of at least 1 ATP. These results are demonstrated in Fig. 3. Including the models presented in this paper, one can consider the following four possible modes.

Propulsion mechanism. This requires a large free energy input for the initial loop formation and allows directional movement of TFs towards the promoter.
Tethered sliding mechanism. This requires minimal free energy input for the formation of the initial loop. Although directional movement of TFs is not possible here, the free energy barrier involved in the initial loop formation stage restricts the initial landing position of DBD2 of TFs to sites close to the promoters.
Repeated binding-unbinding mode. This mechanism is similar to the tethered sliding mode, with restrictions on the sliding dynamics. Here the search for the promoters is achieved via repeated binding-unbinding of the tethered TFs. Directional movement of TFs along DNA is not possible in this mode.
Parallel searching of two DBDs of TFs. Here two different DBDs of TFs (DBD1 and DBD2) search for their cognate sites on DNA (S1 and P respectively) independently through a combination of 1D and 3D diffusion. When these DBDs bind their cognate sites simultaneously, looping of the intervening DNA segment occurs as a result. This mechanism works well only for single-TF transcription activation such as the Lac I system, and it is highly improbable for the combinatorial binding of TFs in the gene regulation of eukaryotes. Nevertheless, this mode can be a parallel (but slow) pathway of loop formation for the mechanisms listed above.

FIGURE 3.

Variation of the propulsion efficiency η_P and bending energy with respect to changes in the initial loop length X₀. Here the settings are a ~ 150 bp, L = 3X₀ ~ L_opt. We computed η_P = T_N(X₀)/T_B(X₀) with L = 3X₀ so that η_P will be close to its maximum. Here the subscript Z = (entropy, bend, elastic, enthalpy). E_bend = E_elastic + E_entropy where E_elastic ≃ 3000/X₀ and E_entropy ≃ 3ln(πX₀/6)/2 which is ~12 k_BT at X₀ ~ 2000 bp. E_elastic ≤ 1 k_BT when X₀ ≥ 3000 bp. Clearly, the bending energy of linear DNA is always ≥ 12 k_BT irrespective of the length. Shaded regions are the most probable X₀ values observed in the natural systems where the optimum distance between the transcription factor binding sites and promoters L_opt ~ 3X₀ ~ 150-300 bp.
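The energy expressions quoted in this caption are easy to evaluate numerically. The sketch below assumes the two-term form E_bend = 2π²a/X + (3/2)ln(πX/6) from Appendix A, with energies in k_BT and lengths in bp, and uses the conversion 1 ATP ~ 12 k_BT from the text. The function names are hypothetical.

```python
import math

KBT_PER_ATP = 12.0  # conversion quoted in the text: 1 ATP ~ 12 k_BT


def elastic_energy(x_bp, a_bp=150.0):
    """Elastic term 2*pi^2*a/X in k_BT (about 3000/X for a = 150 bp)."""
    return 2.0 * math.pi ** 2 * a_bp / x_bp


def entropic_energy(x_bp):
    """Entropic term (3/2)*ln(pi*X/6) in k_BT."""
    return 1.5 * math.log(math.pi * x_bp / 6.0)


def bending_energy(x_bp, a_bp=150.0):
    """Total bending energy E_bend = E_elastic + E_entropy in k_BT."""
    return elastic_energy(x_bp, a_bp) + entropic_energy(x_bp)


def critical_loop_length(a_bp=150.0):
    """Loop length X_C = 4*pi^2*a/3 at which E_bend is at its minimum."""
    return 4.0 * math.pi ** 2 * a_bp / 3.0


def atp_equivalents(x_bp, a_bp=150.0):
    """Bending energy expressed as a number of hydrolyzed ATPs."""
    return bending_energy(x_bp, a_bp) / KBT_PER_ATP
```

For a = 150 bp, `critical_loop_length()` is close to 2000 bp, and `atp_equivalents(100)` and `atp_equivalents(50)` fall near 3 and 5 respectively, in line with the 3-5 ATP estimate for 50-100 bp loops.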

The results of the analysis of the upstream sequences of various human and mouse genes are shown in Figs. 4A and B. Clearly, there are several TFs with two different putative binding sites (S1, S2) upstream of the transcription start site (TSS). Of these, S1 is far from the TSS and S2 is close to it. The distributions of the distances of S1 and S2 from the respective TSS are shown in Figs. 5A1-2 and 5B1-2. The distributions of the distances between S1 and S2 are shown in Figs. 5A3 and 5B3. The distributions of the asymmetry in the binding energy profiles of S1 and S2 are shown in Figs. 5A4 and 5B4. Although our computational analysis suggests that L ~ 3X₀ is not a strict rule applicable to all genes, several such CRM-TF systems follow the prediction of the propulsion model, i.e. L_opt ~ 3X₀, where X₀ is the distance between S1 and S2 and L_opt is the optimum distance of S2 from the transcription start site. Although the most probable location of S2 is close to the promoter region, the most probable location of S1 is around ~2500 bp away from the promoter in both the mouse and human genomes. This is in line with the tethered sliding model, which predicted the critical distance of CRMs from the promoter to be around X_C ~ 2000 bp. The asymmetry in the relative binding strengths of these sites is equally probable towards and away from the transcription start site.
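The comparison between L and 3X₀ for a pair of putative sites can be sketched as below. This is an illustration only: positions are assumed to be given as upstream distances from the TSS in bp, and the relative tolerance used to decide whether a pair "follows" L ~ 3X₀ is a hypothetical parameter, not a value taken from the analysis.

```python
def loop_geometry(s1_upstream_bp, s2_upstream_bp):
    """Return (X0, L) in bp for a site pair, where S1 is the distal site.

    X0 is the distance between S1 and S2; L is the distance of S2 from the TSS.
    """
    if s1_upstream_bp <= s2_upstream_bp:
        raise ValueError("S1 must lie further upstream than S2")
    return s1_upstream_bp - s2_upstream_bp, s2_upstream_bp


def follows_propulsion_rule(s1_upstream_bp, s2_upstream_bp, tolerance=0.25):
    """Check whether L lies within a relative tolerance of 3*X0.

    The tolerance is an assumption made for this sketch.
    """
    x0, distance_to_tss = loop_geometry(s1_upstream_bp, s2_upstream_bp)
    return abs(distance_to_tss - 3 * x0) <= tolerance * 3 * x0
```

For example, a pair with S1 at 400 bp and S2 at 300 bp upstream gives X₀ = 100 bp and L = 300 bp, which satisfies L = 3X₀ exactly, whereas S1 at 2500 bp and S2 at 100 bp does not.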

FIGURE 4.

Blue dots are the computed distances X₀ between the putative cis-regulatory modules, and L is their distance from the transcription start site. We took the position weight matrices of human and mouse available in the JASPAR database and scanned the upstream 5000 bp sequences of various human and mouse genes. When there were two such putative CRMs, the distance between them was computed along with their distances from the transcription start site. The propulsion model suggests that L_opt ~ 3X₀. Although our computational analysis suggests that this is not a strict rule applicable to all genes, several such CRM-TF systems follow the prediction of the propulsion model. A. Mouse. B. Human.

FIGURE 5.

Computational data analysis results. The tethered sliding model predicted the most probable distance of the CRMs of TFs from the transcription start sites as X_C ~ 2000 bp for a DNA persistence length of a ~ 150 bp. The distribution of the upstream location of CRMs of various TFs shows a maximum around ~2500 bp (A1, B1), which is in line with the tethered sliding model. Putative binding sites were defined with a p-value < 10⁻⁶. Although asymmetry in the binding energy profiles of S1 and S2 is observed, there is no preferential directionality, as is evident from the symmetry in the sign of the differences of the PWM score values of S1 and S2. The probability of observing a CRM site by chance was estimated as p_NF < 10⁻³. A1-4. Mouse. B1-4. Human.

Limitations of the models

In multiprotein mediated DNA looping, there is always a possibility that two different TFs interact with S1 and P respectively and that the looping is mediated via protein-protein interactions among these TFs. In both the propulsion and tethered sliding models, we have assumed that the nonspecifically bound DBD2 of the TF does not dissociate until reaching the promoter. Nevertheless, earlier studies suggested that this assumption is valid only within the average sliding length L_S of the TF, which is set by the dissociation rate constant k_r (7); k_r in turn depends on μ_NS, the average nonspecific binding energy associated with DBD2 of the TF of interest. Clearly μ_NS > 12 k_BT is required to attain L_S ~ 300 bp, which can be achieved via multiprotein binding.

In the absence of energy input, biological systems can overcome the looping energy barrier in three possible ways: 1) multiprotein binding (38), which could be the origin of the combinatorial TFs in the process of evolution, 2) placing sequence mediated kinetic traps corresponding to DBD2 in between CRMs and promoters (18), and 3) placing nucleosomes all over the genomic DNA to decrease the E_entropy component. All these aspects are observed in natural systems. In multiprotein binding, the free energies associated with the DNA-protein and protein-protein interactions among TFs are utilized in a cooperative manner for the looping of DNA. Here DBD1 and DBD2 may come from different proteins. Vilar and Saiz (38) showed that the looping of DNA is possible even at small concentrations of TFs when the number of TFs in a combination is sufficiently large. Multiprotein binding eventually increases the X₀ values. However, increasing X₀ decreases both the maximum possible acceleration of the TF search dynamics and the energy barrier associated with DNA looping. As a result, natural systems have optimized X₀ between these two opposing factors for maximum efficiency by manipulating the number of TFs in the combinatorial binding.

CONCLUSIONS

In summary, we have shown for the first time that DNA-loops can stochastically propel the transcription factors along DNA from their specific binding sites towards the promoters. We have shown that the source of propulsion is the elastic energy stored in the specific looped DNA-protein complex. Elastic and entropic energy barriers associated with the looping of DNA shape the distribution of distances between TF binding sites and promoters in the process of evolution. We argued that the commonly observed multiprotein binding in gene regulation might have been acquired over evolution to overcome the looping energy barrier. The presence of nucleosomes on the genomic DNA of eukaryotes is required to reduce the entropy barrier associated with the looping.

APPENDIX A

The energy component E_entropy that is required to compensate for the chain entropy loss of a Gaussian chain can be computed as follows. Let us assume that the looping of DNA occurs when |r| ≤ ξ, where r is the end-to-end distance vector, ξ is the minimum looping-distance (in m) and Xl_d is the maximum length of the DNA polymer. The probability density function of the vector r is P(r) = (3/(2πXb²))^(3/2) exp(−3|r|²/(2Xb²)) (64, 65), where X is the number of monomers in the polymer and b is the average distance between the monomers. The entropy loss upon looping of DNA is ΔS_loop ≃ ln (P_l/P_Ω) (measured in k_B units), where P_l is the probability of finding the loops (|r| ≤ ξ) and P_Ω is the probability of finding all the configurations including loops (|r| ≤ Xl_d). Explicitly, ΔS_loop can be written in terms of the error function Erf (56).

When ξ ≃ b ≃ l_d is very small, ΔS_loop ≃ −(3/2)ln (πX/6) for large values of X (49), so that E_entropy ≃ (3/2)ln (πX/6). This expression for the entropy is closely linked with the Jacobson-Stockmayer factor, or J-factor, associated with polymer looping (37). One finally obtains E_bend ≃ 2π²a/X + (3/2)ln (πX/6).
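As a worked check, the closed form E_bend ≃ 2π²a/X + (3/2)ln (πX/6) can be minimized directly. The following sketch assumes that this two-term expression holds over the whole range of X, with energies in k_BT and lengths in bp; it introduces no symbols beyond those used in the main text.

```latex
\begin{align}
E_{bend}(X) &\simeq \frac{2\pi^{2}a}{X} + \frac{3}{2}\ln\left(\frac{\pi X}{6}\right), \\
\frac{dE_{bend}}{dX} &= -\frac{2\pi^{2}a}{X^{2}} + \frac{3}{2X} = 0
\quad\Rightarrow\quad X_{C} = \frac{4\pi^{2}a}{3}, \\
E_{bend}(X_{C}) &= \frac{3}{2} + \frac{3}{2}\ln\left(\frac{2\pi^{3}a}{9}\right).
\end{align}
```

For a = 150 bp this gives X_C ≈ 1974 bp and E_bend(X_C) ≈ 11.9 k_BT, which agrees with X_C ~ 2000 bp and with the lower bound of about 12 k_BT stated in the caption of Fig. 3. The second derivative at X_C is positive, so the stationary point is a minimum.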

REFERENCES

1. Alberts, B. 2002. Molecular biology of the cell. Garland Science, New York.
2. Ptashne, M. 1986. A genetic switch: gene control and phage [lambda]. Cell Press; Blackwell Scientific Publications, Cambridge, Mass.; Palo Alto, Calif.
3. Ptashne, M., and A. Gann. 2002. Genes & signals. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
4. Berg, O. G., R. B. Winter, and P. H. von Hippel. 1981. Diffusion-driven mechanisms of protein translocation on nucleic acids. 1. Models and theory. Biochemistry 20:6929–6948.
5. Winter, R. B., O. G. Berg, and P. H. von Hippel. 1981. Diffusion-driven mechanisms of protein translocation on nucleic acids. 3. The Escherichia coli lac repressor--operator interaction: kinetic measurements and conclusions. Biochemistry 20:6961–6977.
6. Murugan, R. 2004. DNA-protein interactions under random jump conditions. Phys Rev E Stat Nonlin Soft Matter Phys 69:011911.
7. Murugan, R. 2007. Generalized theory of site-specific DNA-protein interactions. Phys Rev E Stat Nonlin Soft Matter Phys 76:011901.
8. Niranjani, G., and R. Murugan. 2016. Generalized theory on the mechanism of site-specific DNA-protein interactions. Journal of Statistical Mechanics: Theory and Experiment 2016:053501.
9. Koslover, E. F., M. A. Diaz de la Rosa, and A. J. Spakowitz. 2011. Theoretical and computational modeling of target-site search kinetics in vitro and in vivo. Biophys J 101:856–865.
10. Slutsky, M., and L. A. Mirny. 2004. Kinetics of protein-DNA interaction: facilitated target location in sequence-dependent potential. Biophys J 87:4021–4035.
11. Wunderlich, Z., and L. A. Mirny. 2008. Spatial effects on the speed and reliability of protein-DNA search. Nucleic Acids Res 36:3570–3578.
12. Shvets, A. A., and A. B. Kolomeisky. 2016. Crowding on DNA in Protein Search for Targets. J Phys Chem Lett 7:2502–2506.
13. Beshnova, D. A., A. G. Cherstvy, Y. Vainshtein, and V. B. Teif. 2014. Regulation of the nucleosome repeat length in vivo by the DNA sequence, protein concentrations and long-range interactions. PLoS Comput Biol 10:e1003698.
14. Parmar, J. J., D. Das, and R. Padinhateeri. 2016. Theoretical estimates of exposure timescales of protein binding sites on DNA regulated by nucleosome kinetics. Nucleic Acids Res 44:1630–1641.
15. Parmar, J. J., J. F. Marko, and R. Padinhateeri. 2014. Nucleosome positioning and kinetics near transcription-start-site barriers are controlled by interplay between active remodeling and DNA sequence. Nucleic Acids Res 42:128–136.
16. Teif, V. B., and K. Rippe. 2011. Nucleosome mediated crosstalk between transcription factors at eukaryotic enhancers. Phys Biol 8:044001.
17. Shvets, A., M. Kochugaeva, and A. B. Kolomeisky. 2016. Role of Static and Dynamic Obstacles in the Protein Search for Targets on DNA. J Phys Chem B 120:5802–5809.
18. Niranjani, G., and R. Murugan. 2016. Theory on the mechanism of site-specific DNA-protein interactions in the presence of traps. Physical Biology 13:046003.
19. Lange, M., M. Kochugaeva, and A. B. Kolomeisky. 2015. Dynamics of the Protein Search for Targets on DNA in the Presence of Traps. J Phys Chem B 119:12410–12416.
20. Kalodimos, C. G., N. Biris, A. M. Bonvin, M. M. Levandoski, M. Guennuegues, R. Boelens, and R. Kaptein. 2004. Structure and flexibility adaptation in nonspecific and specific protein-DNA complexes. Science 305:386–389.
21. Murugan, R. 2010. Theory of site-specific DNA-protein interactions in the presence of conformational fluctuations of DNA binding domains. Biophys J 99:353–359.
22. Zhou, H. X. 2011. Rapid search for specific sites on DNA through conformational switch of nonspecifically bound proteins. Proc Natl Acad Sci U S A 108:8651–8656.
23. Ando, T., and J. Skolnick. 2014. Sliding of proteins non-specifically bound to DNA: Brownian dynamics studies with coarse-grained protein and DNA models. PLoS Comput Biol 10:e1003990.
24. Murugan, R. 2009. Packaging effects on site-specific DNA-protein interactions. Phys Rev E Stat Nonlin Soft Matter Phys 79:061920.
25. Khazanov, N., A. Marcovitz, and Y. Levy. 2013. Asymmetric DNA-search dynamics by symmetric dimeric proteins. Biochemistry 52:5335–5344.
26. Marcovitz, A., and Y. Levy. 2013. Obstacles may facilitate and direct DNA search by proteins. Biophys J 104:2042–2050.
27. Marcovitz, A., and Y. Levy. 2013. Weak frustration regulates sliding and binding kinetics on rugged protein-DNA landscapes. J Phys Chem B 117:13005–13014.
28. Vuzman, D., Y. Hoffman, and Y. Levy. 2012. Modulating protein-DNA interactions by post-translational modifications at disordered regions. Pac Symp Biocomput:188–199.
29. Murugan, R. 2018. Theory of Site-Specific DNA-Protein Interactions in the Presence of Nucleosome Roadblocks. Biophysical Journal 114:2516–2529.
30. Schleif, R. 1988. DNA looping. Science 240:127–128.
31. Schleif, R. 1992. DNA looping. Annu Rev Biochem 61:199–223.
32. Rippe, K., M. Guthold, P. H. von Hippel, and C. Bustamante. 1997. Transcriptional activation via DNA-looping: visualization of intermediates in the activation pathway of E. coli RNA polymerase x sigma 54 holoenzyme by scanning force microscopy. J Mol Biol 270:125–138.
33. Mulligan, P. J., Y. J. Chen, R. Phillips, and A. J. Spakowitz. 2015. Interplay of Protein Binding Interactions, DNA Mechanics, and Entropy in DNA Looping Kinetics. Biophys J 109:618–629.
34. Lewin, R. A., D. M. Crothers, D. L. Correll, and B. E. Reimann. 1964. A Phage Infecting Saprospira Grandis. Can J Microbiol 10:75–85.
35. Murugan, R. 2010. Theory on the mechanism of distal action of transcription factors: looping of DNA versus tracking along DNA. Journal of Physics A: Mathematical and Theoretical 43:415002.
36. Grindley, N. D., K. L. Whiteson, and P. A. Rice. 2006. Mechanisms of site-specific recombination. Annu Rev Biochem 75:567–605.
37. Zhang, Y., A. E. McEwen, D. M. Crothers, and S. D. Levene. 2006. Statistical-mechanical theory of DNA looping. Biophys J 90:1903–1912.
38. Vilar, J. M., and L. Saiz. 2006. Multiprotein DNA looping. Phys Rev Lett 96:238103.
39. Murugan, R. 2011. Theory on thermodynamic coupling of site-specific DNA-protein interactions with fluctuations in DNA-binding domains. Journal of Physics A: Mathematical and Theoretical 44:505002.
40. Mitarai, N., I. B. Dodd, M. T. Crooks, and K. Sneppen. 2008. The generation of promoter-mediated transcriptional noise in bacteria. PLoS Comput Biol 4:e1000109.
41. Murugan, R. 2011. Theory on the dynamic memory in the transcription-factor-mediated transcription activation. Phys Rev E Stat Nonlin Soft Matter Phys 83:041926.
42. Elf, J., G. W. Li, and X. S. Xie. 2007. Probing transcription factor dynamics at the single-molecule level in a living cell. Science 316:1191–1194.
43. Hammar, P., M. Wallden, D. Fange, F. Persson, O. Baltekin, G. Ullman, P. Leroy, and J. Elf. 2014. Direct measurement of transcription factor dissociation excludes a simple operator occupancy model for gene regulation. Nat Genet 46:405–408.
44. Lewin, B., J. E. Krebs, S. T. Kilpatrick, E. S. Goldstein, and B. Lewin. 2011. Lewin’s genes X. Jones and Bartlett, Sudbury, Mass.
45. Murugan, R. 2010. Theory of site-specific interactions of the combinatorial transcription factors with DNA. Journal of Physics A: Mathematical and Theoretical 43:195003.
46. Murugan, R. 2009. Directional dependent dynamics of protein molecules on DNA. Phys Rev E Stat Nonlin Soft Matter Phys 79:041913.
47. Zhang, Y., and D. M. Crothers. 2003. Statistical mechanics of sequence-dependent circular DNA and its application for DNA cyclization. Biophys J 84:136–153.
48. Spirin, A. S. 2009. How does a scanning ribosomal particle move along the 5’-untranslated region of eukaryotic mRNA? Brownian Ratchet model. Biochemistry 48:10688–10692.
49. Shvets, A. A., and A. B. Kolomeisky. 2016. The Role of DNA Looping in the Search for Specific Targets on DNA by Multisite Proteins. J Phys Chem Lett 7:5022–5027.
50. Schiessel, H., J. Widom, R. F. Bruinsma, and W. M. Gelbart. 2001. Polymer reptation and nucleosome repositioning. Phys Rev Lett 86:4414–4417.
51. Kulic, I. M., and H. Schiessel. 2003. Nucleosome repositioning via loop formation. Biophys J 84:3197–3211.
52. Lee, Y., A. Allison, D. Abbott, and H. E. Stanley. 2003. Minimal Brownian ratchet: an exactly solvable model. Phys Rev Lett 91:220601.
53. Gardiner, C. W. 1985. Handbook of stochastic methods for physics, chemistry, and the natural sciences. Springer-Verlag, Berlin; New York.
54. Risken, H. 1989. The Fokker-Planck equation: methods of solution and applications. Springer-Verlag, Berlin; New York.
55. Kampen, N. G. v. 1981. Stochastic processes in physics and chemistry. North-Holland; Sole distributors for the USA and Canada, Elsevier North-Holland, Amsterdam; New York; New York.
56. Abramowitz, M., and I. A. Stegun. 1965. Handbook of mathematical functions, with formulas, graphs, and mathematical tables. Dover Publications, New York.
57. Wingender, E., P. Dietze, H. Karas, and R. Knuppel. 1996. TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res 24:238–241.
58. Khan, A., O. Fornes, A. Stigliani, M. Gheorghe, J. A. Castro-Mondragon, R. van der Lee, A. Bessy, J. Cheneby, S. R. Kulkarni, G. Tan, D. Baranasic, D. J. Arenillas, A. Sandelin, K. Vandepoele, B. Lenhard, B. Ballester, W. W. Wasserman, F. Parcy, and A. Mathelier. 2018. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res 46:D1284.
59. Koudritsky, M., and E. Domany. 2008. Positional distribution of human transcription factor binding sites. Nucleic Acids Res 36:6795–6805.
60. Stormo, G. D. 2000. DNA binding sites: representation and discovery. Bioinformatics 16:16–23.
61. Kreiman, G. 2004. Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes. Nucleic Acids Research 32:2889–2900.
62. Mathelier, A., O. Fornes, D. J. Arenillas, C. Y. Chen, G. Denay, J. Lee, W. Shi, C. Shyr, G. Tan, R. Worsley-Hunt, A. W. Zhang, F. Parcy, B. Lenhard, A. Sandelin, and W. W. Wasserman. 2016. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res 44:D110–115.
63. Sandelin, A., W. Alkema, P. Engstrom, W. W. Wasserman, and B. Lenhard. 2004. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 32:D91–94.
64. Doi, M., and S. F. Edwards. 1988. The theory of polymer dynamics. Clarendon Press, Oxford [Oxfordshire].
65. Gennes, P.-G. d. 1979. Scaling concepts in polymer physics. Cornell University Press, Ithaca, N.Y.

Posted March 15, 2019.

Subject Area

Biophysics


[1] 1.↵
Alberts, B. 2002. Molecular biology of the cell. Garland Science, New York.

[2] 2.↵
Ptashne, M. 1986. A genetic switch: gene control and phage [lambda]. Cell Press; Blackwell Scientific Publications, Cambridge, Mass.; Palo Alto, Calif.

[3] 3.↵
Ptashne, M., and A. Gann. 2002. Genes & signals. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

[4] 4.↵
Berg, O. G., R. B. Winter, and P. H. von Hippel. 1981. Diffusion-driven mechanisms of protein translocation on nucleic acids. 1. Models and theory. Biochemistry 20:6929–6948.
OpenUrl CrossRef PubMed Web of Science

[5] 5.↵
Winter, R. B., O. G. Berg, and P. H. von Hippel. 1981. Diffusion-driven mechanisms of protein translocation on nucleic acids. 3. The Escherichia coli lac repressor--operator interaction: kinetic measurements and conclusions. Biochemistry 20:6961–6977.
OpenUrl CrossRef PubMed Web of Science

[6] 6.↵
Murugan, R. 2004. DNA-protein interactions under random jump conditions. Phys Rev E Stat Nonlin Soft Matter Phys 69:011911.
OpenUrl PubMed

[7] 7.↵
Murugan, R. 2007. Generalized theory of site-specific DNA-protein interactions. Phys Rev E Stat Nonlin Soft Matter Phys 76:011901.
OpenUrl PubMed

[8] 8.↵
Niranjani, G., and R. Murugan. 2016. Generalized theory on the mechanism of site-specific DNA-protein interactions. Journal of Statistical Mechanics: Theory and Experiment 2016:053501.
OpenUrl

[9] 9.↵
Koslover, E. F., M. A. Diaz de la Rosa, and A. J. Spakowitz. 2011. Theoretical and computational modeling of target-site search kinetics in vitro and in vivo. Biophys J 101:856–865.
OpenUrl CrossRef PubMed Web of Science

[10] 10.↵
Slutsky, M., and L. A. Mirny. 2004. Kinetics of protein-DNA interaction: facilitated target location in sequence-dependent potential. Biophys J 87:4021–4035.
OpenUrl CrossRef PubMed Web of Science

[11] 11.↵
Wunderlich, Z., and L. A. Mirny. 2008. Spatial effects on the speed and reliability of protein-DNA search. Nucleic Acids Res 36:3570–3578.
OpenUrl CrossRef PubMed Web of Science

[12] 12.↵
Shvets, A. A., and A. B. Kolomeisky. 2016. Crowding on DNA in Protein Search for Targets. J Phys Chem Lett 7:2502–2506.
OpenUrl

[13] 13.↵
Beshnova, D. A., A. G. Cherstvy, Y. Vainshtein, and V. B. Teif. 2014. Regulation of the nucleosome repeat length in vivo by the DNA sequence, protein concentrations and long-range interactions. PLoS Comput Biol 10:e1003698.
OpenUrl CrossRef PubMed

[14] 14.
Parmar, J. J., D. Das, and R. Padinhateeri. 2016. Theoretical estimates of exposure timescales of protein binding sites on DNA regulated by nucleosome kinetics. Nucleic Acids Res 44:1630–1641.
OpenUrl CrossRef PubMed

[15] 15.
Parmar, J. J., J. F. Marko, and R. Padinhateeri. 2014. Nucleosome positioning and kinetics near transcription-start-site barriers are controlled by interplay between active remodeling and DNA sequence. Nucleic Acids Res 42:128–136.
OpenUrl CrossRef PubMed Web of Science

[16] 16.
Teif, V. B., and K. Rippe. 2011. Nucleosome mediated crosstalk between transcription factors at eukaryotic enhancers. Phys Biol 8:044001.
OpenUrl CrossRef PubMed

[17] 17.↵
Shvets, A., M. Kochugaeva, and A. B. Kolomeisky. 2016. Role of Static and Dynamic Obstacles in the Protein Search for Targets on DNA. J Phys Chem B 120:5802–5809.
OpenUrl

[18] 18.↵
Niranjani, G., and R. Murugan. 2016. Theory on the mechanism of site-specific DNA-protein interactions in the presence of traps. Physical Biology 13:046003.
OpenUrl

[19] 19.↵
Lange, M., M. Kochugaeva, and A. B. Kolomeisky. 2015. Dynamics of the Protein Search for Targets on DNA in the Presence of Traps. J Phys Chem B 119:12410–12416.
OpenUrl

[20] 20.↵
Kalodimos, C. G., N. Biris, A. M. Bonvin, M. M. Levandoski, M. Guennuegues, R. Boelens, and R. Kaptein. 2004. Structure and flexibility adaptation in nonspecific and specific protein-DNA complexes. Science 305:386–389.
OpenUrl Abstract/FREE Full Text

[21] 21.↵
Murugan, R. 2010. Theory of site-specific DNA-protein interactions in the presence of conformational fluctuations of DNA binding domains. Biophys J 99:353–359.
OpenUrl CrossRef PubMed Web of Science

[22] 22.↵
Zhou, H. X. 2011. Rapid search for specific sites on DNA through conformational switch of nonspecifically bound proteins. Proc Natl Acad Sci U S A 108:8651–8656.
OpenUrl Abstract/FREE Full Text

[23] 23.↵
Ando, T., and J. Skolnick. 2014. Sliding of proteins non-specifically bound to DNA: Brownian dynamics studies with coarse-grained protein and DNA models. PLoS Comput Biol 10:e1003990.
OpenUrl CrossRef PubMed

[24] 24.↵
Murugan, R. 2009. Packaging effects on site-specific DNA-protein interactions. Phys Rev E Stat Nonlin Soft Matter Phys 79:061920.
OpenUrl PubMed

[25] 25.↵
Khazanov, N., A. Marcovitz, and Y. Levy. 2013. Asymmetric DNA-search dynamics by symmetric dimeric proteins. Biochemistry 52:5335–5344.
OpenUrl CrossRef PubMed Web of Science

[26] 26.
Marcovitz, A., and Y. Levy. 2013. Obstacles may facilitate and direct DNA search by proteins. Biophys J 104:2042–2050.
OpenUrl CrossRef PubMed Web of Science

[27] 27.
Marcovitz, A., and Y. Levy. 2013. Weak frustration regulates sliding and binding kinetics on rugged protein-DNA landscapes. J Phys Chem B 117:13005–13014.
OpenUrl

[28] 28.↵
Vuzman, D., Y. Hoffman, and Y. Levy. 2012. Modulating protein-DNA interactions by post-translational modifications at disordered regions. Pac Symp Biocomput:188–199.

[29] 29.↵
Murugan, R. 2018. Theory of Site-Specific DNA-Protein Interactions in the Presence of Nucleosome Roadblocks. Biophysical Journal 114:2516–2529.
OpenUrl

[30] 30.↵
Schleif, R. 1988. DNA looping. Science 240:127–128.
OpenUrl FREE Full Text

[31] 31.↵
Schleif, R. 1992. DNA looping. Annu Rev Biochem 61:199–223.
OpenUrl CrossRef PubMed Web of Science

[32] 32.↵
Rippe, K., M. Guthold, P. H. von Hippel, and C. Bustamante. 1997. Transcriptional activation via DNA-looping: visualization of intermediates in the activation pathway of E. coli RNA polymerase x sigma 54 holoenzyme by scanning force microscopy. J Mol Biol 270:125–138.
OpenUrl CrossRef PubMed Web of Science

[33] 33.↵
Mulligan, P. J., Y. J. Chen, R. Phillips, and A. J. Spakowitz. 2015. Interplay of Protein Binding Interactions, DNA Mechanics, and Entropy in DNA Looping Kinetics. Biophys J 109:618–629.
OpenUrl

[34] 34.↵
Lewin, R. A., D. M. Crothers, D. L. Correll, and B. E. Reimann. 1964. A Phage Infecting Saprospira Grandis. Can J Microbiol 10:75–85.
OpenUrl PubMed

[35] 35.↵
Murugan, R. 2010. Theory on the mechanism of distal action of transcription factors: looping of DNA versus tracking along DNA. Journal of Physics A: Mathematical and Theoretical 43:415002.
OpenUrl

[36] 36.↵
Grindley, N. D., K. L. Whiteson, and P. A. Rice. 2006. Mechanisms of site-specific recombination. Annu Rev Biochem 75:567–605.
OpenUrl CrossRef PubMed Web of Science

[37] 37.↵
Zhang, Y., A. E. McEwen, D. M. Crothers, and S. D. Levene. 2006. Statistical-mechanical theory of DNA looping. Biophys J 90:1903–1912.
OpenUrl CrossRef PubMed Web of Science

[38] 38.↵
Vilar, J. M., and L. Saiz. 2006. Multiprotein DNA looping. Phys Rev Lett 96:238103.
OpenUrl CrossRef PubMed

[39] 39.↵
Murugan, R. 2011. Theory on thermodynamic coupling of site-specific DNA-protein interactions with fluctuations in DNA-binding domains. Journal of Physics A: Mathematical and Theoretical 44:505002.
OpenUrl

[40] 40.↵
Mitarai, N., I. B. Dodd, M. T. Crooks, and K. Sneppen. 2008. The generation of promoter-mediated transcriptional noise in bacteria. PLoS Comput Biol 4:e1000109.
OpenUrl CrossRef PubMed

[41] 41.↵
Murugan, R. 2011. Theory on the dynamic memory in the transcription-factor-mediated transcription activation. Phys Rev E Stat Nonlin Soft Matter Phys 83:041926.
OpenUrl PubMed

[42] 42.↵
Elf, J., G. W. Li, and X. S. Xie. 2007. Probing transcription factor dynamics at the single-molecule level in a living cell. Science 316:1191–1194.
OpenUrl Abstract/FREE Full Text

[43] 43.↵
Hammar, P., M. Wallden, D. Fange, F. Persson, O. Baltekin, G. Ullman, P. Leroy, and J. Elf. 2014. Direct measurement of transcription factor dissociation excludes a simple operator occupancy model for gene regulation. Nat Genet 46:405–408.
OpenUrl CrossRef PubMed

[44] 44.↵
Lewin, B., J. E. Krebs, S. T. Kilpatrick, E. S. Goldstein, and B. Lewin. 2011. Lewin’s genes X. Jones and Bartlett, Sudbury, Mass.

45. Murugan, R. 2010. Theory of site-specific interactions of the combinatorial transcription factors with DNA. Journal of Physics A: Mathematical and Theoretical 43:195003.

46. Murugan, R. 2009. Directional dependent dynamics of protein molecules on DNA. Phys Rev E Stat Nonlin Soft Matter Phys 79:041913.

47. Zhang, Y., and D. M. Crothers. 2003. Statistical mechanics of sequence-dependent circular DNA and its application for DNA cyclization. Biophys J 84:136–153.

48. Spirin, A. S. 2009. How does a scanning ribosomal particle move along the 5’-untranslated region of eukaryotic mRNA? Brownian Ratchet model. Biochemistry 48:10688–10692.

49. Shvets, A. A., and A. B. Kolomeisky. 2016. The Role of DNA Looping in the Search for Specific Targets on DNA by Multisite Proteins. J Phys Chem Lett 7:5022–5027.

50. Schiessel, H., J. Widom, R. F. Bruinsma, and W. M. Gelbart. 2001. Polymer reptation and nucleosome repositioning. Phys Rev Lett 86:4414–4417.

51. Kulic, I. M., and H. Schiessel. 2003. Nucleosome repositioning via loop formation. Biophys J 84:3197–3211.

52. Lee, Y., A. Allison, D. Abbott, and H. E. Stanley. 2003. Minimal Brownian ratchet: an exactly solvable model. Phys Rev Lett 91:220601.

53. Gardiner, C. W. 1985. Handbook of stochastic methods for physics, chemistry, and the natural sciences. Springer-Verlag, Berlin; New York.

54. Risken, H. 1989. The Fokker-Planck equation: methods of solution and applications. Springer-Verlag, Berlin; New York.

55. van Kampen, N. G. 1981. Stochastic processes in physics and chemistry. North-Holland, Amsterdam; New York.

56. Abramowitz, M., and I. A. Stegun. 1965. Handbook of mathematical functions, with formulas, graphs, and mathematical tables. Dover Publications, New York.

57. Wingender, E., P. Dietze, H. Karas, and R. Knuppel. 1996. TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res 24:238–241.

58. Khan, A., O. Fornes, A. Stigliani, M. Gheorghe, J. A. Castro-Mondragon, R. van der Lee, A. Bessy, J. Cheneby, S. R. Kulkarni, G. Tan, D. Baranasic, D. J. Arenillas, A. Sandelin, K. Vandepoele, B. Lenhard, B. Ballester, W. W. Wasserman, F. Parcy, and A. Mathelier. 2018. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res 46:D1284.

59. Koudritsky, M., and E. Domany. 2008. Positional distribution of human transcription factor binding sites. Nucleic Acids Res 36:6795–6805.

60. Stormo, G. D. 2000. DNA binding sites: representation and discovery. Bioinformatics 16:16–23.

61. Kreiman, G. 2004. Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes. Nucleic Acids Res 32:2889–2900.

62. Mathelier, A., O. Fornes, D. J. Arenillas, C. Y. Chen, G. Denay, J. Lee, W. Shi, C. Shyr, G. Tan, R. Worsley-Hunt, A. W. Zhang, F. Parcy, B. Lenhard, A. Sandelin, and W. W. Wasserman. 2016. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res 44:D110–115.

63. Sandelin, A., W. Alkema, P. Engstrom, W. W. Wasserman, and B. Lenhard. 2004. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 32:D91–94.

64. Doi, M., and S. F. Edwards. 1988. The theory of polymer dynamics. Clarendon Press, Oxford.

65. de Gennes, P.-G. 1979. Scaling concepts in polymer physics. Cornell University Press, Ithaca, N.Y.
