Evaluation of origin of driving force for loop formation in a chromatin fiber

Chromosome condensation results from the formation of consecutive chromatin loops in which excluded volume interactions lead to chromosome stiffness. Formation of chromatin loops requires energy, but the source of such energy remains controversial. Here, we quantified the energy balance during chromatin loop formation by calculating the free energies of unlooped and looped chromatin using a lattice model of polymer chains. We tested two hypothetical energy sources: thermal fluctuation and ATP hydrolysis. We evaluated the free energy difference of the chain loop model without accounting for excluded volume interactions (phantom loop model), and then integrated those interactions by employing the mean-field theory (interacting loop model), where we introduced the parameter of excluded volume interaction within a single loop, $v_{\mathrm{ex}}$. Using our strategy, we confirmed that the loop-growth efficiency calculated by the phantom loop model is too high to explain the experimental data. Comparing loop-growth efficiencies for each energy source using the interacting loop model, we found that excluded volume interaction is essential for chromatin’s resistance to looping, regardless of the energy source. We predict that a quantitative measurement of $v_{\mathrm{ex}}$ will determine which energy source is more plausible.

Author summary
Before mitosis, the chromatin fibers of eukaryotic cells fold into consecutive loop structures and condense into rod-like chromosomes. Chromosome stiffness results from the excluded volume interaction between chromatin loops. The driving force of loop formation and growth is still controversial, despite the many efforts undertaken to clarify it. Two possible origins can be considered: the energy provided by thermal fluctuations or the energy gained from ATP hydrolysis. To discuss the validity of each, we constructed a theoretical model of chromatin loop formation that includes excluded volume interactions. Using this model, we calculated the free energy difference before and after chromatin loop formation, which corresponds to the energy that fuels chromatin looping. By comparing the results for each energy source, we conclude that the spatial distribution of chromatin loops should be relatively wide, given the large excluded volume interaction within a single loop, irrespective of which energy source is valid. Moreover, our results imply that intra-loop interactions are key to determining the driving force of chromatin loop formation.


Introduction
Prior to mitosis, the chromatin fibers of eukaryotic cells fold into consecutive loop structures and condense into rod-like chromosomes (see Fig. 1 (a)). The stiffness of chromosomes results from the excluded volume interaction between chromatin loops. One of the essential molecules for chromatin loop formation is a five-subunit protein complex named condensin, which belongs to the highly conserved family of SMC complexes.

Chromosome assembly is an attractive topic of study for both theoretical [1][2][3] and experimental [4][5][6] research. The mechanism and detailed dynamics of loop formation have been extensively studied, yet they remain incompletely understood.

One of the most promising hypotheses is that of loop extrusion, proposed by Alipour et al. [2]. In this hypothesis, condensin binds at two neighboring sites in the chromatin fiber and extrudes (pushes) it to form and enlarge a DNA loop. Loop extrusion has been theoretically modeled based on stochastic [7] and coarse-grained molecular dynamics simulations [1,3]. Moreover, the loop extrusion activity of condensin on bare DNA was observed experimentally by Ganji et al. [6].

The driving force of loop formation and growth is also an important topic and a matter of debate. In general, two possible driving forces are considered. One is the direct power stroke of condensin as a motor protein coupled with its ATP hydrolysis. Recent experiments revealed the ATP-dependent translocation and loop formation activity of condensins along DNA [4,6], and Terakawa et al. [4] proposed some motor activity models of condensins.

The other driving force candidate is thermal fluctuation. The model proposed by Marko et al. [8] outlines how thermal fluctuation can act as the driving force for loop formation and growth. According to this model, the DNA fiber and condensin undergo a cyclic reaction involving conformational changes in condensin and several DNA forms captured by condensin. During this reaction, condensin binds to DNA and additionally captures a loop that was stochastically formed nearby. Then, it releases the former DNA binding site and rebinds to one of the looped DNA sites as it did initially. The role of ATP hydrolysis is to prevent the reverse reaction and achieve a unidirectional movement along DNA. The net rate of this cyclic reaction, $k_{\mathrm{cycle}}$, depends on physical quantities, including the rate of ATP hydrolysis by condensin.

Although the model proposed by Marko et al. [8] describes the translocation of the bacterial SMC complex along DNA, the basic idea is applicable to chromatin loop formation by other condensins. Hereafter, we refer to these driving force candidates as the motor pulling scenario and the thermal driving scenario.

For each scenario, the different energy sources provide different orders of energy to the chromatin fiber for loop formation. As shown in Fig. 1

rod-like segments with the length $b$ on a face-centered cubic lattice [9]. Each segment's direction corresponds to one of the 12 unit vectors of the face-centered cubic lattice ; and (b)).

Here, we present the microscopic bending energy for the chain model and we

Transfer matrix method
To calculate the partition function of the chain model with the bending energy, we used the transfer matrix method. We defined

and we introduced the transfer matrix [9] as follows

where $\eta$ and $\xi$ ($\eta, \xi = 1, 2, \cdots, 12$) are the indices of orientation of two consecutive segments. For example, $T_{11} = 1$ represents the statistical weight of the parallel state, where both the $n$-th and $(n+1)$-th segments point in the direction parallel to $e^{(1)}$. Using eqn. (2), we calculated the partition function of the chain composed of $N+1$ segments in the free space as follows

The prefactor 12 means that the initial segment can point along any of the 12 vectors $e^{(1)}, e^{(2)}, \cdots, e^{(12)}$. The factor $1 + 4\delta$ derives from the fact that two

free energy significantly. Here, we included these two modifications in the calculations of polymer chain free energy described above. First, we introduced the loop constraint on the lattice polymer model and developed a calculation method for the free energy difference before and after loop formation, which is simply called the free energy difference. Then, we integrated the effect of the excluded volume interaction using the mean-field theory.
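The transfer matrix construction described above can be sketched numerically. The block below is a minimal illustration, not the authors' implementation. It assumes that the weight between two consecutive segments is 1 when they are parallel, $\delta$ when they are bent by 60 degrees (four such directions exist on the face-centered cubic lattice), and 0 otherwise. This choice is inferred only from the statements $T_{11} = 1$ and the factor $1 + 4\delta$. The function names `fcc_directions`, `transfer_matrix`, and `z_free` are hypothetical.

```python
import itertools
import numpy as np


def fcc_directions():
    """Return the 12 nearest-neighbour unit vectors of the FCC lattice."""
    vecs = []
    for i, j in itertools.combinations(range(3), 2):
        for si, sj in itertools.product((1, -1), repeat=2):
            v = [0, 0, 0]
            v[i], v[j] = si, sj
            vecs.append(v)
    return np.array(vecs, dtype=float) / np.sqrt(2.0)


def transfer_matrix(delta):
    """Build a 12x12 transfer matrix.

    ASSUMPTION: weight 1 for parallel segments, delta for a 60-degree bend,
    0 for every sharper bend. The source does not spell these weights out.
    """
    e = fcc_directions()
    cos = e @ e.T                      # cosine of the angle between directions
    T = np.zeros((12, 12))
    T[np.isclose(cos, 1.0)] = 1.0      # parallel state
    T[np.isclose(cos, 0.5)] = delta    # four 60-degree neighbours per direction
    return T


def z_free(delta, n_segments):
    """Partition function of a free chain made of n_segments segments."""
    T = transfer_matrix(delta)
    ones = np.ones(12)
    # Sum over the orientation of the first and last segments.
    return ones @ np.linalg.matrix_power(T, n_segments - 1) @ ones


delta = 1.0 / 8.0
N = 30
# Under the assumed weights the two numbers agree: Z_free = 12 (1 + 4 delta)^N.
print(z_free(delta, N + 1), 12 * (1 + 4 * delta) ** N)
```

Under these assumed weights every row of the matrix sums to $1 + 4\delta$, so the matrix product reduces to the closed form with the prefactor 12 mentioned in the text.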

Loop constraint
We calculated the partition function of a looped polymer chain whose end-to-end distance is equal to zero. Since the conformational free energy calculated by the transfer matrix refers only to segment orientations, we had to integrate over the degrees of freedom of each segment's position in order to evaluate the end-to-end distance. As the end-to-end vector of the chain is obtained as the sum of the $N+1$ segment vectors, we calculated the statistical weight

where the 0-th segment at the position $r^{(0)}$ and the $N$-th segment at the position $r^{(N)}$ point along $e^{(\eta)}$ and $e^{(\xi)}$, respectively. By using the Fourier expression of the $\delta$ function, eqn. (4) is rewritten as

where $T$

The loop structure of the chain is obtained from the chain conformation under a constrained condition where each end of the chain has the same position. Therefore, the partition function of the loop structure of a chain composed of $N+1$ segments, $Z_{\mathrm{loop}}(T, N+1)$, is calculated as

The free energy difference of a phantom loop, $\Delta F_0$, is calculated as

The free energy difference of multiple phantom loops is also calculated as

where each loop is assumed to be composed of $N+1$ segments and $\alpha_{\mathrm{all}}$ is the number of loops in the system.
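For short chains, the loop constraint can be checked by brute force. The text handles the constraint through the Fourier expression of the $\delta$ function; the sketch below swaps in a different technique, a real-space dynamic programme that tracks the end position of the chain exactly, which should give the same loop partition function for small $N$. It relies on two assumptions that are not confirmed by the source: the bending weights (1 for parallel, $\delta$ for a 60-degree bend, 0 otherwise) and the form $\Delta F_0 = -k_B T \ln (Z_{\mathrm{loop}} / Z_{\mathrm{free}})$. All function names are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def fcc_dirs():
    """12 FCC nearest-neighbour directions in integer form (length sqrt(2))."""
    dirs = []
    for i, j in itertools.combinations(range(3), 2):
        for si, sj in itertools.product((1, -1), repeat=2):
            v = [0, 0, 0]
            v[i], v[j] = si, sj
            dirs.append(tuple(v))
    return dirs


def weight(a, b, delta):
    """ASSUMED bending weight: 1 if parallel, delta for a 60-degree bend, else 0."""
    dot = sum(x * y for x, y in zip(a, b))  # 2 = parallel, 1 = 60-degree bend
    if dot == 2:
        return 1.0
    return delta if dot == 1 else 0.0


def partition_functions(delta, n_segments):
    """Return (Z_free, Z_loop) for a chain of n_segments segments."""
    dirs = fcc_dirs()
    # state[(end position, last direction)] = accumulated statistical weight
    state = {(d, d): 1.0 for d in dirs}
    for _ in range(n_segments - 1):
        nxt = defaultdict(float)
        for (pos, last), w in state.items():
            for d in dirs:
                wd = weight(last, d, delta)
                if wd:
                    new_pos = tuple(p + q for p, q in zip(pos, d))
                    nxt[(new_pos, d)] += w * wd
        state = nxt
    z_free = sum(state.values())
    # Loop constraint: the end-to-end vector must vanish.
    z_loop = sum(w for (pos, _), w in state.items() if pos == (0, 0, 0))
    return z_free, z_loop


def delta_f0(delta, n_segments):
    """Phantom-loop free energy difference in units of k_B T (ASSUMED form)."""
    z_free, z_loop = partition_functions(delta, n_segments)
    return -math.log(z_loop / z_free)


print(delta_f0(1.0 / 8.0, 6))
```

The enumeration grows quickly with chain length, so it is useful only as a consistency check against a Fourier-space calculation for small $N$. Under the assumed weights the shortest closed loop is a six-segment hexagon, so `delta_f0` is defined only from six segments upward.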

Excluded volume interaction
We integrated the excluded volume interaction into the free energy of the polymer model. We adopted the mean-field theory, in which the statistical properties of multiple interacting loops were determined using approximate calculations of a single loop under a potential field. In general, to introduce the excluded volume interaction between segments using the mean-field theory, we start with a Hamiltonian

where the first term describes the Hamiltonian of a single polymer without excluded volume interaction (an ideal system) and the other term refers to the interaction between segments. Moreover, $\hat{\cdots}$ means that the quantity $\cdots$ is a function of phase

(10) is rewritten as

where $\phi(r)$ is the segment density field defined as

We assume that the spatial correlation in the interaction term of Hamiltonian (11) is negligible,

June 18, 2020 5/19

where $v$ is the excluded volume parameter and the prefactor 1/2 avoids double-counting of the interaction. Then, Hamiltonian (11) becomes

The partition function of this Hamiltonian is

where $\beta v$ is regarded as the perturbation parameter, $\langle \cdots \rangle_{\mathrm{ideal}}$ means the ensemble average value of $\cdots$ over the ensembles of the ideal system, and $Z_{\mathrm{ideal}}$ is the partition function of the ideal system. Free energy is derived from eqn. (15) as follows:

Here, for simplicity, we assumed that the segment density is spatially uniform:

Here, $R_g$ is the gyration radius of the polymer chain, defined as

where $r_{\mathrm{cm}}$ is the center of mass of the chain:

Note that in the mean-field theory, the gyration radius in (17) is averaged over the ensemble of the ideal system.

By substituting $Z_{\mathrm{loop}}$ and $Z_{\mathrm{free}}$ in turn for $Z_{\mathrm{ideal}}$ in eqn. (16), we obtained the free energies of the interacting loop and of the chain model in the free space, respectively. Then, we calculated the free energy difference $\Delta F$ as follows.

where $v_{\mathrm{ex}}$, $\alpha$, $R_g$, $v$, $V$, and $R_g^{(\mathrm{free})}$ are the excluded volume parameter of a single loop, the number of loops interacting with each other, the gyration radius of the phantom loop, the excluded volume parameter of the model chain in the free space, the volume of the system, and the gyration radius of the model chain in the free space, respectively. It should be noted that the integration interval of the interaction term in eqn. (16) gives us the volume $\alpha \times R_g^3$. For simplicity, we have set

Thus, we obtained the free energy difference of the multi-loop system,

In the discussion, for the sake of simplicity, $v$

Here, the chromatin/DNA fiber is characterized by the chain stiffness, and the stiffness is described by the persistence length $l_p$ as follows

This expression is obtained by fitting the segment orientation correlation along the chain. The details of the fitting are shown in Materials and methods. We selected a persistence length of chromatin of 30 nm [10]. Following [3], we set the segment size to $b = 10$ nm, which specifies the minimum spatial scale of the system. Additionally, bare DNA can also be described by our model if we select $l_p = 50$ nm, which is the value for DNA [11].

We calculated the free energy difference of a single phantom loop, which corresponds to a chromatin loop. As shown by the purple line in Fig. 3, $\Delta F_0$ is an increasing function of $N$ and its slope decreases with $N$. In fact, the free energy difference between the two chromatin states is well fitted by

where the non-linear least squares method is used for fitting. We estimated the efficiencies of the two driving force scenarios for loop growth under the phantom loop model. One of the scenarios is that of thermal driving, where the source of the driving force is thermal fluctuation. Under thermal fluctuation, a physical system typically gains, on average, an energy of $1\,k_B T$ from thermal noise.
Here, we supposed that a chromatin loop with contour length $N$ is held by condensin and that its length is forced to increase owing to thermal fluctuation. The typical increase in length, $x^{(\mathrm{therm})}_{\mathrm{ph}}(N)$, is calculated based on the thermal energy gained, $1\,k_B T$ (see the two purple dashed lines in Fig. 3).

Note that the contribution of the constant to $\Delta F_0$ disappears in eqn. (25). Using eqn. (23), we obtained

which is the loop-growth efficiency of the thermal driving scenario. In the case of $N = 30$, the typical increase in length is $x^{(\mathrm{therm})}_{\mathrm{ph}}(30) = 30$.

In the motor pulling scenario, loop growth is coupled with ATP hydrolysis, which releases about $12.2\,k_B T$. Experiments in the chicken cell line DT40 showed that the typical contour length of loops in mature chromosomes is about 24 µm and that the duration of loop formation (from the beginning of prophase until the end of prometaphase) is about 60 min (= 3600 s) [5]. The ATP hydrolysis rate of condensin in a Xenopus cell is $k_{\mathrm{ATP}} = 0.90$ [12]. Thus, the number of ATP molecules used during loop formation is

Considering the estimations above, one can easily notice that the motor pulling scenario is not suitable for describing the real loops that form in chromosomes.

Based on such estimations and the mitotic conditions known in DT40 cells, we examined the validity of the loop-growth efficiency of the thermal driving scenario. According to this scenario, ATP hydrolysis should occur within one reaction cycle. Therefore, the typical time scale of one cycle can be approximated as

where we used the ATP hydrolysis rate of condensin measured in a Xenopus cell, i.e., $k_{\mathrm{ATP}} = 0.90$ [12]. Thus, the number of cycles during loop formation is given by

Fig. 3), the considerations made above also apply to DNA.
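The order-of-magnitude estimates in this subsection can be reproduced with a few lines of arithmetic. The sketch below uses only the values quoted in the text. It assumes that $k_{\mathrm{ATP}} = 0.90$ is expressed per second (the text gives no unit) and that one ATP molecule is hydrolysed per reaction cycle, so that the number of cycles equals the number of ATP molecules. The function name `loop_growth_budget` and its dictionary keys are hypothetical.

```python
# Values quoted in the text
K_ATP = 0.90             # ATP hydrolysis rate of condensin; unit ASSUMED to be 1/s
T_FORMATION_S = 3600.0   # duration of loop formation (60 min)
LOOP_LENGTH_NM = 24e3    # contour length of a mature loop (24 micrometres)
SEGMENT_NM = 10.0        # segment size b
ATP_ENERGY_KT = 12.2     # energy released per ATP hydrolysis, in k_B T


def loop_growth_budget(k_atp=K_ATP, t_s=T_FORMATION_S,
                       loop_nm=LOOP_LENGTH_NM, b_nm=SEGMENT_NM,
                       atp_kt=ATP_ENERGY_KT):
    """Estimate ATP count, loop size in segments, and growth per cycle.

    ASSUMPTION: one ATP is hydrolysed per reaction cycle, so the number of
    cycles during loop formation equals the number of ATP molecules used.
    """
    n_atp = k_atp * t_s                   # ATP molecules (or cycles) per loop
    n_segments = loop_nm / b_nm           # loop contour length in segments
    growth_nm = loop_nm / n_atp           # required growth per cycle, in nm
    return {
        "n_atp": n_atp,
        "n_segments": n_segments,
        "growth_per_cycle_nm": growth_nm,
        "growth_per_cycle_segments": growth_nm / b_nm,
        "total_atp_energy_kT": n_atp * atp_kt,
    }


for name, value in loop_growth_budget().items():
    print(f"{name}: {value:.4g}")
```

With these inputs the required growth comes out at roughly 7.4 nm per cycle, close to the 7.47 nm per cycle quoted later for the thermal driving scenario. The small difference may come from rounding of intermediate values in the original estimate.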

Interacting loop structure
We next examined the free energy difference of the multi-loop system including the excluded volume interaction by employing the mean-field theory (interacting loop model). In this model, the free energy difference $\Delta F$ depends not only on $N$ and $l_p$ but also on the parameters $\alpha$ and $v_{\mathrm{ex}}$, in contrast with the phantom loop model.

To obtain the essential behavior of the free energy difference, we first calculated the free energy difference for fixed $\alpha = 20$ and $\beta v_{\mathrm{ex}} = 5.00 \times 10^{-2}$. The value of $\alpha$ is estimated from the number of loops overlapping each other according to the chromosome conformation reported by Gibcus et al. [5] (see Materials and methods for more details). As shown in Fig. 4 (purple line), the free energy difference of the interacting chromatin loop increases linearly with the number of segments in the large-$N$ range ($N > 30$). We determined the linear coefficient as follows

In the case of $\alpha = 20$ and $\beta v_{\mathrm{ex}} = 5.00 \times 10^{-2}$, $\lambda(\alpha, \beta v_{\mathrm{ex}}) \simeq 15$.

By using the free energy difference of the interacting loop model, we evaluated the validity of the thermal driving and the motor pulling scenarios. As the free energy difference behaves as a linear function of $N$, the increase in segments of the interacting chromatin loop driven by thermal fluctuation, $x^{(\mathrm{therm})}_{\mathrm{int}}$, was estimated from the slope of the free energy difference plot. Here, $x^{(\mathrm{therm})}_{\mathrm{int}}$ is defined as

The typical increase in length depends on the excluded volume parameter $\beta v_{\mathrm{ex}}$, as shown in Fig. 5.

Based on these results (Figs. 5 and 6), we evaluated the excluded volume parameters $\beta v_{\mathrm{ex}}$ for both scenarios. For the thermal driving scenario, as shown in eqn. (31), the loop is estimated to grow 7.47 nm (corresponding to $0.747 \times b$) during each cycle. By applying this estimate to the interacting loop model (Fig. 5), the excluded volume parameter is expected to be higher than one:

$\beta v_{\mathrm{ex}} > 1$. (36)

According to the motor pulling scenario, the number of segments in the mature chromatin loop is 2,400 ($24 \times 10^3$ nm). Therefore, the excluded volume parameter (Fig. 6) is expected to be much higher than one

Note that for both scenarios, the excluded volume parameters are estimated to exceed one. This signals a breakdown of the perturbation theory, because $\beta v_{\mathrm{ex}}$ is its expansion parameter. This possibility is addressed further in the discussion.

In our study, we employed the lattice polymer model to describe a chromatin fiber and determine the free energy difference between its unlooped and looped states. Moreover, the effect of the excluded volume was included using the mean-field theory. Using this strategy, we evaluated the validity of two possible scenarios for the driving force of loop formation: the motor pulling scenario and the thermal driving scenario. First, we confirmed that the excluded volume interaction is an essential resistance effect during

This result implies a spatially wide distribution of segments in the chromatin loop due to the repulsive forces associated with loop growth.

Although our perturbation theory of the excluded volume interaction takes two-body correlations into account, multi-body correlations (higher than two-body) could also be incorporated and may have an essential effect on this interaction. Supposing that most interactions between segments are repulsive, including multi-body correlations would result in a wider spatial distribution of segments and increase the resistance against loop growth. With multi-body correlations incorporated, the loop-growth efficiencies $x^{(\mathrm{therm})}_{\mathrm{int}}(\beta v_{\mathrm{ex}})$ and $x^{(\mathrm{ATP})}_{\mathrm{int}}(\beta v_{\mathrm{ex}})$ would become more accurate and lower than those determined here (Figs. 5 and 6). The difference between the calculated and the accurate loop-growth efficiencies should be slight in the region $\beta v_{\mathrm{ex}} \ll 1.0$ and considerably large in the region $\beta v_{\mathrm{ex}} \geq 1.0$. Since $\beta v_{\mathrm{ex}}$ is the expansion parameter, the large difference in the latter region indicates the breakdown of the perturbation theory. In other words, using the perturbation theory to estimate $\beta v_{\mathrm{ex}}$ yields values at which the perturbation expansion itself breaks down. However, the previous remarks about the effect of multi-body correlations lead us to another valid conjecture. If, in the thermal driving scenario, the accurate loop-growth efficiency is similar to the value determined in the region $\beta v_{\mathrm{ex}} < 1.0$ and lower than that determined in the region $\beta v_{\mathrm{ex}} \geq 1.0$, the value of the excluded volume interaction parameter satisfying $x$

Regarding the motor pulling scenario, we implicitly assumed that a single condensin complex is present on the chromatin fiber during looping. However, more than one condensin can be involved in loop formation [3,14]. In fact, the numbers of condensin I and condensin II can reach the hundreds and the tens of thousands, respectively [14]. In this case, the motor activities of the condensins may cooperate or compete with each other. According to Fig. 6

describe these initial stages, should be the focus of future work.

Materials and methods
Path integral of model chain
We derived the path integral expression eqn. (5) as follows:

where we took into account that the $i$-th segment vector $b_i$ can be any of the 12 basic lattice vectors. Therefore,

where $T$

Here we show in detail how we derived eqn. (20). The expression (10) was rewritten as

Fig 8. Behavior of $G_{\mathrm{ori}}(j)$ in the case of $\delta^{-1} = 8.00$ (corresponding to $l_p/b = 5.00$). The purple symbol and the green line represent the same as in Fig. 7.

These numerical results are fitted by

By using eqns. (41) and (42), the persistence length $l_p$ was described as

When the value of $b$ is not strictly determined, the constant contribution to $l_p$ is renormalized into the segment size $b$ (for example, in the case where the value of $b$ simply specifies the order of the size of the monomer in the polymer [9]). Therefore, in such a case, the persistence length can be defined as $l_p = b/\delta$. However, in this study

The square of the gyration radius $R_g^2$ was defined as

where $r_g$ is the center of mass:

Equation (44)

Fig 9. Chromosome structure reported by Gibcus et al. [5], based on Hi-C data and simulation studies. The red line represents a chromosome with a helical conformation including a nested loop structure. The blue and purple circles depict condensin I and II, respectively.

The number of loops interacting with each other, $\alpha$, was estimated based on the chromatin conformation shown in Fig. 9. We denoted the distance between the centers of mass of the nearest neighbor loops by $\Delta$. As the gyration radius of the loop is $R_g$,

We calculated the number of loops included in the spatial range $\Delta$. During prometaphase, the contour length of each loop is 80 kbp. Additionally, 1 bp = 0.3 nm, which leads to

As 150 loops exist in one pitch of the helical conformation shown in Fig. 9,

The number of interacting loops $\alpha$ is calculated as

Therefore, we fixed $\alpha = 20$.