## Abstract

The condensation of several mega base pair human chromosomes in a small cell volume is a spectacular phenomenon in biology. This process, involving the formation of loops in chromosomes, is facilitated by ATP-consuming motors (condensin and cohesin) that interact with chromatin segments, thereby actively extruding loops. Motivated by real time videos of loop extrusion (LE), we created an analytically solvable model, which yields the LE velocity as a function of the external load acting on condensin. The theory fits the experimental data quantitatively, and suggests that condensin must undergo a large conformational change, triggered by ATP binding and hydrolysis, that brings distant parts of the motor into proximity. Simulations using a simple model confirm that a transition between open and closed states is necessary for LE. Changes in the orientation of the motor domain are transmitted over ~ 50 nm, connecting the motor head and the hinge, thus providing a plausible mechanism for LE. The theory and simulations are applicable to loop extrusion in other structural maintenance complexes.

How chromosomes are structurally organized in the tight cellular space is a long-standing problem in biology. Remarkably, these information-carrying polymers, which in humans contain more than 100 million base pairs depending on the chromosome number, are densely packed (apparently with negligible knots) in the 5 – 10 *μ*m cell nucleus [1, 2]. In order to accomplish this herculean feat, nature has evolved a family of SMC (Structural Maintenance of Chromosomes) complexes [3, 4] (bacterial SMC, cohesin, and condensin) to facilitate large scale compaction of chromosomes in all living systems. Compaction is thought to occur by active generation of a large array of loops, which are envisioned to form by extrusion of the genomic material [5, 6] driven by ATP-consuming motors. The SMC complexes have been identified as a major component of the loop extrusion (LE) process [3, 4].

Of interest here is condensin, which has motor activity as it translocates on DNA [7], resulting in active extrusion of loops in an ATP-dependent manner [8]. We first provide a brief description of the architecture of condensin (drawn schematically in Fig.1) because the theory is based on this picture. Condensin is a ring shaped dimeric motor to which a pair of SMC proteins (Smc2 and Smc4) are attached. Smc2 and Smc4, which have coiled coil (CC) structures, are connected at the hinge domain. The ATP binding domains are in the motor heads [4, 9]. The CCs have kinks roughly in the middle [9]. The relative flexibility in the elbow region (located near the kinks) could be the key to the conformational transitions in the CC that are powered by ATP binding and hydrolysis [4, 10]. At present, there is no direct experimental evidence that this is so.

Previous studies using simulations [6, 11, 12], which build on the pioneering insights by Nasmyth [5], suggested that multiple condensins concertedly translocate along the chromosome, extruding loops of increasing length. In this mechanism two condensin heads move away from each other, extruding loops in a symmetric manner. Cooperative action of many condensins [13, 14] might be necessary to account for the ~ (1,000 – 10,000) fold compaction of human chromosomes [15]. In the only available theoretical study thus far [16], a plausible catalytic cycle for condensin is coupled to loop extrusion. The present theory may be viewed as complementary to the earlier study but differs not only in details but also in the envisioned LE mechanism.

We were inspired by a real time video of LE in *λ*-DNA by a single condensin [8], which functions by extruding loops through one head while the other head is likely fixed. To describe the experimental outcomes quantitatively, we created an analytically solvable model that produces excellent agreement with experiments for the LE velocity as a function of external load. The theory suggests that in order for LE to occur there has to be an ATP-powered allosteric transition in condensin involving a large conformational change that brings distant parts of the motor into proximity. Simulations using a simple model confirmed this finding, and further strongly suggest that the conformational transitions are driven by a scrunching mechanism, discovered in the context of transcription initiation by RNA polymerase resulting in bubble formation in promoter DNA [17], and further illustrated in molecular simulations [18].

In order to develop a model applicable to condensin (and cohesin), we assume that condensin is attached to two loci (**A** and **B**) on the DNA. To retain the generality of the theory, we do not explicitly describe the nature of the attachment points at this juncture. However, the picture in Fig.1 could be mapped onto a couple of working models proposed in the literature. In the scrunching model [19] the blue (red) sphere would be the motor heads (hinge). In the so-called pumping model [16], location **A** might correspond to the two heads of the SMC complex, which trap one end of the DNA. The red sphere in Fig.1 would be localized on the coiled coil to which the genome is transiently attached. In state 1, the spatial distance between the condensin attachment points is *R*_{1} (Fig.1), and the genomic length between **A** and **B** is *l*_{1}. Due to the polymeric nature of the DNA, the captured length *l*_{1} can exceed *R*_{1}. However, *R*_{1} cannot be greater than the overall dimension of the SMC motor, which is on the order of ~ 50 nm. Once a loop in the DNA is captured, condensin undergoes a conformational change triggered by ATP binding and/or ATP hydrolysis, shrinking the distance from *R*_{1} to *R*_{2} (where *R*_{2} < *R*_{1}). As a result, the captured genomic length between **A** and **B** reduces to *l*_{2} (state 2). Consequently, the loop grows by *l*_{1} − *l*_{2}. We define the step size of condensin as Δ*R* = *R*_{1} − *R*_{2}, and the extrusion length per step as Δ*l* = *l*_{1} − *l*_{2}.

In order to derive the velocity of loop extrusion, we first estimate the loop length of DNA, *l*, captured by condensin when the attachment points are spatially separated by *R*. We show that on the length scale of the size of condensin (~ 50 nm), it is reasonable to approximate *l* ≈ *R*. To calculate the LE velocity it is necessary to estimate the total work done to bend the DNA as well as account for the work associated with an applied external load [8]. Based on these considerations, we derive an expression for the LE velocity, given by *k*_{0} exp(−*f*Δ*R*/*k*_{B}*T*)Δ*R*, where *k*_{0} is the rate of the mechanical step at zero load, *f* is the external load, *k*_{B} is the Boltzmann constant, and *T* is the temperature.

We examined the possibility that the loop extrusion length per step can be considerably larger than the size of condensin [7–9, 20] by calculating *P*(*l*|*R*), the conditional probability of realizing the contour length *l* for a given end-to-end distance *R*. An exact expression for the radial probability of the end-to-end distance at a fixed contour length, *P*(*R*|*l*), has been derived for a semi-flexible polymer [21], but it is complicated. We have calculated *P*(*R*|*l*) using a mean-field theory that gives an excellent approximation [22, 23] to the exact expression, which suffices for our purposes here. The expression for *P*(*l*|*R*), which has the same form as *P*(*R*|*l*) up to a normalization constant, is given by Eq.(1) (Supplementary Information). In Eq.(1), *l*_{p} is the persistence length of the polymer and *C* is a normalization constant that does not depend on *l*. The distribution *P*(*l*|*R*) has a heavy tail at large *l* and does not have a well-defined mean (see Fig. 2a for plots of *P*(*l*|*R*) for different *R*). Therefore, we evaluated the location of the peak in *P*(*l*|*R*), and solved the resulting equation numerically. The dependence of the peak position on *R*, which is almost linear (Fig.2b), is well fit with a fitting parameter *a* = 0.003 nm^{−1}. Therefore, with negligible corrections, we used the approximation *l* ≈ *R* on the length scales corresponding to the size of condensin or the DNA persistence length. Note that, for a given *l*_{p} (= 50 nm in Fig.2), the probability that *l* is much larger than *R* is small. Indeed, the location of the largest probability is at *l* ≈ *R*, which is similar to what was found for proteins as well [24]. Furthermore, the presence of an external load would stretch the DNA, further justifying the assumption *l* ≈ *R*. Thus, LE of a DNA loop that is much larger than the size of condensin is unlikely, at least as the principal mechanism. This suggests that the step size of condensin is nearly equal to the extrusion length of DNA, Δ*R* ≈ Δ*l*.
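The peak-location argument above can be illustrated numerically. The sketch below is not the authors' Eq.(1), which is not reproduced in this text; it assumes the standard mean-field end-to-end distribution for a semi-flexible chain from the polymer literature, with an assumed reduced length t = 3*l*/(2*l*_{p}) and α = 3t/4. The function names, the grid search, and the parameter defaults are all illustrative choices.

```python
import numpy as np

def log_p_contour_given_distance(l, R, lp=50.0):
    """Unnormalized log P(l|R) for an ASSUMED mean-field semi-flexible chain.

    Assumed form: P ~ N(alpha) * r^2 / (l * (1 - r^2)^(9/2)) * exp(-alpha / (1 - r^2)),
    with r = R/l, alpha = 3t/4 and t = 3l/(2 lp) (assumed convention).
    """
    r2 = (R / l) ** 2
    alpha = 0.75 * (1.5 * l / lp)
    # log of the l-dependent normalization N(alpha) of the mean-field distribution
    log_norm = 1.5 * np.log(alpha) + alpha - np.log(4.0 + 12.0 / alpha + 15.0 / alpha**2)
    return (log_norm + np.log(r2) - np.log(l)
            - 4.5 * np.log1p(-r2) - alpha / (1.0 - r2))

def peak_contour_length(R, lp=50.0, l_max=500.0, step=0.05):
    """Locate the most probable contour length l (nm) on a grid with l > R."""
    l = np.arange(R + step, l_max, step)
    return l[np.argmax(log_p_contour_given_distance(l, R, lp))]

if __name__ == "__main__":
    for R in (20.0, 35.0, 50.0):
        print(f"R = {R:4.1f} nm -> most probable l = {peak_contour_length(R):.1f} nm")
```

Under these assumptions the most probable captured length stays within a few tens of percent of *R* for *R* up to the size of condensin, which is the sense in which *l* ≈ *R* is used in the text.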

Just like other motors, condensin hydrolyzes ATP, generating *μ* ≈ 20 *k*_{B}*T* of chemical energy that is converted into mechanical work, which in this case results in extrusion of a DNA loop [8]. To arrive at an expression for the LE velocity, we calculated the thermodynamic work required for LE. The required work *W* modulates the rate of the mechanical process by the exponential factor exp(−*W*/*k*_{B}*T*). In our model, *W* has two contributions. The first is the work needed (*W*_{bend}) to bend the DNA. Condensin bends the DNA by decreasing the spatial distance between the attachment points from *R*_{1} to *R*_{2} (Fig.1). The associated genomic length of DNA in this process is *L*_{Σ} = *L*_{0} + *l*_{1} (Fig.1). Note that *L*_{0} could be large or small. The second contribution is *W*_{step}, which comes from the application of an external load (*f*). Condensin resists *f* up to a threshold value [8]. The mechanical work done during the step of size Δ*R* = *R*_{1} − *R*_{2} is *W*_{step} = *f*Δ*R*.

We calculated *W*_{bend} as the free energy change for bringing a semi-flexible polymer with contour length *L*_{Σ} from the end-to-end distance *R*_{1} to *R*_{2}. It can be estimated using the relation *W*_{bend} ≈ −*k*_{B}*T* log(*P*(*R*_{2}|*L*_{Σ})) + *k*_{B}*T* log(*P*(*R*_{1}|*L*_{Σ})), where *P*(*R*|*L*) is given by Eq.(1) without the factor *C*. Although there is, in principle, a distribution of *W*_{bend} values, for illustrative purposes we plot *W*_{bend} for a fixed *R*_{1} = 50 nm at different values of *R*_{2} in Fig.3. It is evident that condensin has to overcome the highest bending penalty in the first step of extrusion, and subsequently *W*_{bend} is flat. If *R*_{1} = 50 nm, which is approximately the size of condensin, we estimate that condensin pays 3 *k*_{B}*T* to initiate the extrusion process (blue line in Fig.3).

Once the energetic costs for LE are known, we can calculate the LE velocity as a function of an external load applied to condensin. From energy conservation, we obtain the equality *nμ* = *W*_{bend} + *W*_{step}(*f*) + *Q*, where *n* is the number of ATP molecules consumed per mechanical step, *μ* is the energy released by ATP hydrolysis, and *Q* is the heat dissipated during the extrusion process. The maximum force is obtained when the equality *nμ* = *W*_{bend} + *W*_{step}(*f*_{max}) holds. If we denote the rate of the mechanical transition as *k*^{+} and the reverse rate as *k*^{−}, the fluctuation theorem [25–27] with conservation of energy gives the following relation:

$$\frac{k^{+}}{k^{-}} = \exp\left(\frac{n\mu - W_{bend} - W_{step}(f)}{k_{B}T}\right) \qquad (2)$$

Once the details of the catalytic cycle of the SMC motor are identified, it is possible to extend Eq.(2) to multiple intermediate steps to include ATP dependence (see [27] for theoretical descriptions in the context of molecular machines). We can extract the load-dependent term in the expression, which is written as,

$$k^{+} = k_{0} \exp\left(-\frac{f \Delta R}{k_{B}T}\right) \qquad (3)$$

where *k*_{0} is the rate of the mechanical transition at 0 load. Thus, with the assumption that Δ*R* is the extruded length per reaction cycle, the velocity of LE, Ω, may be written as,

$$\Omega = k_{0} \exp\left(-\frac{f \Delta R}{k_{B}T}\right) \Delta R \qquad (4)$$

In the experimental setup of Ganji et al. [8], *f* is related to the relative DNA extension, *χ* = *R/L*, through the expression given in [28, 29], where *R* is the end-to-end distance of the whole DNA and *L* is the contour length.

We used Eq.(4) to fit the experimentally measured LE velocity as a function of DNA extension [8]. The fitting parameters are Δ*R* and *k*_{0}, the step size for condensin and the rate of extrusion at 0 load, respectively. In principle, Δ*R* could be determined experimentally once the structures of motors in different nucleotide binding states are known. For now, an excellent fit of theory to experiments, especially considering the dispersion in the data, gives *k*_{0} = 20 s^{−1} and Δ*R* = 26 nm. This indicates that condensin undergoes a conformational change, with Δ*R* ~ 26 nm, during each extrusion cycle. We note that *k*_{0} = 20 s^{−1} is one order of magnitude faster than the hydrolysis rate estimated from experiments, 2 s^{−1} [7, 8]. The value of *k*_{0} obtained here is the same as the value assumed elsewhere [16], where an expression for the LE velocity was also obtained. The shape of the curve in [16] is similar to the one calculated using our theory.
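As a rough numerical illustration of the load dependence, the sketch below evaluates the velocity expression stated earlier, *k*_{0} exp(−*f*Δ*R*/*k*_{B}*T*)Δ*R*, with the fitted values *k*_{0} = 20 s^{−1} and Δ*R* = 26 nm. Several ingredients are assumptions that do not come from the text: *k*_{B}*T* ≈ 4.11 pN·nm (room temperature), a rise of 0.34 nm per base pair, and the Marko-Siggia worm-like-chain interpolation as a stand-in for the force-extension relation of [28, 29], with *l*_{p} = 50 nm. The function names are hypothetical.

```python
import math

KBT_PN_NM = 4.11          # assumed thermal energy near room temperature, in pN*nm
RISE_NM_PER_BP = 0.34     # assumed B-DNA rise per base pair, in nm

def le_velocity_nm_per_s(force_pn, k0=20.0, delta_r_nm=26.0, kbt=KBT_PN_NM):
    """LE velocity k0 * exp(-f * dR / kBT) * dR, in nm/s."""
    return k0 * math.exp(-force_pn * delta_r_nm / kbt) * delta_r_nm

def wlc_force_pn(chi, lp_nm=50.0, kbt=KBT_PN_NM):
    """Force (pN) at relative extension chi = R/L.

    Marko-Siggia interpolation, used here as an ASSUMED force-extension relation.
    """
    if not 0.0 <= chi < 1.0:
        raise ValueError("relative extension must satisfy 0 <= chi < 1")
    return (kbt / lp_nm) * (0.25 / (1.0 - chi) ** 2 - 0.25 + chi)

if __name__ == "__main__":
    for chi in (0.1, 0.3, 0.5, 0.7):
        f = wlc_force_pn(chi)
        v = le_velocity_nm_per_s(f)
        print(f"chi = {chi:.1f}  f = {f:.3f} pN  velocity = {v / RISE_NM_PER_BP:7.1f} bp/s")
```

At zero load this gives *k*_{0}Δ*R* = 520 nm/s, and the velocity decreases exponentially as the extension, and hence the tension, increases.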

Next we tested whether the predicted value of Δ*R* ~ 26 nm is reasonable using simulations of a simple model. Because the ATPase domains are located at the heads of condensin, it is natural to assume that the head domain undergoes conformational transitions upon ATP binding and/or hydrolysis. The structure of prokaryotic SMC suggests that there is a change in the angle between the two heads upon ATP binding [9]. Furthermore, images of the CCs of the yeast condensin (Smc2-Smc4) obtained using liquid atomic force microscopy (AFM) show that they could adopt a few distinct shapes [19, 30]. Based on these experiments, we hypothesize that the conformational changes initiated at the head domain result in changes in the angle at the junction connecting the motor head to the CC, which propagate through the whole condensin via the CC by an allosteric mechanism. The open state (O-shaped in [9]), in which the hinge is ≈ 45 nm away from the motor domain, and the closed state (B-shaped in Fig.1 in [9]), in which the hinge domain is in proximity to the motor domain, are the two relevant allosteric states for LE. To capture the reaction cycle (O → B → O), we model the CCs as kinked semi-flexible polymers (two moderately stiff segments connected by a flexible elbow), generalizing a similar description of stepping of Myosin V on actin [31]. By altering the angle between the two heads, the allosteric transition between the open (O-shaped) and closed (B-shaped) states could be simulated (the SI contains the details).

We tracked the head-hinge distance in the open state (*R*_{1}) and the closed state (*R*_{2}), and Δ*R*_{s} = *R*_{1} − *R*_{2}, in the simulations. The sample trajectory in Fig.5a, monitoring the conformational transition between the open and closed states, shows that the head-hinge distance changes by Δ*R*_{s} ~ 21 nm for *l*_{p} = 70 nm, which roughly coincides with the value extracted by fitting the theory to experimental data. Higher (smaller) values of Δ*R*_{s} may be obtained using larger (smaller) values of *l*_{p} (see section III in the SI). Fig.5b shows the distributions *P*(*R*_{1}) and *P*(*R*_{2}) obtained from multiple trajectories as condensin undergoes a transition between the O and B states. The distributions are broad, suggestive of a high degree of heterogeneity in the structural transition. The large dispersion in *P*(*R*_{1}) and *P*(*R*_{2}) found in the simulations is in agreement with experiments [19], which report that the distance between the peaks is Δ = 17 ± 7 nm, whereas we find 21 ± 7 nm, where the uncertainty is calculated using the standard deviation of the distributions. In the SI we show that Δ*R*_{s} depends on the value of *l*_{p} of the isolated Smc2 or Smc4 (Fig.1). Note that *l*_{p} cannot be too small because a minimum rigidity in the elements transmitting allosteric signals is required [32]. Overall, the simulations not only clarify the physical basis of the theory but also lend support to recent single molecule experiments [19] on a single condensin extruding loops in DNA.

In summary, we have created a theory that quantitatively explains the experimental data on single condensin mediated loop extrusion, which is a major event in compacting chromosomes. A key prediction of our theory is that during the reaction cycle there is an allosteric transition that changes the hinge-head distance by about Δ*R* ~ 26 nm, which is realized if the persistence length of the isolated kinked CC exceeds ~ 70 nm. We conclude with a few additional remarks. (1) We focused only on the one-sided (asymmetric) loop extrusion scenario for a single condensin, as demonstrated in the *in vitro* experiment [8]. Whether symmetric LE could occur when more than one condensin loads onto DNA, producing Z-loop structures [13], and whether the LE mechanism depends on the species [33] are yet to be settled. Similar issues likely exist in loop extrusion mediated by cohesins [14, 34]. We believe that our work, which relies only on the polymer characteristics of DNA and implicitly on an allosteric mechanism for loop extrusion, provides a framework for theoretical investigation of LE by different scenarios. (2) If our estimate of 76 bps per step (Δ*R* ≈ 26 nm) is taken literally, the theoretical value of the ideal stall force follows from the equality *nμ* = *W*_{bend} + *W*_{step}(*f*_{max}). A naive estimate yields *f*_{max} ≈ 3 pN, exceeding 1 pN (see Fig.S7 in the SI), which implies that SMC complexes are inefficient motors. (3) Finally, if LE occurs by scrunching, as gleaned from simulations and advocated through experimental studies [19], it implies that the location of the motor is relatively fixed on the DNA and the loop is extruded by transitions that occur in the coiled coils.
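The arithmetic behind the step size in base pairs and the naive stall force can be sketched as follows. The balance *nμ* = *W*_{bend} + *f*_{max}Δ*R* and the values *μ* ≈ 20 *k*_{B}*T*, *W*_{bend} ≈ 3 *k*_{B}*T*, and Δ*R* ≈ 26 nm are taken from the text. The choice *n* = 1, the value *k*_{B}*T* ≈ 4.11 pN·nm, and the rise of 0.34 nm per base pair are assumptions made here for illustration, and it is not known whether the authors' own estimate includes the bending term. The function names are hypothetical.

```python
KBT_PN_NM = 4.11       # assumed thermal energy near room temperature, in pN*nm
RISE_NM_PER_BP = 0.34  # assumed B-DNA rise per base pair, in nm

def stall_force_pn(n_atp=1, mu_kbt=20.0, w_bend_kbt=3.0,
                   delta_r_nm=26.0, kbt=KBT_PN_NM):
    """Ideal stall force from n*mu = W_bend + f_max * dR (energies in units of kBT)."""
    return (n_atp * mu_kbt - w_bend_kbt) * kbt / delta_r_nm

def step_size_bp(delta_r_nm=26.0, rise=RISE_NM_PER_BP):
    """Convert the step size from nm to base pairs."""
    return delta_r_nm / rise

if __name__ == "__main__":
    print(f"step size  ~ {step_size_bp():.0f} bp")
    print(f"f_max      ~ {stall_force_pn():.1f} pN (with bending cost)")
    print(f"f_max      ~ {stall_force_pn(w_bend_kbt=0.0):.1f} pN (neglecting bending)")
```

With these inputs the step corresponds to about 76 bp and the stall force falls in the range of roughly 2.7 to 3.2 pN, consistent with the ≈ 3 pN quoted above.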

## Acknowledgements

We thank Rasika Harshey, Changbong Hyeon, and Mauro Mugnai for useful comments. This work was supported by NSF (CHE 19-00093), NIH (GM-107703) and the Welch Foundation Grant F-0019 through the Collie-Welch chair.