The physical chemistry of interphase loop extrusion

Loop extrusion constitutes a universal mechanism of genome organization, whereby structural maintenance of chromosomes (SMC) protein complexes load onto the chromatin fiber and generate DNA loops of increasing size until their eventual release. In mammalian interphase cells, loop extrusion is mediated by the cohesin complex, which is dynamically regulated by the interchange of multiple accessory proteins. Although these regulators bind the core cohesin complex only transiently, their disruption can dramatically alter cohesin dynamics, gene expression, chromosome morphology and contact patterns. Still, a theory of how cohesin regulators and their molecular interplay with the core complex modulate genome folding remains lacking. Here we derive a model of cohesin loop extrusion from first principles, based on in vivo measurements of the abundance and dynamics of cohesin regulators. We systematically evaluate potential chemical reaction networks that describe the association of cohesin with its regulators and with the chromatin fiber. Remarkably, experimental observations are consistent with only a single biochemical reaction cycle, which results in a unique minimal model that may be fully parameterized by quantitative protein measurements. We demonstrate how distinct roles for cohesin regulators emerge simply from the structure of the reaction network, and how their dynamic exchange can regulate loop extrusion kinetics over time-scales that far exceed their own chromatin residence times. By embedding our cohesin biochemical reaction network within biophysical chromatin simulations, we show how variations in regulatory protein abundance can alter chromatin architecture across multiple length- and time-scales. Predictions from our model are corroborated by biophysical and biochemical assays, optical microscopy observations, and Hi-C conformation capture techniques.
More broadly, our theoretical and numerical framework bridges the gap between in vitro observations of extrusion motor dynamics at the molecular scale and their structural consequences at the genome-wide level.

Using the notations of the main text, the chemical reaction network outlined in Fig. 1c can be written out explicitly, with the "free" subscript denoting species not bound to chromatin. In the framework of the law of mass action, the kinetic equations describing the time evolution of the free concentration of each cohesin subunit take the form of a system of coupled ODEs, Eqs. (1)-(4).

Let us consider the chromatin association/dissociation reaction of a generic species $X$. In the limit of a large excess of chromatin binding sites, the corresponding kinetic equation for the free $X$ population is Eq. (5). Let us denote by $X_{\rm tot} \equiv X_{\rm bound} + X_{\rm free}$ the total nuclear content of $X$. Assuming $X_{\rm tot}$ to be constant throughout the G1 stage of the cell cycle, to which we restrict our current study, the equilibrium bound fraction $f_X$ may be obtained by solving Eq. (5) at steady state, which gives Eq. (6), while the unbinding rate $k^X_{\rm off}$ is related to the chromatin residence time $\tau_X$ via Eq. (7).

Substituting for species $X$ the cohesin subunits RAD21, NIPBL, PDS5 & WAPL, a direct term-by-term comparison of Eqs. (1)-(4) with Eq. (5) yields Eqs. (8)-(15), where we used $N_{\rm bound} = RN$, $P_{\rm bound} = RP$ and $W_{\rm bound} = RW$. Plugging in Eqs. (6) and (7), Eqs. (8)-(15) may be recast, at chemical equilibrium, in the form of Eqs. (16)-(23).

TABLE S1. Equilibrium rates of the five-state model for wild-type HeLa cells (cf. Fig. 1 and Table 1 of the main text).
State transition rates

Using the experimentally determined bound fractions $f_X$ and residence times $\tau_X$ estimated from FRAP for each of the 4 relevant cohesin subunits, as well as the absolute G1 protein numbers $X_{\rm tot}$ obtained as described in the main text, Eqs. (16)-(23) yield a linear system of 8 coupled equations involving the 8 unknown rates $k_{\rm on}$, $k_{\rm off}$, $k_{NR}$, $k_{RN}$, $k_{RP}$, $k_{PR}$, $k_{PW}$, $k_{WP}$ governing the chemical reaction network (Fig. 1c). Eqs. (16)-(23) were inverted symbolically using the SymPy library, and the computed values of the rates $k$ were plugged into Eqs. (1)-(4), which were then integrated numerically as described in the main text.

For the combinatorial exploration of reaction networks, all possible permutations of bound cohesin state sequences were systematically generated using the itertools library, and the corresponding kinetic (Eqs. (1)-(4)) and rate-mapping (Eqs. (16)-(23)) equations were derived programmatically and similarly solved using SymPy.
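The symbolic inversion and the permutation step can be illustrated on the generic species $X$ alone. The sketch below is not the authors' code: it assumes the standard two-state reading of the generic binding step, in which the steady-state bound fraction is $f_X = k_{\rm on}/(k_{\rm on} + k_{\rm off})$ and the residence time is $\tau_X = 1/k_{\rm off}$, and it solves these two relations for the rates with SymPy. The state label "R" for the core complex without regulators, the variable names, and the numerical inputs are all hypothetical placeholders, not measured values.

```python
import itertools

import sympy as sp

# Unknown rates and measured quantities for a generic species X.
k_on, k_off = sp.symbols("k_on k_off")
f_X, tau_X = sp.symbols("f_X tau_X", positive=True)

# Assumed two-state relations, written in a form linear in the rates:
#   steady state:    k_on * (1 - f_X) = k_off * f_X
#   residence time:  k_off * tau_X    = 1
rate_equations = [
    sp.Eq(k_on * (1 - f_X), k_off * f_X),
    sp.Eq(k_off * tau_X, 1),
]

# Symbolic inversion: express the rates in terms of f_X and tau_X.
rates = sp.solve(rate_equations, [k_on, k_off], dict=True)[0]

# Hypothetical inputs (arbitrary time units), for illustration only.
example = {f_X: sp.Rational(1, 2), tau_X: 100}
k_on_value = rates[k_on].subs(example)
k_off_value = rates[k_off].subs(example)

# Candidate orderings of bound states; the labels are illustrative,
# with "R" standing for the core complex without regulators.
bound_states = ("RN", "R", "RP", "RW")
state_sequences = list(itertools.permutations(bound_states))

print(rates)
print(k_on_value, k_off_value)
print(len(state_sequences))
```

With these placeholder inputs both rates evaluate to 1/100, and four bound states give 24 candidate orderings. The full eight-equation system of Eqs. (16)-(23) would be handled in the same way, with one pair of relations per subunit.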

FIG. S1. Chromatin entry and exit involves distinct cohesin molecular pathways.
To systematically explore and rule out the possibility of cohesin loading and unloading via a single pathway, we apply the same decimation procedure as used in Fig. 1 of the main text to all possible acyclic, fully reversible networks with the minimal number of edges (8). While 21 networks can be found with non-negative rates, none of these networks leads to an increase in the bound fraction of RAD21 upon depletion of WAPL or, more generally, of any of the other cohesin regulators. Thus, no fully reversible network is consistent with experimental observations.

FIG. S2. Alternate topologies are inconsistent with experiments.
a. NIPBL excursion model, where NIPBL is not involved in loading but instead binds reversibly after the core complex has bound chromatin. For this topology, NIPBL depletion does not substantially lower the bound fraction, unlike experimental observations. The inconsistency of this alternate topology provides mathematical support for the role of NIPBL in productive cohesin loading in an extrusion cycle.
b. PDS5 excursion model, where PDS5 reversibly binds the core complex and does not promote WAPL binding. For this topology, PDS5 depletion actually slightly lowers RAD21 residence time, instead of increasing it as observed experimentally. The inconsistency of this alternate topology argues that PDS5 is positioned along the reaction cycle in such a way as to influence unloading rates.

TABLE S2. Predicted impacts of accessory protein depletions on their chromatin association dynamics in the five-state model.
Recapitulative table of the role of various cohesin accessory protein depletions (columns) on the chromatin-associated fraction and residence time of other proteins (rows). Red (resp. blue) colors signify that the model predicts an increase (resp. decrease) of the corresponding quantity in the different mutants relative to its magnitude in wild-type HeLa cells. Gray shades mark a lack of significant deviation from the wild-type value. References point to experimental studies reporting in vivo or in vitro validation of the predicted changes, wherever available. We note that NIPBL depletion leads to a drastic reduction in the bound fractions of PDS5 and WAPL, consistent with a significant inhibition of cohesin loading, but is associated with a more moderate drop (10-20%) in their respective cohesin residence times (Fig. 3c of the main text).
[Table body not recovered. Surviving row labels: bound fraction and residence time for an unlabeled first protein, for PDS5 and for WAPL. Surviving cell annotations: "In vivo [3]" for the first bound fraction; "In vitro [2]" and "In vivo [3]" for the first residence time.]

[Second table body not recovered. Surviving row labels: bound fraction and residence time for an unlabeled first protein and for PDS5. Surviving cell annotations: "In vivo [3]", "In vitro [2]", "In vivo [3]".]
Note that the main qualitative difference with the five-state model lies in the residence time of PDS5 in WAPL-depleted cells, which decreases relative to wild-type in the PDS5-WAPL co-bound model, but increases in the case of the strict subunit exchange assumed by the five-state model (Fig. S5d).

FIG. S3. Time mapping & influence of loop extrusion rate.
a: Mean-squared displacement (MSD) of individual monomers as predicted by the five-state model at wild-type HeLa protein expression levels. The correspondence between model and experimental time units (1 MD step ∼ 5 ms) is obtained by comparing the slopes of the long- and short-time asymptotes to the respective experimental values $D_{\rm Rouse} \simeq 0.01\ \mu{\rm m}^2/{\rm s}^{0.5}$ and $D_{\rm extrusion} \simeq 0.0075\ \mu{\rm m}^2/{\rm s}^{0.675}$, as estimated in budding yeast and CTCF-depleted mESCs.
b: Computational contact-vs-distance curves ($P(s)$) at different ratios of 3D-to-1D steps ($N_{3D/1D}$). Dashed line: experimental profile obtained in ∆CTCF mutants [1].
c: Mean-squared $R^2$ coefficient in the model vs. experimental $P(s)$ curves, averaged over the distance range [50 kb: 5,000 kb]. Inset: Numerical correspondence between $N_{3D/1D}$ and mean extrusion rate ($v$) in wild-type HeLa cells.