## Abstract

From proteins to chromosomes, polymers fold into specific conformations that control their biological function. Polymer folding has long been studied with equilibrium thermodynamics, yet intracellular organization and regulation involve energy-consuming, active processes. Signatures of activity have been measured in the context of chromatin motion, which shows spatial correlations and enhanced subdiffusion only in the presence of adenosine triphosphate (ATP). Moreover, chromatin motion varies with genomic coordinate, pointing towards a heterogeneous pattern of active processes along the sequence. How do such patterns of activity affect the conformation of a polymer such as chromatin? We address this question by combining analytical theory and simulations to study a polymer subjected to sequence-dependent correlated active forces. Our analysis shows that a local increase in activity (larger active forces) can cause the polymer backbone to bend and expand, while less active segments straighten out and condense. Our simulations further predict that modest activity differences can drive compartmentalization of the polymer consistent with the patterns observed in chromosome conformation capture experiments. Moreover, segments of the polymer that show correlated active (sub)diffusion attract each other through effective long-ranged harmonic interactions, whereas anticorrelations lead to effective repulsions. Thus, our theory offers non-equilibrium mechanisms for forming genomic compartments, which cannot be distinguished from affinity-based folding using structural data alone. As a first step toward disentangling active and passive mechanisms of folding, we discuss a data-driven approach to discern if and how active processes affect genome organization.

The folding of various biopolymers into specific conformations is vital for cellular function. Decades of research on equilibrium polymer theory have revealed basic principles of sequence-controlled folding (1–6). Specifically, polymers composed of sequences of chemically distinct monomers can, via affinity-based monomer-monomer and monomer-solvent interactions, fold into particular shapes despite the dramatic loss of conformational entropy. These ideas range from simple hydrophobic effects that explain the positioning of residues within globular proteins (7–9), to complex free energy landscapes that precisely predict protein structure (1, 5). More recently, analogous concepts have been applied to the folding of interphase chromatin—a complex heteropolymer consisting of genomic DNA and associated proteins (10).

Advances in imaging and sequencing technology have revealed that transcriptionally active euchromatin, which typically resides in the nuclear interior, spatially segregates from transcriptionally silent heterochromatin at the nuclear periphery (11). Chromosome conformation capture (3C) experiments measure this segregation via an enrichment of contact frequencies within euchromatic (A) and heterochromatic (B) regions and depletion of contacts between them (12). These “A/B-compartments” are thought to form because of pairwise attraction between chromatin segments with similar histone modifications, which leads to equilibrium microphase separation (13). Borrowing thermodynamic ideas from protein folding, increasingly sophisticated equilibrium models have designed interaction landscapes to simulate genomic structures that recapitulate 3C data (14–22). This success is remarkable, yet somewhat surprising, since the intracellular environment is far from equilibrium.

Many chromatin-associated proteins are enzymes (23–25) that break detailed balance by turning over chemical energy (26), such as ATP or metabolites, to perform nonequilibrium reactions and/or exert mechanical forces (27), cf. Fig. 1. Such active processes characteristically lead to faster motion that cannot be explained by thermal fluctuations alone (28). Indeed, the subdiffusion of chromosomal loci slows down in the absence of ATP in both bacteria and eukaryotes (29). In addition, nucleus-wide tracking of fluorescently labeled histones has unveiled micron-scale regions of correlated chromatin motion, which disappear after ATP depletion or ATPase inhibition (30, 31). These active processes also vary along the genome, thereby driving heterogeneous motion (32). The apparent diffusion coefficient of individual loci depends on their location in sequence space (33) and on transcriptional state (34), although the precise effect of transcription is debated (35, 36). Similarly, histone tracking experiments have measured faster motion in active euchromatin in the nuclear interior than in heterochromatin at the nuclear periphery (36, 37). These observations point towards the presence of sequence-controlled active forces that affect the polymeric genome’s mobility. To study how such active processes may contribute to shaping genome structure, we need a theory that can link active polymer dynamics to folding patterns (38).

By modeling activity via persistent monomer motion, past work has predicted nonequilibrium phenomena such as coherent motion and polymer collapse or swelling (39–49). However, these studies consider uniform activity along the polymer and thus cannot explain heterogeneous folding patterns. More recent models have incorporated non-uniform activity via active forces that vary in magnitude along the chain (50–55), akin to a local effective temperature (56). Simulations have shown that large (30-fold) activity differences can drive phase separation between different polymer regions (50), analogous to mixtures of active and passive particles (57, 58), although smaller activity differences are sufficient in polymeric mixtures (59). Finer grained models have incorporated motor activity via force dipoles (60–62) that align tangentially with the chain (63–66), or through explicit simulation of translocating proteins (67). However, an analytic framework that explains why and how non-uniform activity can fold polymers is still lacking. Moreover, prior work has assumed that active processes at distinct genomic loci are statistically independent. In the context of chromatin, however, we hypothesize that correlations could arise from the coordinated transcriptional activation of different regions, such as enhancers and corresponding promoters (68, 69) or coregulation of genes by common transcription factors (70–72).

To address these open issues, we study a model of a polymer that is driven by correlated active forces with non-uniform magnitude. Our continuum theory shows that active (A) regions of the polymer expand and bend, whereas inactive (B) regions contract and straighten out. Therefore, increased activity within euchromatin could help preserve its expanded state and increase its accessibility to active proteins. Using polymer simulations, we find that even modest activity ratios (two- to ten-fold) can recapitulate the degree of A/B compartmentalization observed in Hi-C data. Moreover, we find that distinct loci experiencing correlated active forces will effectively attract, while anticorrelations lead to repulsion. Our results provide a nonequilibrium mechanism that links activity-driven correlated motion to the folding patterns observed in Hi-C data. Finally, we derive an analytical mapping from our active polymer model to an effective equilibrium model where folding is determined by pairwise affinities. These two models are indistinguishable based on structural data alone, raising the need for future dynamic measurements. For example, our model assumptions could be tested via measurements of pairwise velocity correlations of specific loci. Furthermore, given ensemble-averaged conformational data of a polymer, our analytical theory enables us to propose an activity profile that could reproduce the observed steady state. By comparing the inferred activity profile to DNA-binding patterns of chromatin-associating proteins, one could determine whether active processes contribute significantly to certain folding patterns. Taken together, our results provide a new avenue for analyzing and interpreting data on chromosomes, and have broad implications for active polymer systems.

## Model

To study the folding of an active polymer, we combine analytical calculations (of a linear, continuous model) and Brownian dynamics simulations (of a discrete chain). For our theoretical analysis, we idealize the polymer as a space curve **r**(*s, t*), where *s* is a continuous, dimensionless material coordinate along the polymer backbone. The large-scale dynamics of a polymer are well-described by the Rouse model, an extensible chain of material points that interact through Hookean springs with stiffness *κ* (39, 73–75),

$$\partial_t \mathbf{r}(s,t) = \frac{\kappa}{\xi}\,\partial_s^2 \mathbf{r}(s,t) + \boldsymbol{\eta}(s,t), \tag{1}$$

where *ξ* is the drag friction with the surrounding solution and **η**(*s, t*) is a zero-mean Gaussian random velocity field, which we refer to as “excitations”. In general, the covariance of the excitations at different material points is given by

$$\langle \boldsymbol{\eta}(s,t)\cdot\boldsymbol{\eta}(s',t')\rangle = \mathcal{C}(s,s')\,\delta(t-t') \tag{2}$$

on timescales longer than the decorrelation time. For a passive polymer that reaches thermal equilibrium, the excitations are determined by the heat bath, 𝒞(*s, s*′) = 6*k*_{B}*Tξ*^{−1}*δ*(*s* − *s*′), which follows from the fluctuation-dissipation theorem. However, sequence-specific active processes that stir the polymer through random forces will drive the system away from thermal equilibrium. These “athermal excitations” may vary in magnitude along the polymer and exhibit sequence-specific correlations,

$$\mathcal{C}(s,s') = \xi^{-1}\sqrt{A(s)\,A(s')}\;\mathbb{C}(s,s'), \tag{3}$$

where *A*(*s*) is the activity at locus *s* (in units of dissipated energy), and ℂ(*s, s*′) is the “normalized” correlation function. Figure 1 depicts possible molecular drivers of these athermal excitations in the context of the genome. Note that Eq. (3) is directly proportional to the pairwise velocity correlation function (Supporting Information 1 C). Thus, by describing the polymer response to these excitations, our model links patterns of correlated motion with patterns of folding.
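As a discrete illustration of how an activity profile and a normalized correlation function combine into an excitation covariance, the following minimal sketch assumes the Pearson-style normalization 𝒞_{nm} = *ξ*^{−1} √(*A*_{n}*A*_{m}) ℂ_{nm}, which is consistent with the thermal and statistically independent limits quoted in the text. The function name and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def excitation_covariance(activity, pearson, xi=1.0):
    """Return C[n][m] = sqrt(A_n * A_m) * pearson[n][m] / xi.

    activity : list of non-negative activities A_n (assumed units of energy)
    pearson  : square list-of-lists with entries in [-1, 1] and unit diagonal
    xi       : friction coefficient (assumed uniform along the chain)
    """
    n = len(activity)
    if len(pearson) != n or any(len(row) != n for row in pearson):
        raise ValueError("pearson must be an n x n matrix matching activity")
    return [
        [math.sqrt(activity[i] * activity[j]) * pearson[i][j] / xi for j in range(n)]
        for i in range(n)
    ]


# Hypothetical three-monomer example: an active middle monomer that is
# correlated with its left neighbour and anticorrelated with its right one.
activity = [1.0, 4.0, 1.0]
pearson = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, -0.5],
    [0.0, -0.5, 1.0],
]
cov = excitation_covariance(activity, pearson, xi=2.0)
```

With a unit diagonal in the correlation matrix, the diagonal of the result reduces to *A*_{n}/*ξ*, so the activity profile sets the magnitude of each monomer's excitations while the off-diagonal entries carry only the correlations.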

### Brownian dynamics simulations of a discrete chain

To numerically test our theoretical analysis, we developed Brownian dynamics simulations of a discretized Rouse chain, Eq. (1), where each of the *n* ∈ [1, *N*] monomers represents a Kuhn segment with characteristic length *b*. In our simulations, we use the Kuhn length *b* and the average diffusion coefficient *D*_{0} of a free monomer, which is related to the average activity *A*_{0} = ∑_{n} *A*_{n}*/N* via *D*_{0} = *A*_{0}*/*(6*ξ*), to define the stiffness of the springs connecting neighboring monomers, *κ/ξ* = 3*D*_{0}*/b*^{2}. Having discretized all fields of our continuum theory, the covariance matrix between the athermal excitations of different beads is then given by

$$\mathcal{C}_{nm} = \xi^{-1}\sqrt{A_n A_m}\;\mathbb{C}_{nm}, \tag{4}$$

where we have defined the Pearson correlation matrix ℂ_{nm} ∈ [−1, 1]. To implement more realistic polymer simulations, including self-avoidance and a hard confinement, we adapt the “polychrom” software package as described in the Methods. These simulations allow us to test the continued validity of results from our theory in a strongly nonlinear setting that is inaccessible to analytical calculations.
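To make the discretization concrete, here is a minimal sketch of one Euler-Maruyama update for a free, phantom Rouse chain with statistically independent excitations of non-uniform activity. It is not the polychrom-based implementation used in the paper: self-avoidance, confinement, and correlated excitations are omitted, and the function names, time step, and activity values are assumptions made for illustration. The noise amplitude follows from ⟨**η**_{n} · **η**_{n}⟩ = *A*_{n}/*ξ* in three dimensions, that is, *A*_{n}/(3*ξ*) per Cartesian component.

```python
import math
import random


def spring_rate(activity, xi, b):
    """Return kappa/xi = 3 D0 / b^2, with D0 = A0 / (6 xi) and A0 the mean activity."""
    a0 = sum(activity) / len(activity)
    d0 = a0 / (6.0 * xi)
    return 3.0 * d0 / b**2


def rouse_step(positions, activity, kappa_over_xi, xi, dt, rng):
    """Advance a free discrete Rouse chain by one Euler-Maruyama step."""
    n = len(positions)
    new_positions = []
    for i, r in enumerate(positions):
        # Spring drift from the nearest neighbours (chain ends have only one).
        drift = [0.0, 0.0, 0.0]
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                for a in range(3):
                    drift[a] += kappa_over_xi * (positions[j][a] - r[a])
        # Independent excitations with variance A_i * dt / (3 xi) per component.
        sigma = math.sqrt(activity[i] * dt / (3.0 * xi))
        new_positions.append(
            [r[a] + drift[a] * dt + sigma * rng.gauss(0.0, 1.0) for a in range(3)]
        )
    return new_positions


# Hypothetical usage: ten monomers with an active block in the middle.
rng = random.Random(0)
activity = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0, 1.0, 1.0, 1.0]
xi, b, dt = 1.0, 1.0, 1e-3
rate = spring_rate(activity, xi, b)
chain = [[float(i) * b, 0.0, 0.0] for i in range(len(activity))]
for _ in range(100):
    chain = rouse_step(chain, activity, rate, xi, dt, rng)
```

The explicit scheme is only stable when the time step is small compared with the spring relaxation time *ξ/κ*, which is why the example uses a time step well below the inverse of `rate`.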

### Steady-state polymer conformation

We analytically solve the linearized active polymer model [Eq. (1)] via a Rouse mode decomposition (79), **r**(*s, t*) → $\tilde{\mathbf{r}}_q(t)$ and **η**(*s, t*) → $\tilde{\boldsymbol{\eta}}_q(t)$, where $\tilde{\mathbf{r}}_q$ and $\tilde{\boldsymbol{\eta}}_q$ indicate Fourier transforms. For a compact notation, we concatenate all Rouse modes row-wise into a matrix **R**(*t*) with rows $\tilde{\mathbf{r}}_q(t)$. Analogously, we define the random velocity mode matrix **H**(*t*) with rows $\tilde{\boldsymbol{\eta}}_q(t)$, which has zero mean and covariance ⟨**H**(*t*) · **H**^{†}(*t*′)⟩ := **C** *δ*(*t* − *t*′). In the resulting Rouse mode dynamics,

$$\partial_t \mathbf{R}(t) = -\mathbf{J}\cdot\mathbf{R}(t) + \mathbf{H}(t), \tag{5}$$

the response matrix **J** encodes all material properties (Methods) and is diagonal for a homogeneous Rouse polymer [Eq. (1)], *J*_{qk} = *ξ*^{−1}*κq*^{2}*δ*_{qk}. In contrast, athermal excitations that break translational invariance are characterized by off-diagonal entries in their covariance matrix 𝒞(*s, s*′) → *C*_{qk}. Although this coupling precludes an isolated analysis of individual Rouse modes, one can derive an exact expression for the long time limit of the second Rouse moment (Methods) as

$$\lim_{t\to\infty}\big\langle \mathbf{R}(t)\cdot\mathbf{R}^{\dagger}(t)\big\rangle_{qk} = \frac{C_{qk}}{J_{qq}+J_{kk}} = \frac{\xi}{\kappa}\,\frac{C_{qk}}{q^{2}+k^{2}}. \tag{6}$$

Note, however, that the above equation describes polymer conformations on average, which are liquid-like in the sense that ⟨**R**(*t*) · **R**^{†}(*t*)⟩ ≫ ⟨**R**(*t*)⟩ · ⟨**R**^{†}(*t*)⟩. The resulting “folding” is thus distinct from most proteins, which form globules with a well defined conformation, ⟨**R**(*t*)⟩ · ⟨**R**^{†}(*t*)⟩ ∼ ⟨**R**(*t*) · **R**^{†}(*t*)⟩. Nevertheless, our analysis reveals how inhomogeneous excitations alone can effectively give rise to *patterned* polymer conformations by coupling different mechanical modes. To quantify these patterns, we transform Eq. (6) back into real space, which yields the spatial correlation between pairs of material points. Subsequently, we determine their pairwise mean squared separation and tangent autocorrelations.
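The steady-state second moment can be checked numerically in a small example. The sketch below assumes linear mode dynamics of the form ∂_t **R** = −**J** · **R** + **H** with a real, diagonal response matrix **J** and white-noise covariance **C**, for which the stationary second moment **X** solves the Lyapunov equation **J** · **X** + **X** · **J** = **C**, giving *X*_{qk} = *C*_{qk}/(*J*_{qq} + *J*_{kk}). The sign convention, the restriction to nonzero modes, and all numerical values are assumptions made for illustration.

```python
def steady_state_second_moment(j_diag, cov):
    """Return X[q][k] = cov[q][k] / (J_qq + J_kk) for a diagonal response matrix."""
    n = len(j_diag)
    return [[cov[q][k] / (j_diag[q] + j_diag[k]) for k in range(n)] for q in range(n)]


def lyapunov_residual(j_diag, cov, x):
    """Largest entry of |J X + X J - C|, which vanishes for the exact solution."""
    n = len(j_diag)
    return max(
        abs(j_diag[q] * x[q][k] + x[q][k] * j_diag[k] - cov[q][k])
        for q in range(n)
        for k in range(n)
    )


# Hypothetical example with three nonzero modes and J_qq = (kappa / xi) * q^2.
kappa_over_xi = 3.0
wavenumbers = [1.0, 2.0, 3.0]
j_diag = [kappa_over_xi * q**2 for q in wavenumbers]
# Off-diagonal covariance entries mimic excitations that break translational invariance.
cov = [
    [2.0, 0.5, 0.0],
    [0.5, 2.0, 0.5],
    [0.0, 0.5, 2.0],
]
second_moment = steady_state_second_moment(j_diag, cov)
residual = lyapunov_residual(j_diag, cov, second_moment)
```

Off-diagonal entries of the covariance carry over directly into off-diagonal entries of the second moment, which is the sense in which inhomogeneous excitations couple different mechanical modes.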

## Results

### Local activity modulations induce long-range correlations akin to bending

To elucidate how active processes affect a polymer’s conformation, we first study a minimal scenario with inhomogeneous activity *A*(*s*) represented by *statistically independent* excitations 𝒞 (*s, s*′) = *ξ*^{−1} *A*(*s*) *δ*(*s* − *s*′). Simulations have shown that less active monomers localize closer to the boundary of a hard confinement than their more active counterparts (50, 51). This positioning trend reverses if the volume packing fraction is small or if the confinement is soft (51), and can be forcibly inverted by introducing selective monomer–boundary interactions (50), or self-attraction between inactive monomers (55). Thus, past theoretical work has shown that activity differences can reproduce nuclear positioning of (active) euchromatin and (inactive) heterochromatin. However, it is not yet clear how and why active processes affect *polymer shape*, a question that we now address.

Equation (6) predicts the preferred polymer conformation in response to any predetermined profile of activity [see Supporting Information 2 A.3 for Green’s function kernels]. As an example, we focus on sinusoidal activity modulations *A*(*s*) = *A*_{0} [1 + *ϵ* cos(*s/λ*)] around an average value *A*_{0}, with amplitude *ϵA*_{0} and characteristic length *λ*. These excitations elicit a spatial correlation between pairs of material points,
which we use to calculate the pairwise mean squared separation in terms of the Kuhn length [below the diagonal in Fig. 2B]. In comparison to a reference polymer with uniform activity (*ϵ* = 0), we observe that active polymer segments locally expand while inactive segments locally condense [above the diagonal in Fig. 2B], resembling the morphology of euchromatin and heterochromatin (81, 82). Thus, our theory suggests that active processes like transcription could lead to chromatin decondensation, which might form a positive feedback loop by further increasing genome accessibility to the transcription machinery (83). Indeed, it was shown experimentally that euchromatin requires ATP and thus dissipation of energy to preserve its expanded state (30).

Consistent with prior work (50, 52, 54, 55), we also find that pairs of active segments get farther apart while inactive segments come closer together [above the diagonal in Fig. 2B]. To investigate the shape changes associated with this segregation, we measure the pairwise alignment of tangent vectors, **τ**(*s, t*) := ∂_{s}**r**(*s, t*), at different material points (tangent autocorrelation),

The last term clearly demonstrates that local activity modulations induce correlations (effective pairwise couplings) between distant material points. Specifically, Figure 2C shows that increased activity within a segment of size *λ* effectively induces bending (anti-correlations between tangent vectors), while decreased activity leads to straightening (positive correlations between tangent vectors). Heuristically, active monomers “run away” from their inactive neighbors, effectively bringing these neighbors together into a loop-like conformation. Thus, we conclude that segmented activity variations lead to the emergence of spontaneous curvature.

The first term in Eq. (8) describes local compaction (or expansion) of the polymer backbone from reduced (or increased) activity. Similar changes in the local contour length could be induced by variations in tension, i.e. the spring constant *κ*(*s*). Therefore, it is natural to inquire if conformational changes due to activity variations can be equivalently achieved in a system in thermal equilibrium through variations in the local mechanical properties of the polymer. However, as shown in the Supporting Information 2 A.2, modulations in tension fail to produce any long-ranged tangent correlations akin to polymer bending. Changes in the spring stiffness only affect the distance between material points (bond length) and have no effect on the bond angles. Similarly, variations of the drag friction coefficient *ξ*(*s*), for example, due to differences in monomer size, do not change the polymer conformation (Supporting Information 2 A.1). This is because in equilibrium with a thermal heat bath, polymer conformations sample a Boltzmann distribution determined by a free energy, independent of dynamic effects such as friction. Thus, in the absence of long-ranged interactions, activity differences, which can act globally, are required to fold a Rouse polymer into specific conformations.

### Activity differences recapitulate A/B-compartments in simulated contact maps

We next apply our model to study the formation of A/B-compartments, a cornerstone of eukaryotic genome organization (12). To make a meaningful comparison to Hi-C data, we model a region of Chromosome 2 in murine erythroblast cells (80) as an active, self-avoiding polymer in a nuclear confinement (Methods), where each monomer corresponds to 25 kbp. We ask whether the compartmentalization observed in Ref. (80) [Fig. 2D, below the diagonal] can be reproduced purely by differences in the magnitude of athermal excitations in active (A) and inactive (B) regions. To that end, we derive the identities of A and B monomers from the data in Ref. (80), as discussed in the Methods. The simulated contact frequencies between pairs of monomers display a checkerboard pattern featuring strong B–B contacts, despite the lack of explicit attractive interactions between monomers [Fig. 2D, above the diagonal]. While A–A contacts in our simulations are weaker than in the data of Ref. (80), this could be remedied by including weak self-attraction among all monomers (13) or, as discussed in the next section, by introducing correlated excitations. Combining active and passive mechanisms of compartmentalization will invariably fit the data better (55), but the existence of A/B-compartments in our minimal model demonstrates that activity differences are capable of contributing to genome organization.

To explore how the degree of compartmentalization varies with the activity difference between A and B regions, we extract a scalar order parameter (“compartment score”, COMP) from both simulated and experimental contact maps (Methods). This compartment score, defined in Ref. (84), measures the contrast in the checkerboard pattern as the normalized contact frequency difference between same-type and different-type chromatin, namely COMP = (AA + BB − 2AB)*/*(AA + BB + 2AB). We find that the compartment score increases in a sigmoidal fashion with the activity difference between A and B regions [Fig. 2E], indicating a typical scale for onset of compartmentalization in our simulations. This activity difference scale depends on many parameters, including the A and B block sizes (59) and the capture radius used to construct the contact frequency map. In this particular example, we find nontrivial compartmentalization for activity differences as small as the average level of activity, *A*_{A} − *A*_{B} = *A*_{0} = (*A*_{A} + *A*_{B})*/*2. Note that in our analytic theory, which describes a phantom chain without volume exclusion, compartmentalization, as detected in the mean squared separation, is simply a linear response to the activity difference, Eq. (7).
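The compartment score formula can be sketched in a few lines. The example below assumes that AA, BB, and AB denote mean contact frequencies over all monomer pairs of the respective types, taken from the upper triangle of a symmetric contact map without any distance normalization. The procedure of Ref. (84) and the Methods may differ, for instance by using observed-over-expected maps, so this is an illustration of the formula rather than a reimplementation. The function name and the toy maps are hypothetical.

```python
def compartment_score(contacts, labels):
    """Return COMP = (AA + BB - 2 AB) / (AA + BB + 2 AB).

    contacts : symmetric list-of-lists of contact frequencies
    labels   : sequence of 'A' / 'B' monomer identities, with at least two
               monomers of each type so that every pair class is populated
    """
    totals = {"AA": 0.0, "BB": 0.0, "AB": 0.0}
    counts = {"AA": 0, "BB": 0, "AB": 0}
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            # Same-type pairs are keyed "AA" or "BB"; mixed pairs are keyed "AB".
            key = labels[i] + labels[j] if labels[i] == labels[j] else "AB"
            totals[key] += contacts[i][j]
            counts[key] += 1
    aa, bb, ab = (totals[k] / counts[k] for k in ("AA", "BB", "AB"))
    return (aa + bb - 2.0 * ab) / (aa + bb + 2.0 * ab)


labels = "AABB"
# A featureless map has no checkerboard contrast.
uniform = [[1.0] * 4 for _ in range(4)]
# Same-type contacts three times more frequent than different-type contacts.
checkerboard = [
    [3.0 if labels[i] == labels[j] else 1.0 for j in range(4)] for i in range(4)
]
score_uniform = compartment_score(uniform, labels)
score_checkerboard = compartment_score(checkerboard, labels)
```

By construction the score is 0 for a map without contrast and approaches 1 when contacts between different-type regions vanish.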

Finally, the compartment score curve in Fig. 2E can be used to read off the activity difference required to reproduce A/B compartmentalization in a given Hi-C dataset. While the degree of chromatin compartmentalization will vary by cell type, a whole genome analysis of the murine erythroblast data of Ref. (80) [Fig. 2E] suggests a score of COMP ∼ 0.71, which corresponds to an activity ratio of *A*_{A}*/A*_{B} ≈ 6. We used this inferred activity ratio in the polymer simulations depicted above the diagonal in Fig. 2D.

While the activity ratio cannot be measured directly, one can use the monomer mean squared displacement, MSD(*t*) = *D*^{app} *t*^{α}, as a proxy. On sufficiently short time scales in a phantom Rouse chain, the ratio of anomalous diffusion coefficients in active and inactive regions is identically the activity ratio, *D*^{app}_{A}*/D*^{app}_{B} = *A*_{A}*/A*_{B}. However, the values of *D*^{app} and *α* in active and inactive regions will depend on the observation time window and the microscopic properties of the chain, as shown in our nonlinear simulations (Supporting Information 3 C). Thus, our predicted activity ratio serves as an upper bound for the ratio of MSDs in A and B regions, which can be extracted from measurements of euchromatic and heterochromatic motion (36).
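One simple way to extract the apparent diffusion coefficient and anomalous exponent from an MSD curve is a linear least-squares fit in log-log space. The sketch below is a generic estimator, not the analysis pipeline of the paper; the function name and the synthetic data (an exponent of 0.5 and a six-fold prefactor ratio) are assumptions chosen for illustration.

```python
import math


def fit_power_law(times, msd):
    """Fit MSD(t) = D_app * t**alpha by least squares on log-log data.

    Returns the pair (D_app, alpha). All times and MSD values must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    d_app = math.exp(mean_y - alpha * mean_x)
    return d_app, alpha


# Synthetic, noise-free MSD curves for an active (A) and an inactive (B) locus.
times = [1.0, 2.0, 4.0, 8.0, 16.0]
msd_active = [6.0 * t**0.5 for t in times]
msd_inactive = [1.0 * t**0.5 for t in times]
d_a, alpha_a = fit_power_law(times, msd_active)
d_b, alpha_b = fit_power_law(times, msd_inactive)
apparent_ratio = d_a / d_b
```

Comparing prefactors in this way is only meaningful when both loci are fitted over the same time window and yield similar exponents, since *D*^{app} carries units that depend on *α*.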

### Correlated active processes create compartments

Experiments that track the movement of GFP-tagged histones have demonstrated spatial correlations in chromatin motion that depend on RNA polymerase II activity and on ATP (30, 31). While these experiments cannot relate spatial and genomic proximity, Ref. (85) has observed pairwise correlated movement of specific loci. Motivated by these experiments, we hypothesize that correlated movement could be driven by correlated active forces, which produce athermal excitations with a non-diagonal covariance 𝒞 (*s, s*′). This hypothesis is plausible if the active processes at distinct genomic locations are not completely independent.

To model sequence correlations, we define a heteropolymer with three types of monomers (+, −, and neutral). We then introduce a stochastic process that has opposing effects on + and − monomers, but does not affect neutral monomers. The choice of + and − as labels for the monomer type evokes the analogy of a charged heteropolymer (for example, a polyelectrolyte or polyampholyte) in a fluctuating electric field. In this example, the random electrical forces at a given time will point in the same direction for monomers of the same charge, but in opposite directions for monomers of opposing charge. Here, charge can be interpreted as any biochemical signature of a monomer (such as the methylation status of a chromosomal locus) that mediates a selective response to a long-ranged active control process.

Based on these ideas, we construct a sample Pearson correlation matrix for the excitations of a polymer with an alternating pattern of +, −, and neutral monomers [Fig. 3A]. Using this Pearson correlation matrix as input and assuming a homogeneous level of activity, see Eq. (4), we then perform Brownian dynamics simulations featuring self-avoidance and a spherical confinement. Figure 3B shows, below the diagonal, that contacts between loci driven by correlated excitations are enhanced, whereas contacts between anti-correlated loci are depleted. These folding patterns are accompanied by long-range correlations between the tangent vectors at distant material points, as shown in our theoretical calculations (Supporting Information 2 A.6). In addition to the pattern of correlations, we superimpose activity modulations, *A*_{n}, such that charged (±) monomers are active and neutral monomers are inactive. We find an enhancement of contacts between inactive regions, leading to a full checkerboard pattern [Fig. 3B, above the diagonal]. Thus, correlated active forces and activity differences offer complementary, non-equilibrium mechanisms for the formation of genomic compartments.
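The charged-heteropolymer analogy suggests a simple way to build such a correlation matrix. The sketch below assumes the idealized limit in which all charged monomers respond to a single shared random field, so that like-charged monomers are perfectly correlated, oppositely charged monomers are perfectly anticorrelated, and neutral monomers receive independent excitations. The matrix shown in Fig. 3A need not be this extreme, and the function names and the charge pattern are hypothetical.

```python
import random


def charge_correlation_matrix(charges):
    """Return a Pearson matrix with entries q_n * q_m off the diagonal and 1 on it.

    charges : sequence with entries +1, -1, or 0 (neutral)
    """
    n = len(charges)
    return [
        [1.0 if i == j else float(charges[i] * charges[j]) for j in range(n)]
        for i in range(n)
    ]


def sample_excitations(charges, rng):
    """Draw one scalar excitation per monomer with the correlations above."""
    shared_field = rng.gauss(0.0, 1.0)  # acts on all charged monomers at once
    return [
        float(q) * shared_field if q != 0 else rng.gauss(0.0, 1.0) for q in charges
    ]


# Hypothetical alternating pattern of +, neutral, -, neutral blocks.
charges = [1, 1, 0, 0, -1, -1, 0, 0]
pearson = charge_correlation_matrix(charges)
excitations = sample_excitations(charges, random.Random(0))
```

This matrix is the sum of the rank-one term **q** **q**^T and a diagonal term that is nonzero only for neutral monomers, so it is positive semidefinite and therefore a valid correlation matrix.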

### The effects of correlated, non-equilibrium excitations on polymer shape can be recapitulated by an equilibrium model with intersegment interactions

The linear active polymer, driven by athermal Gaussian excitations with covariance 𝒞(*s, s*′), maintains a Gaussian steady-state conformation. The Gaussian steady state is fully characterized by its two-point correlation function ⟨**r**(*s, t*) · **r**(*s*′, *t*)⟩, which can be computed in terms of the prescribed entries of 𝒞(*s, s*′). To illuminate the characteristics of this *active folding*, we note that any Gaussian steady state can be regarded as the thermal equilibrium weight of a phantom polymer with additional Hookean springs that harmonically couple pairs of material points (18). After setting a temperature scale by *A*_{0} = 6*k*_{B}*T*, one can obtain the harmonic couplings by inverting the two-point correlation function ⟨**r**(*s, t*) · **r**(*s*′, *t*)⟩, and as shown in the Methods,

The Green’s kernel has the most convenient representation in polar coordinates *G*^{K}(*α* cos *ϕ, α* sin *ϕ*) = − 6 cos(4*ϕ*)*/*(*πα*^{4}). We note that, while the active polymer folds due to constraints on the excitations (i.e. correlations), in the corresponding passive polymer constraints are introduced in the form of springs. These harmonic couplings can be positive or negative, mediating pairwise attraction or repulsion between distant material points. The first term of Eq. (9) indicates that pairs of polymer segments experiencing maximally correlated excitations will show effective pairwise attraction, while maximally anticorrelated excitations lead to repulsion. By using a box-shaped correlation function to study the second term of Eq. (9), we confirm that correlated excitations *in general* induce pairwise attraction [Fig. 3C], whereas anti-correlated excitations lead to repulsion.

These results can be heuristically understood via the dynamics of the end-to-end distance vector of, say, a trimer (*N* = 3). For such a trimer, fluctuations of the end-to-end distance Δ are driven by a noise term with variance ⟨[**η**_{Δ}(*t*)]^{2}⟩ = ⟨[**η**_{1}(*t*) − **η**_{3}(*t*)]^{2}⟩ = ⟨[**η**_{1}(*t*)]^{2}⟩ + ⟨[**η**_{3}(*t*)]^{2}⟩ − 2⟨**η**_{1}(*t*) · **η**_{3}(*t*)⟩.

When the excitations are independent and equal in magnitude, one has ⟨[**η**_{Δ}(*t*)]^{2}⟩ = 2⟨[**η**_{1}(*t*)]^{2}⟩. In comparison, anticorrelated excitations (⟨**η**_{1}(*t*) · **η**_{3}(*t*)⟩ < 0) increase the variance of the end-to-end separation, whereas correlated excitations (⟨**η**_{1}(*t*) · **η**_{3}(*t*)⟩ > 0) cause the end points to come closer together. While these effects are diminished for longer polymers, they provide basic insights into the effective harmonic interactions that lead to the contact frequency map shown in Fig. 3B.
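The trimer argument reduces to a one-line identity that is easy to evaluate. The sketch below writes the cross term as *ρ* √(*a*_{1}*a*_{3}), where *a*_{1} and *a*_{3} are the excitation variances of the two end monomers and *ρ* is their Pearson correlation; this parametrization and the function name are assumptions made for illustration.

```python
import math


def end_to_end_noise_variance(a1, a3, rho):
    """Variance of eta_1 - eta_3 for excitation variances a1, a3 and correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return a1 + a3 - 2.0 * rho * math.sqrt(a1 * a3)


independent = end_to_end_noise_variance(1.0, 1.0, 0.0)      # equals 2 * a1
correlated = end_to_end_noise_variance(1.0, 1.0, 1.0)       # noise on the separation cancels
anticorrelated = end_to_end_noise_variance(1.0, 1.0, -1.0)  # noise on the separation doubles
```

In the fully correlated case both ends are pushed in the same direction at all times, so the excitations move the trimer as a whole without stretching it, which is the origin of the effective attraction.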

The existence of an analytical mapping between active and passive mechanisms of polymer folding suggests that many equivalent models could explain structural data on chromosomes. In this context, our passive linear model can be regarded as a harmonic approximation to a contact energy landscape, which has been exploited by past theoretical work on genome organization (14–17, 19–21). Our results explain the somewhat surprising success of these equilibrium models in reproducing Hi-C data despite the undeniable presence of active processes (86). As such, how can one experimentally disentangle equilibrium and non-equilibrium mechanisms of compartment formation? Based on static snapshots, our theory can be used to propose a candidate pattern of active processes that drive chromatin towards its observed steady-state conformation. Then, one can test if the inferred activity profile matches orthogonal experimental measurements, such as the DNA-binding patterns of active enzymes, or if passive folding mechanisms are more plausible.

### Inferring an activity profile from an ensemble-averaged polymer conformation

In the following, we infer a candidate map of athermal excitations that could fold a polymer into a desired (target) conformation. To that end, we use our linear theory to predict the activity of each monomer given the mean squared separation between all pairs of monomers [cf. workflow depicted in Fig. 4A]. We test our approach on artificial target conformations, which we generate via simulations of (nonlinear) polymers driven by activity modulations, with profiles corresponding to Fig. 2D and E.

As it is not clear how nonlinear constraints such as self-avoidance or a hard confinement translate into our linearized theory, we invoke no prior knowledge on the mechanical properties of the polymer. Instead, we adopt a data-driven approach and first determine an effective response matrix **J** that approximates the mechanical properties of the simulated polymer (Supporting Information 4 A, 4 B). Next, we set up a numerical optimization scheme whereby we seek the activity profile [Fig. 4B] that minimizes the squared deviation between the mean squared separation map predicted by our theory and our artificial data (Supporting Information 4 C). Inference from linear theory captures the overall block-like structure of the mean squared separation map, but does not account for some of the finer qualitative features observed in the nonlinear simulations (Supporting Information 4 C). We hypothesize that these fitting results could be improved in future studies by introducing constraints on the analytical theory, or by considering excitations that have an effective correlation length in addition to activity modulations.
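The structure of such a fit can be sketched with a generic least-squares problem. The example below assumes that each predicted observable (for instance, one entry of the mean squared separation map) depends linearly on the activity profile through a known kernel, and it minimizes the squared deviation by projected gradient descent with a non-negativity constraint on the activities. The actual optimization scheme is described in Supporting Information 4 C and may differ; the kernel, step size, and function name here are hypothetical.

```python
def infer_activity(kernel, target, steps=2000, learning_rate=0.05):
    """Minimize sum_i (sum_n kernel[i][n] * A_n - target[i])**2 subject to A_n >= 0.

    kernel : list of rows; kernel[i][n] is the response of observable i
             to a unit of activity at monomer n (assumed linear response)
    target : list of observed values, one per observable
    """
    n_monomers = len(kernel[0])
    activity = [1.0] * n_monomers
    for _ in range(steps):
        residual = [
            sum(k * a for k, a in zip(row, activity)) - t
            for row, t in zip(kernel, target)
        ]
        gradient = [
            2.0 * sum(kernel[i][m] * residual[i] for i in range(len(kernel)))
            for m in range(n_monomers)
        ]
        # Gradient step followed by projection onto non-negative activities.
        activity = [
            max(0.0, a - learning_rate * g) for a, g in zip(activity, gradient)
        ]
    return activity


# Hypothetical toy problem: three observables, two monomers, known true profile.
kernel = [
    [1.0, 0.0],
    [0.0, 2.0],
    [1.0, 1.0],
]
true_activity = [1.0, 4.0]
target = [sum(k * a for k, a in zip(row, true_activity)) for row in kernel]
inferred = infer_activity(kernel, target)
```

A fixed step size converges only when it is small relative to the largest curvature of the quadratic loss, so larger problems would typically call for a line search or an off-the-shelf constrained least-squares solver.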

Correspondingly, the linear model successfully infers the structure of the activity profile used in our nonlinear simulations [Fig. 4B,C]. The inferred and original activity profiles visually appear similar and have comparable amplitudes *A*_{A} − *A*_{B} [Fig. 4B]. However, the inferred ratio of activity *A*_{A}*/A*_{B} is systematically lower than in our simulations [Fig. 4D]. These results suggest that the lack of constraints in the linearized model makes it easier to create folding patterns than in our simulations. Despite these quantitative differences, this approach successfully identifies active and inactive polymer segments in all simulations that show pronounced folding patterns [Fig. 4C].

In closing, we propose a novel approach for the analysis of experimental data on chromatin. As a first test for plausibility, one could compare inferred activity profiles against orthogonal experimental measurements such as ChIP-Seq for ATPases or histone marks associated with active chromatin. After using our theory to find candidate segments with high predicted activity, it would be interesting to measure the effect of biochemical interventions that locally knock out energy-dissipating processes. By testing model predictions against experiments, one can then distinguish if a certain folding pattern is dominated by active processes or passive effects. Finally, the linear theory can be generalized to include both active processes and passive interactions. One could then test the signatures of this hybrid model by measuring the fluctuation dynamics of specific polymer segments (87).

## Discussion

Using our nonequilibrium polymer theory, we have demonstrated that differences in activity and correlations between athermal excitations at different loci can fold a polymer into specific shapes. These liquid-like conformations have a large variability, and are therefore only realized as a population average. Our model could thus be applicable to chromatin, which shows a much larger cell-to-cell variability than stable protein structures (88, 89). In this context, we zoom out to large length scales where chromatin behaves like a Rouse polymer (90–92), and molecular drivers of active processes [Fig. 1] produce effective athermal excitations. Our model then makes several testable predictions that could be further investigated with experiments and theory.

We predict that a local increase in activity should lead to chromatin decondensation [Fig. 2A,B]. The resulting increase in chromatin accessibility could further increase transcriptional activity (83), forming a nonlinear positive feedback loop. Such a feedback loop would, over time, stabilize an open chromatin structure in active regions. Furthermore, our model shows that a local decrease of activity leads to straightening of the polymer backbone, whereas a local increase of activity induces bending [Fig. 2C]. These results could explain an observation in recent simulations, which demonstrated that forces exerted by bound molecular motors can bend polymers into hairpins (67). Such zipped structures also arise in Hi-C maps (93–95) and in our simulated contact maps as jet-like, anti-diagonal features originating from small A regions and extending into the neighboring B compartment [Fig. 2D]. Experiments suggest that these structures could be formed by active loop extrusion (93), which is not explicitly accounted for in our model. Nevertheless, it is interesting that the *spontaneous* loops induced by a local hot spot of activity could still create jet-like features.

In addition to locally deforming the polymer backbone, activity differences lead to a “checkerboard” pattern indicative of compartmentalization. We show that the degree of A/B compartmentalization observed in experimental contact maps can be recapitulated in our simulations with modest activity ratios in the two-fold to ten-fold range [Fig. 2E]. To test if these activity ratios are plausible, one could measure the ratio of the anomalous diffusion coefficients of euchromatic (A) and heterochromatic (B) regions, as has been done in histone tracking experiments (36). Since euchromatin has a higher level of transcriptional activity, we may expect that it will exhibit faster (sub)diffusion than heterochromatin (36, 37). However, the effect of transcription on the subdiffusion of individual loci is controversial (34, 35). One way to explain these conflicting results is to note that diffusion is not only proportional to activity, which increases during transcription, but is also inversely proportional to friction, which increases when the transcriptional machinery binds to the promoter. Consistent with this idea, it was observed that during transcription inhibition, the subdiffusion of DNA-bound histones increases after RNA polymerase II dissociates (36, 37), but decreases if RNA polymerase II remains bound (36). These observations can be reconciled with our theory, which shows that the polymer conformation is independent of friction changes and only responds to activity differences (Supporting Information 2 A.1).

Another test of activity-induced compartmentalization would be to perform Hi-C experiments after knocking out active processes. However, global ATP depletion runs the risk of glassifying the intracellular environment (96). Existing experiments with transcription inhibition show that A/B-compartments remain, but the contrast in the checkerboard pattern decreases (80, 97). It therefore seems plausible that transcription plays some role in compartmentalization, which must be clarified in future studies (98). We hypothesize that activity differences may complement known mechanisms of eu/heterochromatin segregation, including phase separation mediated by linker histone H1 and heterochromatin protein HP1-*α*, as well as association of heterochromatic domains to the nuclear lamina (10).

In addition, whole-nucleus histone tracking has shown that transcription and ATP-dependent processes are required for coherent motion of chromatin (30, 31). The mechanistic origin of correlated motion is under active research (55, 60, 61, 64, 65, 99, 100) and its consequences for chromatin folding are still unknown. Our theory predicts that regions exhibiting correlated motion will, when driven by sequence-specific athermal excitations, form compartments. This raises the question of whether coordinated transcription of enhancers and promoters could lead to contacts, or “micro-compartments”, as has been observed in increasingly high-resolution contact maps (101–103). However, further research is needed to investigate the relationship between coordinated transcriptional programs and correlated motion. To directly test our model assumptions, one could measure pairwise velocity correlations between specific loci, as a function of transcriptional state and sequence coordinate (85). For example, one could measure whether cis regulatory elements show correlated movement, and then test if transcription inhibition at one locus decorrelates this movement and the associated contacts. A similar procedure could be used to test whether the co-localization of co-regulated genes is a consequence of correlated active processes (70, 71, 104). For example, coordinated transcription factor binding events can initiate processes that break detailed balance (105, 106).

Overall, dynamic measurements will play a major role in disentangling active and passive mechanisms of chromatin folding, which we have demonstrated to be indistinguishable based on Hi-C data alone. In future work, we will identify dynamic signatures of activity (87) that can be extracted from trajectories of genomic loci (107) in order to examine the role of active processes in chromatin folding. As a complementary technique, we have here presented a proof of concept that an activity profile can be inferred from structural data on a polymer. In future work, we propose to extend this method to DNA-MERFISH data (108), which measures the pairwise mean squared separation of genetic loci along with other attributes such as transcriptional (108) and epigenetic state (109, 110).

In closing, we note that our theory also has broad implications beyond chromatin folding. We hypothesize that one could, for example, use electromagnetically driven colloidal (Janus) particles to engineer active polymers that fold into desired conformations (111), by applying targeted excitations through dynamic light patterns. Furthermore, on large length scales, our model shares its mathematical structure with mechanical models of membranes. Interestingly, it was recently shown that Min proteins can deform giant unilamellar vesicles by binding to the membrane (112). While it was hypothesized that Min proteins induce spontaneous curvature (113) and that this could even affect their binding kinetics (114), the underlying mechanism remains unknown. Our theory could provide some hints as to how reactions themselves could effectively induce spontaneous curvature in membranes. Overall, our analysis motivates future research into the role that active processes play in determining the conformation of a variety of pattern-forming systems.

## Methods

### Derivation of the steady-state polymer conformation

We now outline a brief derivation of the steady-state polymer conformation, starting from Eq. (5); a more elaborate calculation is provided in the Supporting Information 1 B. Given an unknown matrix-valued function ***H***(*t*), which encodes a possible trajectory of the excitations, Eq. (5) is formally solved by Eq. (10), where the first term vanishes in the limit *t* − *t*_{0} → ∞. We use the formal solution for a single trajectory of the Rouse matrix, Eq. (10), to determine the second Rouse moment, ⟨***R***(*t*) · ***R***^{†}(*t*)⟩, in response to athermal excitations with covariance ⟨***H***(*t*) ***H***^{†}(*t*′)⟩ := ***C*** *δ*(*t* − *t*′). To that end, we multiply Eq. (10) with its conjugate transpose and average over many trajectories.

Both in the limit of late times (*t* → ∞ for arbitrary *t*_{0}), or equivalently early reference times (*t*_{0} → − ∞ for arbitrary *t*), one then finds

Here and henceforth, we consider reference times that lie infinitely far in the past (*t*_{0} → − ∞), and therefore omit the time limit for brevity of notation. For polymers whose material properties such as line tension and friction are homogeneous, the response matrix is diagonal (*J*_{qk} ≡ *J*_{qq} *δ*_{qk}) and Eq. (6) evaluates to:

For a Rouse polymer (*J*_{qk} = *ξ*^{−1}*κq*^{2} *δ*_{qk}) that, as discussed in the main text, is driven by correlated athermal excitations with mode covariance , one has:
where the characteristic length is given by . In general, however, the response matrix will be non-diagonal and we need to use a perturbation approach (Supporting Information 1 E) to approximate the matrix exponentials in Eq. (6).
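
The steps of this derivation can be written out explicitly under one assumption that is not restated in this section: that Eq. (5) has the linear form given in the first line below, with the friction absorbed into ***J*** and ***H***. The following is a sketch under that assumption; its notation may differ from the displayed equations of the original, and it does not attempt to reproduce the mode covariance or the characteristic length referred to above.

```latex
% Assumed linear dynamics (a stand-in for Eq. (5), not quoted from it):
\partial_t \boldsymbol{R}(t) = -\boldsymbol{J}\cdot\boldsymbol{R}(t) + \boldsymbol{H}(t)

% Formal solution for one trajectory of the excitations;
% the first term decays as t - t_0 \to \infty:
\boldsymbol{R}(t) = e^{-\boldsymbol{J}(t-t_0)}\cdot\boldsymbol{R}(t_0)
  + \int_{t_0}^{t} d\tau \; e^{-\boldsymbol{J}(t-\tau)}\cdot\boldsymbol{H}(\tau)

% Second moment for delta-correlated excitations,
% \langle H(t) H^\dagger(t') \rangle = C \, \delta(t-t'), with t_0 \to -\infty:
\langle \boldsymbol{R}\cdot\boldsymbol{R}^{\dagger} \rangle
  = \int_{0}^{\infty} d\tau \;
    e^{-\boldsymbol{J}\tau}\cdot\boldsymbol{C}\cdot e^{-\boldsymbol{J}^{\dagger}\tau}

% Diagonal response matrix, J_{qk} = J_{qq}\,\delta_{qk}
% (requires \mathrm{Re}(J_{qq} + J_{kk}^{*}) > 0):
\langle \boldsymbol{R}_q\cdot\boldsymbol{R}_k^{*} \rangle
  = \frac{C_{qk}}{J_{qq} + J_{kk}^{*}}

% Rouse polymer, J_{qk} = \xi^{-1}\kappa q^2\,\delta_{qk}:
\langle \boldsymbol{R}_q\cdot\boldsymbol{R}_k^{*} \rangle
  = \frac{\xi\, C_{qk}}{\kappa\,(q^2 + k^2)}
```

The diagonal case follows because each matrix exponential reduces to a scalar factor, so the integral over *τ* is elementary. The homogeneous mode *q* = *k* = 0 is excluded, since it has no restoring force and the integral does not converge there.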

### Steady-state conformation of a passive polymer

We now consider the steady-state conformation of a polymer of length *L*, which is in thermal equilibrium with a heat bath of temperature *T*. The passive polymer is governed by reciprocal interactions and therefore has a Hermitian response matrix, ***J*** = ***J***^{†}. In thermal equilibrium, the excitations are statistically independent and homogeneous, with covariance 𝒞(*s*, *s*′) = *ξ*^{−1}*A*_{0}*δ*(*s* − *s*′) and activity *A*_{0} = 6*k*_{B}*T*. Because different excitation modes are now independent, *C*_{qk} = *ξ*^{−1}*A*_{0}*Lδ*_{qk}, Eq. (6) evaluates to

so that polymer folding can only be induced by non-diagonal elements in the response matrix ***J***. We herein assume that the response matrix is dominated by topological connectivity of neighboring material points and by homogeneous mechanical features of the polymer backbone. Therefore, we decompose the response matrix into a dominant diagonal contribution and a weak off-diagonal contribution, so that

To enforce polymer folding in our equilibrium model, we introduce additional Hookean springs to Eq. (1),

With as convention for Fourier transforms, the diagonal and off-diagonal components of the response matrix are given by and , respectively. Thus, Eq. (14) evaluates to:
where the characteristic length is given by , as before. Note that the homogeneous Fourier modes of the harmonic interaction map, *q* = 0 and *k* = 0, cancel out in the polymer’s Langevin equation (15), and thus also in the steady-state conformation, Eq. (16). Therefore, in the following we assume . Equating Eq. (16) and Eq. (12), and transforming back into real space, one obtains Eq. (9).
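
As a consistency check on the equilibrium limit described in this section, the thermal covariance can be inserted into the steady-state second moment. The sketch below assumes that Eq. (6) has the standard integral form shown in its first line (an assumption, since the equation is not reproduced here), and it uses only the thermal covariance and the Hermitian property stated above.

```latex
% Assumed form of Eq. (6):
\langle \boldsymbol{R}\cdot\boldsymbol{R}^{\dagger} \rangle
  = \int_{0}^{\infty} d\tau \;
    e^{-\boldsymbol{J}\tau}\cdot\boldsymbol{C}\cdot e^{-\boldsymbol{J}^{\dagger}\tau}

% Thermal excitations, C_{qk} = \xi^{-1} A_0 L\,\delta_{qk}, and Hermitian J = J^\dagger
% (restricted to modes on which J is positive definite, i.e. excluding q = 0):
\langle \boldsymbol{R}\cdot\boldsymbol{R}^{\dagger} \rangle
  = \frac{A_0 L}{\xi} \int_{0}^{\infty} d\tau \; e^{-2\boldsymbol{J}\tau}
  = \frac{A_0 L}{2\xi}\, \boldsymbol{J}^{-1}

% Homogeneous Rouse polymer, J_{qk} = \xi^{-1}\kappa q^2\,\delta_{qk}:
\langle \boldsymbol{R}_q\cdot\boldsymbol{R}_k^{*} \rangle
  = \frac{A_0 L}{2\kappa q^2}\,\delta_{qk}
```

In this form the friction coefficient drops out of the equilibrium conformation, and any off-diagonal structure in the second moment has to come from off-diagonal elements of ***J***, as stated in the text.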

### Rouse polymer simulation details

To test the approximations in our analytical theory and visualize individual polymer conformations, we perform Brownian dynamics simulations of a discretized Rouse chain without self-avoidance or confinement; the code is available at https://github.com/kannandeepti/active-polymers. Since the average monomer diffusion coefficient *D*_{0} only rescales time and has no effect on the steady-state conformation, we arbitrarily set *D*_{0} = 1 and the Kuhn length *b* = 1, such that all simulated length scales are in units of Kuhn lengths. We integrate the discrete version of Eq. (1) using a first-order stochastic Runge-Kutta scheme detailed in Ref. (115), with a time step *h* = 0.01 chosen to be an order of magnitude smaller than the time to diffuse a Kuhn length, *b*^{2}/6*D*_{0}. We then run the simulation for a Rouse time to allow the polymer to reach steady state. The snapshot in Fig. 2A was taken from a simulation with monomer diffusion coefficient *D*_{n} = *D*_{0}[1 + *ϵ* cos(2*πn/λ*)], where *λ* = 25 Kuhn lengths and the total chain is 100 Kuhn lengths.
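
The simulation described above can be illustrated with a minimal sketch. It is not the code from the linked repository. It swaps the stochastic Runge-Kutta scheme for a plain Euler-Maruyama step, assumes a spring relaxation rate of 3*D*_{0}/*b*^{2} (an ideal chain with Kuhn length *b*), and uses a hypothetical amplitude `eps=0.5`, which the text does not specify. The function names are likewise illustrative.

```python
import numpy as np


def activity_profile(n_monomers, d0=1.0, eps=0.5, wavelength=25.0):
    """Monomer diffusion coefficients D_n = D_0 [1 + eps cos(2 pi n / lambda)].

    eps=0.5 is a hypothetical amplitude chosen for illustration.
    """
    n = np.arange(n_monomers)
    return d0 * (1.0 + eps * np.cos(2.0 * np.pi * n / wavelength))


def simulate_rouse(d_n, b=1.0, h=0.01, n_steps=1000, seed=0):
    """Euler-Maruyama integration of a free, phantom Rouse chain.

    Assumption: the spring relaxation rate is 3 * mean(D_n) / b**2, i.e. the
    springs are those of an ideal chain with Kuhn length b.
    """
    rng = np.random.default_rng(seed)
    d_n = np.asarray(d_n, dtype=float)
    n_monomers = len(d_n)
    rate = 3.0 * d_n.mean() / b**2

    # Start from an ideal random walk so that the chain is not stretched.
    bonds = rng.normal(scale=b / np.sqrt(3.0), size=(n_monomers - 1, 3))
    x = np.vstack([np.zeros((1, 3)), np.cumsum(bonds, axis=0)])

    # Per-monomer noise amplitude: this is where the activity profile enters.
    noise_scale = np.sqrt(2.0 * d_n * h)[:, None]

    for _ in range(n_steps):
        # Net spring displacement from the two chain neighbors (free ends).
        pull = np.zeros_like(x)
        pull[:-1] += x[1:] - x[:-1]
        pull[1:] += x[:-1] - x[1:]
        x = x + rate * pull * h + noise_scale * rng.normal(size=x.shape)
    return x


if __name__ == "__main__":
    d_n = activity_profile(100)          # 100 Kuhn lengths, lambda = 25
    conformation = simulate_rouse(d_n, n_steps=2000)
    print(conformation.shape)
```

A production run would integrate for at least one Rouse time, as described in the text, and average observables over many independent seeds.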

### Self-avoiding, confined polymer simulation details

We also develop more realistic polymer simulations by adapting the polychrom software package, a thin wrapper around OpenMM (116). The deterministic forces applied to the polymer are given by the following potential energy functions detailed in the `polychrom.forces` module: (1) Polymer connectivity via harmonic bonds with energy 0.5*κ*(*r*_{ij} − 1)^{2}, where *r*_{ij} is the distance between the centers of adjacent monomers with diameter *d* = 1, and *κ* is chosen such that the average extension of the bond is 0.1 when the harmonic energy is *k*_{B}*T*. (2) Spherical confinement with radius *r*_{C} of the form *f*_{C} [((*r*_{n} − *r*_{C})^{2} + *δ*^{2})^{1/2} − *δ*] if *r*_{n} > *r*_{C} and 0 otherwise, where *r*_{n} is the distance of the *n*^{th} monomer to the origin, *f*_{C} = 5*k*_{B}*T*/*d* is the confining force, and *δ* = 0.1 is a small number inserted to prevent rounding errors. The confinement energy is thus a smooth version of the function *f*_{C} |*r* − *r*_{C}|, where the radius *r*_{C} is chosen such that the total volume fraction of all monomers within the confinement is 0.117. (3) Repulsive short-ranged interactions via a Lennard-Jones-like potential:
where *U*_{0} = 3*k*_{B} *T* represents a finite energy barrier to allow chain passing when *r*_{ij} < 0.6*d*. The stochastic forces are implemented according to Eq. (4) using custom Brownian integrators, as detailed in the `contrib/integrators` module of the GitHub repository: https://github.com/open2c/polychrom.
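
The bond and confinement parameters above can be made concrete with a short numerical sketch. It is written in plain NumPy and does not call polychrom or OpenMM. The function names are hypothetical, energies are in units of *k*_{B}*T*, lengths are in units of the monomer diameter *d*, and the example chain length of 1000 monomers is inferred from the 25 Mbp region at 25 kilobases per monomer. The repulsive potential is not included because its functional form is not reproduced in this section.

```python
import numpy as np

KT = 1.0          # energy unit (k_B T)
D_MONOMER = 1.0   # monomer diameter d


def bond_stiffness(extension=0.1, kt=KT):
    """Stiffness kappa such that 0.5 * kappa * extension**2 equals k_B T."""
    return 2.0 * kt / extension**2


def confinement_radius(n_monomers, volume_fraction=0.117, d=D_MONOMER):
    """Radius r_C at which n spheres of diameter d fill the given volume fraction.

    Uses n * (d/2)**3 / r_C**3 = volume_fraction.
    """
    return 0.5 * d * (n_monomers / volume_fraction) ** (1.0 / 3.0)


def confinement_energy(r, r_c, f_c=5.0 * KT / D_MONOMER, delta=0.1):
    """Smoothed version of f_C * |r - r_C|, applied only outside the sphere."""
    r = np.asarray(r, dtype=float)
    excess = np.maximum(r - r_c, 0.0)
    return f_c * (np.sqrt(excess**2 + delta**2) - delta)


if __name__ == "__main__":
    r_c = confinement_radius(1000)
    print(bond_stiffness(), r_c, confinement_energy([0.5 * r_c, 1.5 * r_c], r_c))
```

Far outside the sphere the smoothed energy approaches *f*_{C}(*r* − *r*_{C}) − *f*_{C}*δ*, so the restoring force saturates at *f*_{C}, while the energy and its gradient both vanish continuously at *r* = *r*_{C}.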

Steady state was determined by running the simulation until the monomer mean squared displacement plateaus at the squared radius of confinement, i.e. until each monomer has had enough time to explore its volume (10^{5} time steps). We then run each simulation for twice this equilibration time and sample 10 steady-state conformations from each run. For each set of parameters, we repeated this procedure over 200 independent simulation runs and thus computed average contact maps from an ensemble of 2000 snapshots. We define a contact as an inter-monomer separation that is less than two monomer diameters.
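
The ensemble-averaged contact map described above can be sketched as follows. This is an illustrative NumPy implementation rather than the analysis code used for the figures. The function name and array layout are assumptions, and the cutoff of two monomer diameters follows the definition in the text.

```python
import numpy as np


def contact_map(snapshots, cutoff=2.0):
    """Fraction of snapshots in which each pair of monomers is in contact.

    snapshots: array of shape (n_snapshots, n_monomers, 3).
    cutoff: contact distance in monomer diameters (separation < cutoff).
    """
    snapshots = np.asarray(snapshots, dtype=float)
    n_monomers = snapshots.shape[1]
    counts = np.zeros((n_monomers, n_monomers))
    for x in snapshots:
        # Pairwise distance matrix of one conformation.
        dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        counts += dist < cutoff
    return counts / len(snapshots)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in ensemble: 2000 random conformations of a 50-monomer chain.
    ensemble = np.cumsum(rng.normal(size=(2000, 50, 3)), axis=1)
    print(contact_map(ensemble).shape)
```

Looping over snapshots keeps the memory footprint at one distance matrix at a time, which matters for chains of around a thousand monomers.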

### Hi-C data processing and compartment identification

To make a meaningful comparison to Hi-C data, we model a region (35–60 Mbp) of Chromosome 2 in murine erythroblasts, a model eukaryotic cell line (80). We then derive the identities of active (A) and inactive (B) monomers in our simulated chain from the data in the following way. First, we iteratively correct the experimental contact map at 100 kilobase resolution such that the rows and columns sum to one, a process that removes experimental biases and ensures equal visibility of all loci (117). We then divide each diagonal of the experimental contact frequency map by its mean in order to produce an “observed over expected” map, which measures structure in the data beyond the average decay of contacts with genomic separation. Subtracting the mean from this “observed over expected” map yields a matrix where positive entries indicate enrichment of contacts above the mean and negative entries denote depletion of contacts below the mean. The first eigenvector (E1) of the resulting map captures the checkerboard pattern characteristic of A/B compartmentalization, and can thus be used to binarize the genome into active (A) and inactive (B) segments. Since the A/B identities are determined only up to the sign of the entries of E1, we “align” the E1 track to a binned profile of GC content, such that positive entries correlate with active chromatin and negative entries with inactive chromatin (117). Since compartments are typically measured at 100 kilobase resolution, it is sufficient to assign 4 monomers to each Hi-C bin, i.e. one monomer per 25 kilobases. Note that this resolution is well beyond the persistence length of chromatin, which is on the order of a kilobase (118), justifying the omission of bending rigidity in our simulations.
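
The processing steps above can be sketched in plain NumPy. This is a simplified illustration, not the pipeline used for the paper. It assumes a dense, symmetric contact map with strictly positive entries (real Hi-C maps contain empty bins that must be masked), it takes "first eigenvector" to mean the eigenvector with the largest eigenvalue, and all function names and the `reference_track` argument (standing in for the binned GC content) are hypothetical.

```python
import numpy as np


def iterative_correction(contacts, tol=1e-10, max_iter=1000):
    """Balance a symmetric positive map so that every row sums to one."""
    m = np.array(contacts, dtype=float)
    for _ in range(max_iter):
        s = m.sum(axis=1)
        s = s / s.mean()
        m = m / np.outer(s, s)       # divide out the current bias estimate
        if np.var(s) < tol:
            break
    return m / m.sum(axis=1).mean()


def observed_over_expected(m):
    """Divide each diagonal by its mean to remove the distance decay."""
    n = len(m)
    idx = np.arange(n)
    separation = np.abs(idx[:, None] - idx[None, :])
    expected = np.array([np.diagonal(m, d).mean() for d in range(n)])
    return m / expected[separation]


def first_eigenvector(oe, reference_track):
    """Leading eigenvector of the mean-subtracted map, sign-aligned to a track."""
    centered = oe - oe.mean()
    eigenvalues, eigenvectors = np.linalg.eigh(centered)  # ascending order
    e1 = eigenvectors[:, -1]
    reference = np.asarray(reference_track, dtype=float)
    if np.dot(e1, reference - reference.mean()) < 0:
        e1 = -e1                      # fix the arbitrary sign of E1
    return e1


def monomer_identities(e1, monomers_per_bin=4):
    """True for active (A) monomers, False for inactive (B) monomers."""
    return np.repeat(e1 > 0, monomers_per_bin)
```

For a map `hic` and a GC-content track `gc` (both hypothetical inputs), a typical call chain is `monomer_identities(first_eigenvector(observed_over_expected(iterative_correction(hic)), gc))`.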

### Compartment scores

To quantify the degree of compartmentalization observed in both experimental and simulated contact maps, we compute an order parameter, the compartment score. Specifically, we use the definition of the “COMP score 2” introduced in Ref. (84). We first process the simulated contact maps in the same way as the experimental data, i.e. via iterative correction and computation of E1 (see previous section). The rows and columns of the observed over expected map are sorted and binned by quantiles of the E1 track, such that the top left quadrant shows B-B contacts, the bottom right quadrant shows A-A contacts, and the off-diagonal quadrants show contacts between A and B regions. The COMP score is defined by averaging over the top 25% of contacts in each of the 4 quadrants and computing (*AA* + *BB* − 2*AB*)/(*AA* + *BB* + 2*AB*). The resulting score is 0.0 if there is no difference between the contact frequencies of same-type and different-type chromatin, and 1.0 if A and B regions are perfectly demixed.
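
A minimal sketch of such a score is given below. It follows one reading of the description above, in which the "top 25%" refers to the bins with the most extreme E1 values, so that *AA*, *BB*, and *AB* are averages over the corresponding corner blocks of the E1-sorted map. This reading, the function name, and the `fraction` parameter are assumptions, and the exact definition in Ref. (84) may differ in detail.

```python
import numpy as np


def comp_score(oe, e1, fraction=0.25):
    """Compartment score (AA + BB - 2 AB) / (AA + BB + 2 AB).

    oe: observed-over-expected contact map, shape (n, n).
    e1: first eigenvector, positive for A bins and negative for B bins.
    fraction: share of bins taken from each end of the sorted E1 track
              (an assumed interpretation of the "top 25%").
    """
    oe = np.asarray(oe, dtype=float)
    order = np.argsort(e1)                 # B-like bins first, A-like bins last
    sorted_oe = oe[np.ix_(order, order)]
    m = max(1, int(round(fraction * len(order))))

    bb = sorted_oe[:m, :m].mean()          # top-left corner: B-B contacts
    aa = sorted_oe[-m:, -m:].mean()        # bottom-right corner: A-A contacts
    ab = 0.5 * (sorted_oe[:m, -m:].mean() + sorted_oe[-m:, :m].mean())
    return (aa + bb - 2.0 * ab) / (aa + bb + 2.0 * ab)
```

With this convention, a featureless map gives a score of zero and a map without any A-B contacts gives a score of one, matching the two limits stated in the text.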

## AUTHOR CONTRIBUTIONS

A.G., D.K., M.K., and A.K.C. designed research. A.G. carried out analytical calculations and related numerical implementation. D.K. implemented and carried out Brownian dynamics simulations. All authors discussed and analyzed the results and wrote the manuscript.

## AUTHOR DECLARATION

The authors declare that there are no competing interests. For completeness, we note that A.K.C. serves as a consultant (titled Academic Partner) for Flagship Pioneering and its affiliated companies, Apriori Bio and FL72.

## ACKNOWLEDGMENTS

We thank Richard A. Young, Phillip A. Sharp, Anders S. Hansen, and Leonid Mirny for helpful discussions and for critical reading of the manuscript. This work was supported by the National Science Foundation, through the Biophysics of Nuclear Condensates grant (MCB-2044895) and the Graduate Research Fellowship Program under grant No. 2141064.
